Closed rchenzheng closed 4 years ago
@rchenzheng Did you try setting insecureSSL for the metrics chart? If you are using Managed Splunk Cloud, you also need to set up HEC properly. Please refer to the documentation on this page: Use the HTTP Event Collector
I hadn't previously, but now I get:
│ 2020-07-16 20:46:32 +0000 [error]: #0 Failed to scrape resource usage metrics, error=, #<NoMethodError: undefined method `[]' for nil:NilClass> │
│ 2020-07-16 20:46:32 +0000 [error]: #0 suppressed same stacktrace │
│ 2020-07-16 20:48:17 +0000 [error]: #0 Failed to scrape resource usage metrics, error=, #<NoMethodError: undefined method `[]' for nil:NilClass> │
│ 2020-07-16 20:48:17 +0000 [error]: #0 suppressed same stacktrace │
│ 2020-07-16 20:48:32 +0000 [error]: #0 Failed to scrape resource usage metrics, error=, #<NoMethodError: undefined method `[]' for nil:NilClass> │
│ 2020-07-16 20:48:32 +0000 [error]: #0 suppressed same stacktrace │
│ 2020-07-16 20:49:17 +0000 [error]: #0 Failed to scrape resource usage metrics, error=, #<NoMethodError: undefined method `[]' for nil:NilClass> │
│ 2020-07-16 20:49:17 +0000 [error]: #0 suppressed same stacktrace │
│ 2020-07-16 20:49:32 +0000 [error]: #0 Failed to scrape resource usage metrics, error=, #<NoMethodError: undefined method `[]' for nil:NilClass> │
│ 2020-07-16 20:49:32 +0000 [error]: #0 suppressed same stacktrace │
│ 2020-07-16 20:50:17 +0000 [error]: #0 Failed to scrape resource usage metrics, error=, #<NoMethodError: undefined method `[]' for nil:NilClass> │
│ 2020-07-16 20:50:17 +0000 [error]: #0 suppressed same stacktrace │
│ 2020-07-16 20:50:32 +0000 [error]: #0 Failed to scrape resource usage metrics, error=, #<NoMethodError: undefined method `[]' for nil:NilClass> │
│ 2020-07-16 20:50:32 +0000 [error]: #0 suppressed same stacktrace │
│ 2020-07-16 20:51:17 +0000 [error]: #0 Failed to scrape resource usage metrics, error=, #<NoMethodError: undefined method `[]' for nil:NilClass> │
│ 2020-07-16 20:51:17 +0000 [error]: #0 suppressed same stacktrace │
│ 2020-07-16 20:51:32 +0000 [error]: #0 Failed to scrape resource usage metrics, error=, #<NoMethodError: undefined method `[]' for nil:NilClass> │
│ 2020-07-16 20:51:32 +0000 [error]: #0 suppressed same stacktrace │
That was on the aggregator VM, but I'm still getting the same error @mwang2016
│ 2020-07-16 21:02:31 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:cadvisor_metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno= │
│ 2020-07-16 21:02:31 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:776:in `rescue in transmit' │
│ 2020-07-16 21:02:31 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:651:in `transmit' │
│ 2020-07-16 21:02:31 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute' │
│ 2020-07-16 21:02:31 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute' │
│ 2020-07-16 21:02:31 +0000 [error]: #0 /fluentd/plugins/in_kubernetes_metrics.rb:660:in `scrape_cadvisor_metrics' │
│ 2020-07-16 21:02:31 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer' │
│ 2020-07-16 21:02:31 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once' │
│ 2020-07-16 21:02:31 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run' │
│ 2020-07-16 21:02:31 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start' │
│ 2020-07-16 21:02:31 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create' │
│ 2020-07-16 21:02:31 +0000 [error]: #0 Timer detached. title=:cadvisor_metric_scraper
@rchenzheng can you show your config for the metrics?
After more research, we're using ssl/https with Splunk cloud and the config is as follows:
global:
  logLevel: info
  splunk:
    hec:
      indexName: k8s-logs
      host: my-host.splunkcloud.com
      port: 443
      token: HEIC-TOKEN
      protocol: https
      insecureSSL: false
splunk-kubernetes-metrics:
  enabled: true
I've tried insecureSSL set to both true and false; I still get the same issue.
@rchenzheng can you try this config?
splunk-kubernetes-metrics:
  kubernetes:
    insecureSSL: false
  splunk:
    hec:
      indexName: k8s-logs
      host: my-host.splunkcloud.com
      port: 443
      token: HEIC-TOKEN
      protocol: https
      insecureSSL: false
Same error, no changes to my config either
@rchenzheng Sorry, insecureSSL should be true, and if you are using Managed Splunk Cloud, your host should have a prefix, like https://http-inputs-my-host.splunkcloud.com:443
Send data to HTTP Event Collector on Splunk Cloud instances
Depending on the type of Splunk Cloud that you use, you must send data using a specific URI for HEC.
The standard form for the HEC URI in self-service Splunk Cloud is as follows:
<protocol>://input-<host>:<port>/<endpoint>
The standard form for the HEC URI in managed Splunk Cloud is as follows:
<protocol>://http-inputs-<host>:<port>/<endpoint>
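As a rough illustration of the two URI forms quoted above, here is a minimal Python sketch. The function name `hec_uri`, the `managed` flag, and the `services/collector` endpoint used in the example call are assumptions made for illustration; they are not taken from the chart.

```python
def hec_uri(protocol: str, host: str, port: int, endpoint: str, managed: bool) -> str:
    """Build a HEC URI using the host prefixes quoted from the Splunk docs above."""
    # Managed Splunk Cloud uses "http-inputs-", self-service uses "input-".
    prefix = "http-inputs-" if managed else "input-"
    # Drop any leading slash so the result never contains "//" before the endpoint.
    return f"{protocol}://{prefix}{host}:{port}/{endpoint.lstrip('/')}"


if __name__ == "__main__":
    # Hypothetical example values; substitute your own host and endpoint.
    print(hec_uri("https", "my-host.splunkcloud.com", 443, "services/collector", managed=True))
```

The only difference between the two deployment types in this sketch is the host prefix; everything else is passed through unchanged.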
Even with it off or with insecureSSL: true, I still have the same issue.
Wouldn't this allow man-in-the-middle attacks, since insecureSSL is for self-signed certificates?
│ 2020-07-17 20:17:21 +0000 [error]: #0 Timer detached. title=:metric_scraper │
│ 2020-07-17 20:17:21 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:stats_metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno=0 s │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:776:in `rescue in transmit' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:651:in `transmit' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /fluentd/plugins/in_kubernetes_metrics.rb:647:in `scrape_stats_metrics' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 Timer detached. title=:stats_metric_scraper │
│ 2020-07-17 20:17:21 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:cadvisor_metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno= │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:776:in `rescue in transmit' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:651:in `transmit' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /fluentd/plugins/in_kubernetes_metrics.rb:660:in `scrape_cadvisor_metrics' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 Timer detached. title=:cadvisor_metric_scraper
@rchenzheng did you add the prefix http-inputs- to your host?
│ 2020-07-17 20:17:21 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:stats_metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno=0 s │
This log seems to say that it cannot get metrics from k8s due to an SSL issue.
If you set kubernetes.insecureSSL to true, this error should not occur:
splunk-kubernetes-metrics:
  kubernetes:
    insecureSSL: true
But are you still having this error?
│ 2020-07-16 20:46:32 +0000 [error]: #0 Failed to scrape resource usage metrics, error=, #<NoMethodError: undefined method `[]' for nil:NilClass> │
│ 2020-07-16 20:46:32 +0000 [error]: #0 suppressed same stacktrace
And could you post the pod logs here, from startup up to the error you are getting (with sensitive info removed)?
Yes, the http-inputs- prefix was added.
I set kubernetes.insecureSSL to true, yet the error is still there. Also, my input doesn't use a self-signed certificate, and this setting would allow a MITM attack.
│ @type splunk_hec │
│ data_type metric │
│ metric_name_key "metric_name" │
│ metric_value_key "value" │
│ protocol https │
│ hec_host "http-inputs-MYINPUT.splunkcloud.com" │
│ hec_port 443 │
│ hec_token "MY-TOKEN" │
│ host "THE-HOST" │
│ index "MY-INDEX" │
│ source "${tag}" │
│ insecure_ssl true
so which error are you getting?
could you specify what you mean by "my input isn't a self-signed certificate"?
The SSL certificate for http-inputs-MYINPUT.splunkcloud.com is a wildcard certificate for *.splunkcloud.com
I believe the issue here is that the container may or may not trust the SSL certificate, so it depends on whatever you bundled the image with.
This is the error I'm getting:
│ 2020-07-17 20:17:21 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:cadvisor_metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno= │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:776:in `rescue in transmit' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:651:in `transmit' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /fluentd/plugins/in_kubernetes_metrics.rb:660:in `scrape_cadvisor_metrics' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create' │
│ 2020-07-17 20:17:21 +0000 [error]: #0 Timer detached. title=:cadvisor_metric_scraper
Hello all,
I have a similar issue: the metrics pods are failing. Is there a solution for this?
@type kubernetes_metrics
tag "kube.*"
node_name "ocp.xyz.com"
use_rest_client_ssl true
cluster_name xyz
interval 15s
is not used.
2020-06-09 13:36:25 +0000 [info]: #0 starting fluentd worker pid=18 ppid=8 worker=0
2020-06-09 13:36:25 +0000 [debug]: #0 buffer started instance=47143694559200 stage_size=0 queue_size=0
2020-06-09 13:36:25 +0000 [info]: #0 fluentd worker is now running worker=0
2020-06-09 13:36:26 +0000 [debug]: #0 flush_thread actually running
2020-06-09 13:36:26 +0000 [debug]: #0 enqueue_thread actually running
2020-06-09 13:36:40 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (unable to get local issuer certificate)"
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:776:in `rescue in transmit'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:651:in `transmit'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-kubernetes-metrics-1.1.2/lib/fluent/plugin/in_kubernetes_metrics.rb:635:in `scrape_metrics'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/cool.io-1.5.4/lib/cool.io/loop.rb:88:in `run_once'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/cool.io-1.5.4/lib/cool.io/loop.rb:88:in `run'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-06-09 13:36:40 +0000 [error]: #0 Timer detached. title=:metric_scraper
2020-06-09 13:36:40 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:stats_metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (unable to get local issuer certificate)"
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:776:in `rescue in transmit'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:651:in `transmit'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-kubernetes-metrics-1.1.2/lib/fluent/plugin/in_kubernetes_metrics.rb:647:in `scrape_stats_metrics'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/cool.io-1.5.4/lib/cool.io/loop.rb:88:in `run_once'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/cool.io-1.5.4/lib/cool.io/loop.rb:88:in `run'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-06-09 13:36:40 +0000 [error]: #0 Timer detached. title=:stats_metric_scraper
2020-06-09 13:36:40 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:cadvisor_metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (unable to get local issuer certificate)"
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:776:in `rescue in transmit'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:651:in `transmit'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-kubernetes-metrics-1.1.2/lib/fluent/plugin/in_kubernetes_metrics.rb:660:in `scrape_cadvisor_metrics'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/cool.io-1.5.4/lib/cool.io/loop.rb:88:in `run_once'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/cool.io-1.5.4/lib/cool.io/loop.rb:88:in `run'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2020-06-09 13:36:40 +0000 [error]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.4.0/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-06-09 13:36:40 +0000 [error]: #0 Timer detached. title=:cadvisor_metric_scraper
Could you try:
kubernetes:
  # This option is used to get the metrics from summary api on each kubelet using ssl
  useRestClientSSL: true
  # if insecureSSL is set to true, insecure HTTPS API call is allowed, default false
  insecureSSL: true
Yes, these issues are caused by the metrics chart scraping the kubelet and not using insecureSSL to talk to port 10250. They have nothing to do with Splunk Cloud certs. There are certs in many parts of this solution, so it can be confusing...
Kubelet rarely has real certs... Have you gotten it to work, @rchenzheng?
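One way to tell which connection is failing is to look at the scraper title in the fluentd error line. The sketch below is only an illustration: the regular expression and the function name `parse_scraper_error` are assumptions, written against the log lines pasted in this thread. The three titles seen here (metric_scraper, stats_metric_scraper, cadvisor_metric_scraper) appear together with in_kubernetes_metrics.rb frames in the stack traces, which points at the kubelet-scraping input rather than the splunk_hec output.

```python
import re
from typing import Optional, Tuple

# Matches the "Unexpected error raised" lines pasted in this thread.
_ERROR_RE = re.compile(r"title=:(?P<title>\w+)\s+error_class=(?P<cls>[\w:]+)")


def parse_scraper_error(line: str) -> Optional[Tuple[str, str]]:
    """Return (scraper title, error class), or None if the line does not match."""
    match = _ERROR_RE.search(line)
    if match is None:
        return None
    return match.group("title"), match.group("cls")


if __name__ == "__main__":
    sample = (
        "2020-06-09 13:36:40 +0000 [error]: #0 Unexpected error raised. "
        "Stopping the timer. title=:metric_scraper "
        "error_class=RestClient::SSLCertificateNotVerified"
    )
    print(parse_scraper_error(sample))
```

Lines such as "Timer detached. title=:metric_scraper" carry no error_class and are deliberately skipped by this pattern.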
Yes, I do have use_rest_client_ssl true and insecureSSL true set, but I still get the SSL error. Did this work for you?
Please share your values.yaml or a copy of the running configmap in the cluster. Also what flavour of k8s?
kubectl get cm
kubectl describe cm
Setting splunk-kubernetes-metrics.kubernetes.useRestClientSSL: false fixed the SSL issue, but now I get:
│ 2020-07-22 12:32:50 +0000 [error]: #0 Timer detached. title=:stats_metric_scraper │
│ 2020-07-22 12:32:50 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:cadvisor_metric_scraper error_class=RestClient::BadRequest error="400 Bad Request" │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/abstract_response.rb:249:in `exception_with_response' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/abstract_response.rb:129:in `return!' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:836:in `process_result' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:743:in `block in transmit' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/ruby/net/http.rb:910:in `start' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:727:in `transmit' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /fluentd/plugins/in_kubernetes_metrics.rb:660:in `scrape_cadvisor_metrics' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create' │
│ 2020-07-22 12:32:50 +0000 [error]: #0 Timer detached. title=:cadvisor_metric_scraper
What is your kubelet --version?
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.10", GitCommit:"1bea6c00a7055edef03f1d4bb58b773fa8917f11", GitTreeState:"clean", BuildDate:"2020-02-11T20:05:26Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
I was able to reproduce it and fix it. I have a working image with the fix now: rock1017/k8s-metrics:1.1.3-2
Feel free to use this image until an official release comes out from Splunk. Thank you, and let me know whether it works for you.
By the way, I am using kubelet port 10255 for read-only HTTP access.
Any timelines on when this gets merged?
Thanks
Hi @rockb1017, are there any high-level details on the root cause, and on how to identify whether someone is impacted?
@rockb1017, are there any changes to the metrics configmap? Here is my template:
fluent.conf: |
<system>
log_level debug
</system>
<source>
@type kubernetes_metrics
tag kube.*
node_name "#{ENV['NODE_NAME']}"
use_rest_client_ssl true
cluster_name {{ splunk_cluster_id }}
interval 15s
</source>
<filter kube.**>
@type record_modifier
<record>
metric_name ${tag}
cluster_name {{ splunk_cluster_id }}
</record>
</filter>
<filter kube.node.**>
@type record_modifier
<record>
source ${record['node']}
</record>
</filter>
<filter kube.pod.**>
@type record_modifier
<record>
source ${record['node']}/${record['pod-name']}
</record>
</filter>
<filter kube.sys-container.**>
@type record_modifier
<record>
source ${record['node']}/${record['pod-name']}/${record['name']}
</record>
</filter>
<filter kube.container.**>
@type record_modifier
<record>
source ${record['node']}/${record['pod-name']}/${record['container-name']}
</record>
</filter>
# = custom filters specified by users =
<match kube.**>
@type splunk_hec
data_type metric
metric_name_key metric_name
metric_value_key value
protocol https
hec_host {{ splunk_hec_host }}
hec_port {{ splunk_port }}
hec_token "#{ENV['SPLUNK_HEC_TOKEN']}"
host "#{ENV['NODE_NAME']}"
index {{ splunk_metrics_index }}
source ${tag}
insecure_ssl true
<buffer>
@type memory
chunk_limit_records 10000
chunk_limit_size 100m
flush_interval 5s
flush_thread_count 1
overflow_action block
retry_max_times 3
total_limit_size 400m
</buffer>
</match>
@puzzlepri for this bug, no change is needed on the configmap.
@matthewmodestino I had to change the hard-coded endpoint from /stats/ to /stats.
Would making these API URIs configurable as a param be a good improvement (to be future-proof)?
@rockb1017 - I tested with your image and I'm still getting the same SSL error.
@puzzlepri Could you share the entire error log you are getting? And which kubelet port are you using?
@rockb1017
Here is the error. The kubelet port is 10250, whereas my splunk-kubernetes-metrics-agg is running fine and scraping resource usage metrics.
2020-07-22 18:52:44 +0000 [info]: starting fluentd-1.9.1 pid=1 ruby="2.5.5"
2020-07-22 18:52:44 +0000 [info]: spawn command to main: cmdline=["/usr/bin/ruby", "-Eascii-8bit:ascii-8bit", "-r/usr/local/share/gems/gems/bundler-2.1.4/lib/bundler/setup", "/usr/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "--under-supervisor"]
2020-07-22 18:52:47 +0000 [info]: adding filter pattern="kube.**" type="record_modifier"
2020-07-22 18:52:47 +0000 [info]: adding filter pattern="kube.node.**" type="record_modifier"
2020-07-22 18:52:47 +0000 [info]: adding filter pattern="kube.pod.**" type="record_modifier"
2020-07-22 18:52:47 +0000 [info]: adding filter pattern="kube.sys-container.**" type="record_modifier"
2020-07-22 18:52:47 +0000 [info]: adding filter pattern="kube.container.**" type="record_modifier"
2020-07-22 18:52:47 +0000 [info]: adding match pattern="kube.**" type="splunk_hec"
2020-07-22 18:52:48 +0000 [info]: adding source type="kubernetes_metrics"
2020-07-22 18:52:49 +0000 [info]: #0 Use URL http://<ip>:10250/stats/summary for creating client to query kubelet summary api
2020-07-22 18:52:49 +0000 [info]: #0 Use URL http://<ip>:10250/stats for creating client to query kubelet stats api
2020-07-22 18:52:49 +0000 [info]: #0 Use URL http://<ip>:10250/metrics/cadvisor for creating client to query cadvisor metrics api
2020-07-22 18:52:49 +0000 [debug]: #0 No fluent logger for internal event
2020-07-22 18:52:49 +0000 [warn]: parameter 'cluster_name' in <source>
@type kubernetes_metrics
tag "kube.*"
node_name "<nodename>t"
use_rest_client_ssl false
cluster_name <clustermame>
interval 15s
</source> is not used.
2020-07-22 18:52:49 +0000 [info]: #0 starting fluentd worker pid=11 ppid=1 worker=0
2020-07-22 18:52:49 +0000 [debug]: #0 buffer started instance=47126338487580 stage_size=0 queue_size=0
2020-07-22 18:52:49 +0000 [info]: #0 fluentd worker is now running worker=0
2020-07-22 18:52:49 +0000 [debug]: #0 enqueue_thread actually running
2020-07-22 18:52:50 +0000 [debug]: #0 flush_thread actually running
2020-07-22 18:53:04 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:metric_scraper error_class=RestClient::BadRequest error="400 Bad Request"
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/abstract_response.rb:249:in `exception_with_response'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/abstract_response.rb:129:in `return!'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:836:in `process_result'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:743:in `block in transmit'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/ruby/net/http.rb:910:in `start'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:727:in `transmit'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute'
2020-07-22 18:53:04 +0000 [error]: #0 /opt/app-root/src/gem/fluent-plugin-kubernetes-metrics-1.1.3/lib/fluent/plugin/in_kubernetes_metrics.rb:635:in `scrape_metrics'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-07-22 18:53:04 +0000 [error]: #0 Timer detached. title=:metric_scraper
2020-07-22 18:53:04 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:stats_metric_scraper error_class=RestClient::BadRequest error="400 Bad Request"
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/abstract_response.rb:249:in `exception_with_response'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/abstract_response.rb:129:in `return!'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:836:in `process_result'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:743:in `block in transmit'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/ruby/net/http.rb:910:in `start'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:727:in `transmit'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute'
2020-07-22 18:53:04 +0000 [error]: #0 /opt/app-root/src/gem/fluent-plugin-kubernetes-metrics-1.1.3/lib/fluent/plugin/in_kubernetes_metrics.rb:647:in `scrape_stats_metrics'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-07-22 18:53:04 +0000 [error]: #0 Timer detached. title=:stats_metric_scraper
2020-07-22 18:53:04 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:cadvisor_metric_scraper error_class=RestClient::BadRequest error="400 Bad Request"
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/abstract_response.rb:249:in `exception_with_response'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/abstract_response.rb:129:in `return!'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:836:in `process_result'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:743:in `block in transmit'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/ruby/net/http.rb:910:in `start'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:727:in `transmit'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute'
2020-07-22 18:53:04 +0000 [error]: #0 /opt/app-root/src/gem/fluent-plugin-kubernetes-metrics-1.1.3/lib/fluent/plugin/in_kubernetes_metrics.rb:660:in `scrape_cadvisor_metrics'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2020-07-22 18:53:04 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-07-22 18:53:04 +0000 [error]: #0 Timer detached. title=:cadvisor_metric_scraper
We're having the same problem here; using the image rock1017/k8s-metrics:1.1.3-2 didn't help.
Log messages of the metrics pod (these also include the config used):
`/opt/app-root/src` is not writable.
Bundler will use `/tmp/bundler20200728-1-gnv4xv1' as your home directory temporarily.
2020-07-28 08:54:31 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2020-07-28 08:54:31 +0000 [info]: gem 'fluentd' version '1.9.1'
2020-07-28 08:54:31 +0000 [info]: gem 'fluent-plugin-jq' version '0.5.1'
2020-07-28 08:54:31 +0000 [info]: gem 'fluent-plugin-kubernetes-metrics' version '1.1.3'
2020-07-28 08:54:31 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '2.4.2'
2020-07-28 08:54:31 +0000 [info]: gem 'fluent-plugin-prometheus' version '1.7.0'
2020-07-28 08:54:31 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.1.0'
2020-07-28 08:54:31 +0000 [info]: gem 'fluent-plugin-splunk-hec' version '1.2.1'
2020-07-28 08:54:33 +0000 [info]: Use URL https://IP:10250/stats/summary for creating client to query kubelet summary api
2020-07-28 08:54:33 +0000 [info]: Use URL https://IP:10250/stats for creating client to query kubelet stats api
2020-07-28 08:54:33 +0000 [info]: Use URL https://IP:10250/metrics/cadvisor for creating client to query cadvisor metrics api
2020-07-28 08:54:33 +0000 [info]: using configuration file: <ROOT>
  <system>
    log_level info
  </system>
  <source>
    @type kubernetes_metrics
    tag "kube.*"
    node_name "nodename"
    use_rest_client_ssl true
    cluster_name core
    interval 15s
  </source>
  <filter kube.**>
    @type record_modifier
    <record>
      metric_name ${tag}
      cluster_name core
    </record>
  </filter>
  <filter kube.node.**>
    @type record_modifier
    <record>
      source ${record['node']}
    </record>
  </filter>
  <filter kube.pod.**>
    @type record_modifier
    <record>
      source ${record['node']}/${record['pod-name']}
    </record>
  </filter>
  <filter kube.sys-container.**>
    @type record_modifier
    <record>
      source ${record['node']}/${record['pod-name']}/${record['name']}
    </record>
  </filter>
  <filter kube.container.**>
    @type record_modifier
    <record>
      source ${record['node']}/${record['pod-name']}/${record['container-name']}
    </record>
  </filter>
  <match kube.**>
    @type splunk_hec
    data_type metric
    metric_name_key "metric_name"
    metric_value_key "value"
    protocol https
    hec_host "https://internal.domain.eu"
    hec_port 8088
    hec_token ""
    host "hostname"
    index "em_metrics"
    source "${tag}"
    insecure_ssl true
    <buffer>
      @type "memory"
      chunk_limit_records 10000
      chunk_limit_size 100m
      flush_interval 5s
      flush_thread_count 1
      overflow_action block
      retry_max_times 3
      total_limit_size 400m
    </buffer>
  </match>
</ROOT>
2020-07-28 08:54:33 +0000 [info]: starting fluentd-1.9.1 pid=1 ruby="2.5.5"
2020-07-28 08:54:33 +0000 [info]: spawn command to main: cmdline=["/usr/bin/ruby", "-Eascii-8bit:ascii-8bit", "-r/usr/local/share/gems/gems/bundler-2.1.4/lib/bundler/setup", "/usr/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "--under-supervisor"]
2020-07-28 08:54:36 +0000 [info]: adding filter pattern="kube.**" type="record_modifier"
2020-07-28 08:54:36 +0000 [info]: adding filter pattern="kube.node.**" type="record_modifier"
2020-07-28 08:54:36 +0000 [info]: adding filter pattern="kube.pod.**" type="record_modifier"
2020-07-28 08:54:36 +0000 [info]: adding filter pattern="kube.sys-container.**" type="record_modifier"
2020-07-28 08:54:37 +0000 [info]: adding filter pattern="kube.container.**" type="record_modifier"
2020-07-28 08:54:37 +0000 [info]: adding match pattern="kube.**" type="splunk_hec"
2020-07-28 08:54:37 +0000 [info]: adding source type="kubernetes_metrics"
2020-07-28 08:54:38 +0000 [info]: #0 Use URL https://IP:10250/stats/summary for creating client to query kubelet summary api
2020-07-28 08:54:38 +0000 [info]: #0 Use URL https://IP:10250/stats for creating client to query kubelet stats api
2020-07-28 08:54:38 +0000 [info]: #0 Use URL https://IP:10250/metrics/cadvisor for creating client to query cadvisor metrics api
2020-07-28 08:54:38 +0000 [warn]: parameter 'cluster_name' in <source>
@type kubernetes_metrics
tag "kube.*"
node_name "nodename"
use_rest_client_ssl true
cluster_name core
interval 15s
</source> is not used.
2020-07-28 08:54:38 +0000 [info]: #0 starting fluentd worker pid=10 ppid=1 worker=0
2020-07-28 08:54:38 +0000 [info]: #0 fluentd worker is now running worker=0
2020-07-28 08:54:53 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain)"
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:776:in `rescue in transmit'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:651:in `transmit'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute'
2020-07-28 08:54:53 +0000 [error]: #0 /opt/app-root/src/gem/fluent-plugin-kubernetes-metrics-1.1.3/lib/fluent/plugin/in_kubernetes_metrics.rb:635:in `scrape_metrics'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-07-28 08:54:53 +0000 [error]: #0 Timer detached. title=:metric_scraper
2020-07-28 08:54:53 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:stats_metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain)"
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:776:in `rescue in transmit'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:651:in `transmit'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute'
2020-07-28 08:54:53 +0000 [error]: #0 /opt/app-root/src/gem/fluent-plugin-kubernetes-metrics-1.1.3/lib/fluent/plugin/in_kubernetes_metrics.rb:647:in `scrape_stats_metrics'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-07-28 08:54:53 +0000 [error]: #0 Timer detached. title=:stats_metric_scraper
2020-07-28 08:54:53 +0000 [error]: #0 Unexpected error raised. Stopping the timer. title=:cadvisor_metric_scraper error_class=RestClient::SSLCertificateNotVerified error="SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain)"
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:776:in `rescue in transmit'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:651:in `transmit'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:163:in `execute'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/rest-client-2.1.0/lib/restclient/request.rb:63:in `execute'
2020-07-28 08:54:53 +0000 [error]: #0 /opt/app-root/src/gem/fluent-plugin-kubernetes-metrics-1.1.3/lib/fluent/plugin/in_kubernetes_metrics.rb:660:in `scrape_cadvisor_metrics'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run_once'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/cool.io-1.6.0/lib/cool.io/loop.rb:88:in `run'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
2020-07-28 08:54:53 +0000 [error]: #0 /usr/share/gems/gems/fluentd-1.9.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2020-07-28 08:54:53 +0000 [error]: #0 Timer detached. title=:cadvisor_metric_scraper
I've tried setting different values for hec_host ("https://internal.domain.eu"). The outcome is the same whether or not I put https:// at the beginning.
It is not the cert on the HEC side, it's the cert on the INPUT side!!
Please set insecureSSL to true in the kubernetes section of the metrics chart!
If you are struggling, hit me up in the community Slack (splk.it/slack - @mattymo) in the #kubernetes channel.
Thanks to @matthewmodestino I've fixed my problem.
I had only defined global.kubernetes.insecureSSL (which isn't a thing, apparently), but you also have to define kubernetes.insecureSSL in the section for metrics; the value of global.kubernetes.insecureSSL is not passed on in this case!
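The fix described above can be sketched as a values.yaml fragment. This is a minimal sketch, assuming the umbrella chart exposes the metrics subchart under the splunk-kubernetes-metrics key (in the same way the logging and objects subcharts appear elsewhere in this thread); check the values.yaml shipped with your chart version before relying on it.

```yaml
# Sketch only: the subchart key name is an assumption, check your chart's values.yaml.
splunk-kubernetes-metrics:
  kubernetes:
    # Input side: how the metrics plugin talks to the kubelet, unrelated to HEC.
    insecureSSL: true
```

global.kubernetes.insecureSSL is deliberately absent here because, per the comment above, it is not passed on to the metrics chart.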
@matthewmodestino - I do have insecureSSL set to true in my value.yaml.
@matthewmodestino @rockb1017 Any inputs on the issue?
@puzzlepri You likely still have incorrect settings in your values.yaml, or you need to restart the pods after re-applying. Your errors look like you are calling the wrong ports.
There are various insecureSSL parameters and port options you need to set properly.
The one you need is on the INPUT side, has nothing to do with HEC, and should be set in the local metrics chart as seen here:
Please post your entire values.yaml or ping me on Slack so I can explain.
Thanks @matthewmodestino, here is my value.yaml. I am using version splunk-connect-for-kubernetes-1.3.0.tgz.
global:
  logLevel: debug
  splunk:
    hec:
      insecureSSL: true
      host: <host_name>
      port: 8088
      token: <splunk_token>
      indexName: main
  kubernetes:
    clusterName: "dev1"
    openshift: true
## Enabling logging will install the `splunk-kubernetes-logging` chart to a kubernetes
## cluster to collect logs generated in the cluster to a Splunk indexer/indexer cluster.
logging:
  enabled: true
## Enabling objects will install the `splunk-kubernetes-objects` chart to a kubernetes
## cluster to collect kubernetes objects in the cluster to a Splunk indexer/indexer cluster.
objects:
  enabled: true
## Enabling metrics will install the `splunk-kubernetes-metrics` chart to a kubernetes
## cluster to collect metrics of the cluster to a Splunk indexer/indexer cluster.
metrics:
  enabled: true
  aggregatorBuffer:
    "@type": memory
    total_limit_size: 400m
    chunk_limit_size: 100m
    chunk_limit_records: 10000
    flush_interval: 5s
    flush_thread_count: 1
    overflow_action: block
    retry_max_times: 3
  # Configure how often SCK pulls metrics for its kubenetes sources. 15s is the defa
  metricsInterval: 15s
splunk-kubernetes-logging:
  splunk:
    hec:
      insecureSSL: true
      host: <host_name>
      port: 8088
      token: <splunk_token>
      indexName: k8s_logs
  containers:
    logFormatType: cri
    logFormat: "%Y-%m-%dT%H:%M:%S.%N%:z"
  serviceAccount:
    create: true
splunk-kubernetes-objects:
  splunk:
    hec:
      insecureSSL: true
      host: <host_name>
      port: 8088
      token: <splunk_token>
      indexName: k8s_objects
  serviceAccount:
    create: true
You have no kubernetes section in the metrics and objects sections of your YAML, so you are getting defaults.
What flavour of k8s are you using? OpenShift or OSS?
Please review this example and set the proper settings for your cluster!
I am using OpenShift 4.
Then please make sure that kubelet_port is set to 10250, that use_rest_client_ssl is true, and that insecureSSL is false. These are all sub-settings of the metrics chart under kubernetes.
Do not set them in global...
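For OpenShift 4, the settings listed above might look like the following in values.yaml. This is a sketch under assumptions: kubeletPort and useRestClientSSL are assumed to be the camelCase Helm spellings of kubelet_port and use_rest_client_ssl, and splunk-kubernetes-metrics is assumed to be the subchart key; verify both against the values.yaml of your chart version.

```yaml
# Sketch only: key spellings are assumptions, verify against your chart's values.yaml.
splunk-kubernetes-metrics:
  kubernetes:
    kubeletPort: 10250       # kubelet_port in the comment above
    useRestClientSSL: true   # use_rest_client_ssl in the comment above
    insecureSSL: false       # value as stated in the comment above
```

Nothing is placed under global, since the comment above says these settings belong to the metrics chart only.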
Closing. Thank you! Feel free to reopen if you need further help.
What happened:
splunk-kubernetes-metrics crashes
What you expected to happen:
Generate metrics to splunk cloud
How to reproduce it (as minimally and precisely as possible):
Deploy using default settings
Anything else we need to know?:
Environment:
Kubernetes version (kubectl version): Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.10", GitCommit:"1bea6c00a7055edef03f1d4bb58b773fa8917f11", GitTreeState:"clean", BuildDate:"2020-02-11T20:05:26Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Ruby version (ruby --version): N/A
OS (cat /etc/os-release): N/A