kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0

We are facing Error scraping metrics tls: failed to verify certificate: x509: certificate signed by unknown authority #125956

Closed shbasha-clgx closed 4 weeks ago

shbasha-clgx commented 3 months ago

Hi team,

We are trying to install the OTel Collector using Helm, with the following configuration added in values.yaml:

kubernetesAttributes:
  enabled: true
kubeletMetrics:
  enabled: true
hostMetrics:
  enabled: true
logsCollection:
  enabled: true
  includeCollectorLogs:

However, we are facing the error below (file attached for your reference).

Kindly look into this and please assist.

error scraperhelper/scrapercontroller.go:197 Error scraping metrics {"kind": "receiver", "name": "kubeletstats", "data_type": "metrics", "error": "Get \"https://xxxx:xxxx/stats/summary\": tls: failed to verify certificate: x509: certificate signed by unknown authority", "scraper": "kubeletstats"}

We do not have a self-signed certificate and need assistance on how to bypass this check.

Adarsh-verma-14 commented 3 months ago

/sig node /sig auth /sig instrumentation

Adarsh-verma-14 commented 3 months ago

Hi @shbasha-clgx,

As I understand it, you don't have a self-signed certificate and need help bypassing TLS verification. To do that, you can modify your values.yaml configuration file to disable certificate verification for kubelet metrics by setting the 'insecure_skip_verify' field to 'true'.
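A minimal sketch of what that override might look like in values.yaml, assuming the opentelemetry-collector Helm chart merges the `config.receivers` section into the configuration it generates, and that `insecure_skip_verify` sits directly under the `kubeletstats` receiver (the placement reported as working later in this thread). Note that skipping verification means the collector no longer checks the kubelet's identity, so it is probably best treated as a troubleshooting step rather than a permanent setting.

```yaml
# values.yaml fragment (sketch, not a complete file)
presets:
  kubeletMetrics:
    enabled: true          # preset that enables the kubeletstats receiver

config:
  receivers:
    kubeletstats:
      # Assumption: this key is accepted at the receiver's top level.
      # It disables verification of the kubelet's serving certificate.
      insecure_skip_verify: true
```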

Adarsh-verma-14 commented 3 months ago

You can update your values.yaml file like this: values.yaml.txt

shbasha-clgx commented 3 months ago

@Adarsh-verma-14 Hi, I tried that, but it throws the error below: Error: UPGRADE FAILED: values don't meet the specifications of the schema(s) in the following chart(s): opentelemetry-collector:

Adarsh-verma-14 commented 3 months ago

You can refer to this page: https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configtls/README.md. It may help with skipping TLS certificate verification.

shbasha-clgx commented 2 months ago

@Adarsh-verma-14 Hi, after adding

receivers:
  kubeletstats:
    insecure_skip_verify: true

we no longer see the TLS error, but we still cannot see the cluster metrics in the OTel Collector logs, even though we configured debug as the exporter. Below is the values.yaml file we are using.

mode: daemonset

presets:
  kubernetesAttributes:
    enabled: true
  kubeletMetrics:
    enabled: true
  hostMetrics:
    enabled: true
  logsCollection:
    enabled: true
    includeCollectorLogs: true

config:
  exporters:
    debug: {}
    # Enable OTLP HTTP exporter
  service:
    pipelines:
      logs:
        exporters:

nodeSelector:
  kubernetes.io/hostname: xxxxxx
tolerations:
  - key: "node-role.kubernetes.io/control-plane"
    effect: "NoSchedule"
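One thing that may be worth checking in a setup like this, offered as an assumption rather than a confirmed diagnosis: the debug exporter prints only brief summaries by default, and the pipelines excerpt in this comment mentions only `logs`. The sketch below raises the exporter's verbosity and names the debug exporter explicitly in a `metrics` pipeline. It assumes the chart merges these keys with the receivers added by the `kubeletMetrics` preset.

```yaml
# values.yaml fragment (sketch, not a complete file)
config:
  exporters:
    debug:
      # "detailed" prints individual data points instead of summary counts
      verbosity: detailed
  service:
    pipelines:
      metrics:
        # Assumption: receivers for this pipeline come from the presets
        exporters: [debug]
```

With this in place, the collector pod logs should show individual metric data points if the kubeletstats scrape is succeeding.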

Adarsh-verma-14 commented 2 months ago

Hi @shbasha-clgx, you need to make sure that the configuration for collecting and exporting metrics is correct.

Adarsh-verma-14 commented 2 months ago

I also reproduced it using this values.yaml file:

mode: daemonset

image:
  repository: "otel/opentelemetry-collector-contrib"
  tag: "latest"

presets:
  kubernetesAttributes:
    enabled: true
  kubeletMetrics:
    enabled: true
  hostMetrics:
    enabled: true
  logsCollection:
    enabled: true
    includeCollectorLogs: true

config:
  receivers:
    kubeletstats:
      insecure_skip_verify: true
  processors:
    batch: {}
  exporters:
    debug: {}
    # Enable OTLP HTTP exporter
  service:
    pipelines:
      metrics:
        receivers: [kubeletstats]
        processors: [batch]
        exporters: [debug]
      logs:
        processors: [batch]
        exporters: [debug]

nodeSelector:
  kubernetes.io/hostname: xxxxxx
tolerations:

It still fails on my side due to a memory_limiter issue, but you can try it; it may work in your case. Alternatively, you can use this file as a reference for configuring metrics collection and export. I am also trying to resolve the memory_limiter issue.

shbasha-clgx commented 2 months ago

@Adarsh-verma-14 Thanks, please let me know if you have any updates.

shbasha-clgx commented 2 months ago

@Adarsh-verma-14 I tried the file you shared, but we still cannot see the metrics details when we check the logs of the OTel Collector pod. Kindly assist.

dgrisonnet commented 2 months ago

/assign @dashpole /triage accepted

@dashpole this is related to otel collector, could you perhaps guide them to the right place?

dashpole commented 2 months ago

I would recommend posting in the OpenTelemetry Slack channel for the collector: https://app.slack.com/client/T08PSQ7BQ/C01N6P7KR6W, or opening an issue with the collector and tagging the kubeletstats receiver: https://github.com/open-telemetry/opentelemetry-collector-contrib

The authors of the receiver should be able to help you.

k8s-ci-robot commented 4 weeks ago

@ibihim: Closing this issue.

In response to [this](https://github.com/kubernetes/kubernetes/issues/125956#issuecomment-2338527639):

>Thanks for your pull request.
>
>However, [trivial](https://github.com/kubernetes/community/blob/master/contributors/guide/pull-requests.md#trivial-edits) changes like this should be avoided as we have very limited resources to handle reviews and approvals. Focus efforts on fixes that have meaningful impact on end users (bugs, docs, etc).
>
>(Sig-Auth Triage decision)
>
>/close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

ibihim commented 4 weeks ago

This does not appear to be an issue in Kubernetes.

Please try asking on the support channels: https://github.com/kubernetes/kubernetes/blob/master/SUPPORT.md

/kind support /close