Closed: yashwanth-mannem closed this issue 1 week ago

/kind support
/remove-kind bug
/triage accepted
/assign
This issue has not been updated in over 1 year, and should be re-triaged.
You can:
- Confirm that this issue is still relevant with `/triage accepted` (org members only)
- Close this issue with `/close`

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted
@yashwanth-mannem sorry no one was able to get to this issue. If you are still experiencing the problem, feel free to re-open.
/close
@dashpole: Closing this issue.
What happened?:
During communication between the API server and the Prometheus adapter, a certificate validation error was observed. The error indicated that the certificate was expired. However, upon checking, the certificate in use was found to be valid and not expired.
What did you expect to happen?:
The Prometheus adapter should have successfully authenticated with the Kubernetes API server using the valid certificate, and metrics retrieval should have been successful.
Please provide the prometheus-adapter config:
prometheus-adapter config
```yaml
apiVersion: v1
data:
  config.yaml: |
    resourceRules:
      cpu:
        containerLabel: container
        containerQuery: |
          sum by (<<.GroupBy>>) (
            irate (
              container_cpu_usage_seconds_total{<<.LabelMatchers>>,container!="",pod!=""}[120s]
            )
          )
        nodeQuery: |
          sum by (<<.GroupBy>>) (
            1 - irate(
              node_cpu_seconds_total{mode="idle"}[60s]
            )
          )
          or sum by (<<.GroupBy>>) (
            node:windows_node_cpu_utilisation:avg5m{mode="idle",job="wmi-exporter",<<.LabelMatchers>>}
          )
        resources:
          overrides:
            instance:
              resource: node
            namespace:
              resource: namespace
            pod:
              resource: pod
      memory:
        containerLabel: container
        containerQuery: |
          sum by (<<.GroupBy>>) (
            container_memory_working_set_bytes{<<.LabelMatchers>>,container!="",pod!=""}
          )
        nodeQuery: |
          sum by (<<.GroupBy>>) (
            node_memory_MemTotal_bytes{job="node-exporter",<<.LabelMatchers>>}
            - node_memory_MemAvailable_bytes{job="node-exporter",<<.LabelMatchers>>}
          )
          or sum by (<<.GroupBy>>) (
            node:windows_node_memory_utilization{job="wmi-exporter",<<.LabelMatchers>>}
          )
        resources:
          overrides:
            instance:
              resource: node
            namespace:
              resource: namespace
            pod:
              resource: pod
      window: 5m
kind: ConfigMap
metadata:
  annotations:
    helm.fluxcd.io/antecedent: prometheus:helmrelease/prometheus-adapter
    meta.helm.sh/release-name: prometheus-adapter
    meta.helm.sh/release-namespace: prometheus
  creationTimestamp: "2023-06-21T23:04:00Z"
  labels:
    app: prometheus-adapter
    app.kubernetes.io/managed-by: Helm
    chart: prometheus-adapter-2.6.2
    heritage: Helm
    release: prometheus-adapter
  name: prometheus-adapter
  namespace: prometheus
```

Please provide the HPA resource used for autoscaling:
HPA yaml
Not set up. We notice the issue while executing `kubectl top nodes`:

```
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
```

Please provide the HPA status:
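The `ServiceUnavailable` error from `kubectl top nodes` usually means the aggregated metrics APIService is missing or marked unavailable. A quick way to check, assuming the adapter is meant to register `v1beta1.metrics.k8s.io` (adjust the name if your setup differs):

```shell
# List the metrics APIService and its AVAILABLE column.
kubectl get apiservice v1beta1.metrics.k8s.io

# Inspect the Available condition's message for the root cause,
# e.g. a failing TLS handshake to the backing adapter service.
kubectl get apiservice v1beta1.metrics.k8s.io \
  -o jsonpath='{.status.conditions[?(@.type=="Available")].message}'
```

If the APIService object is absent entirely (as noted below under "Other info"), the adapter's Helm chart may not have created it, or it was deleted.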
NA
Please provide the prometheus-adapter logs with -v=6 around the time the issue happened:
prometheus-adapter logs
```
E0614 20:26:50.160879       1 authentication.go:53] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid: current time 2023-06-14T20:26:50Z is after 2022-09-20T21:38:27Z, verifying certificate SN=591688138426063623, SKID=, AKID= failed: x509: certificate has expired or is not yet valid: current time 2023-06-14T20:26:50Z is after 2022-09-20T21:38:27Z]
```

Anything else we need to know?:
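Note that this log line comes from the adapter's delegated-authentication path: the expired certificate is one presented *to* the adapter on an incoming request (typically the API server's front-proxy client cert), not necessarily the adapter's own serving cert. A sketch of how to inspect the CA material the adapter reads, assuming a standard cluster where it lives in the `extension-apiserver-authentication` ConfigMap in `kube-system`:

```shell
# Print the validity window and serial of the requestheader client CA
# the adapter uses for delegated authentication. (If the ConfigMap key
# holds a bundle, openssl prints only the first certificate.)
kubectl -n kube-system get configmap extension-apiserver-authentication \
  -o jsonpath='{.data.requestheader-client-ca-file}' \
  | openssl x509 -noout -dates -serial

# The log reports the serial in decimal; openssl prints hex, so convert
# SN=591688138426063623 before comparing.
printf 'SN (hex): %x\n' 591688138426063623
```

If the serial matches, the expired certificate is on the API-server/front-proxy side rather than in the adapter's own Secret, which would explain why the adapter's cert looks valid on inspection.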
We have ensured that the certificate is not expired and is located at the path configured in the adapter. We have also verified Secret/ConfigMap synchronization, the certificate rotation process, time synchronization across nodes, and the validity of the certificate chain. The problem persists.
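For reference, the check behind the error is a plain x509 validity-window comparison, which can be reproduced locally with openssl. A self-contained sketch using a throwaway self-signed cert and hypothetical `/tmp` paths; substitute your real certificate path:

```shell
# Generate a throwaway self-signed cert valid for 1 day (stand-in for
# the real cert being investigated).
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo-key.pem \
  -out /tmp/demo-cert.pem -days 1 -subj "/CN=demo" 2>/dev/null

# Print the validity window and serial number; the adapter log reports
# the same fields (notBefore/notAfter dates and SN=...).
openssl x509 -noout -dates -serial -in /tmp/demo-cert.pem

# -checkend N exits 0 if the cert is still valid N seconds from now and
# 1 if it will have expired -- the same comparison the error reflects.
openssl x509 -checkend 60 -in /tmp/demo-cert.pem && echo "still valid"
```

Running this against every certificate in the chain (server cert, client cert, and each CA) pinpoints which one actually has the `2022-09-20T21:38:27Z` notAfter from the log.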
Environment:
- prometheus-adapter version: 2.6.2
- prometheus version: v0.38.1
- Kubernetes version (use `kubectl version`): v1.20.5
- Cloud provider or hardware configuration: vSphere
- Other info: Verified all the dependent manifests and resources; everything looks fine, but we do not see the metrics APIService in apiregistration.