Do you have any output from the kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" command?
Could you please send the logs from the metrics-server-exporter container itself?
@deadc Here is the output from kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
  },
  "items": [
    {
      "metadata": {
        "name": "aks-linux-node-pool-name",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/aks-linux-node-pool-name",
        "creationTimestamp": "2019-11-07T21:01:07Z"
      },
      "timestamp": "2019-11-07T21:00:00Z",
      "window": "1m0s",
      "usage": {
        "cpu": "607m",
        "memory": "2670668Ki"
      }
    },
    {
      "metadata": {
        "name": "aks-win-server",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/aks-win-server",
        "creationTimestamp": "2019-11-07T21:01:07Z"
      },
      "timestamp": "2019-11-07T21:00:00Z",
      "window": "1m0s",
      "usage": {
        "cpu": "80m",
        "memory": "1115608Ki"
      }
    }
  ]
}
All I have on the console for the metrics-server-exporter is this:
The selected container has not logged any messages yet.
Have you tried setting the K8S_ENDPOINT variable to your API endpoint?
I believe there may be either an API communication issue or an authorization issue with the service account.
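For reference, setting that usually means adding an environment variable to the exporter container, roughly like the sketch below (the container and image names are placeholders, and the exact value format the exporter expects is an assumption; https://kubernetes.default.svc is the in-cluster API server address):

# Sketch only: add K8S_ENDPOINT to the metrics-server-exporter Deployment.
spec:
  template:
    spec:
      containers:
        - name: metrics-server-exporter   # placeholder
          image: <exporter-image>         # placeholder
          env:
            - name: K8S_ENDPOINT
              value: "https://kubernetes.default.svc"   # assumed value format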
@deadc I added a pull request (#30) to show the changes I made. After this is closed, it should be deleted.
In the changes I added a namespace so I could control where the permissions were created. After that I was able to see metrics in the output of the pod.
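For anyone else hitting this, the shape of that change is roughly the following (a sketch only; see PR #30 for the actual diff, and treat the resource names, ClusterRole, and namespace as placeholders):

# Sketch: give the ServiceAccount (and the binding subject) an explicit namespace
# so the permissions are created where you expect. Names are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server-exporter
  namespace: <your-namespace>
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-server-exporter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: <clusterrole-from-the-deployment-yaml>
subjects:
  - kind: ServiceAccount
    name: metrics-server-exporter
    namespace: <your-namespace>   # must match the ServiceAccount above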
One other issue: I have multiple pod instances running, but in the metrics view I only see one of them.
Here is a sample:
# HELP kube_metrics_server_pods_cpu Metrics Server Pods CPU
# TYPE kube_metrics_server_pods_cpu gauge
kube_metrics_server_pods_cpu{pod_container_name="console-runner",pod_name="console-runner-d7b7c4ff-5zjmx",pod_namespace="real-namespace"} 16.0
Would you expect there to be 1 line per pod?
Yes, that's expected: one metric per pod. As for the other issue you mentioned, it's probably a problem with your metrics-server. If the command kubectl top pods -n <namespace>
shows more than one pod, then metrics-server-exporter should expose all of those pods.
Basically, what metrics-server-exporter does is collect the data from the metrics-server apiserver endpoints (e.g. kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes") and expose it as Prometheus metrics.
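If kubectl top pods shows the pods but the exporter does not, it can also help to check the raw pods endpoint (the /pods sibling of the /nodes call above):

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods"

If the pods show up there, the problem is more likely on the exporter or scrape side rather than in metrics-server itself.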
@deadc thank you for the top pods command; it helped me see I was looking at the wrong cluster. Everything is working perfectly. Thank you so much for your help.
I am using GKE and facing the same issue, with no output in the dashboard.
I am getting all the values from kubectl top pods, and I also tried the metrics exporter endpoint, which returns the following:
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 25755648.0
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 21860352.0
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1598950513.21
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 1.3699999999999999
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 6.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1048576.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="6",patchlevel="10",version="3.6.10"} 1.0
# HELP kube_metrics_server_response_time Metrics Server API Response Time
# TYPE kube_metrics_server_response_time gauge
kube_metrics_server_response_time{api_url="https://kubernetes.default.svc/metrics.k8s.io"} 0.072846
# HELP kube_metrics_server_nodes_mem Metrics Server Nodes Memory
# TYPE kube_metrics_server_nodes_mem gauge
# HELP kube_metrics_server_nodes_cpu Metrics Server Nodes CPU
# TYPE kube_metrics_server_nodes_cpu gauge
# HELP kube_metrics_server_pods_mem Metrics Server Pods Memory
# TYPE kube_metrics_server_pods_mem gauge
# HELP kube_metrics_server_pods_cpu Metrics Server Pods CPU
# TYPE kube_metrics_server_pods_cpu gauge
Not sure what I am missing. In Prometheus, I have used:
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: monitoring
data:
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - "/etc/prometheus-rules/*.rules"
    alerting:
      alertmanagers:
        - scheme: http
          path_prefix: /
          static_configs:
            - targets: ['alertmanager:9093']
    scrape_configs:
      - job_name: 'prometheus'
        scrape_interval: 5s
        static_configs:
          - targets: ['localhost:9090']
      - job_name: 'kubernetes-cadvisor'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc.cluster.local:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      - job_name: 'kubernetes-kube-state'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
          - source_labels: [__meta_kubernetes_pod_label_grafanak8sapp]
            regex: .*true.*
            action: keep
          - source_labels: ['__meta_kubernetes_pod_label_daemon', '__meta_kubernetes_pod_node_name']
            regex: 'node-exporter;(.*)'
            action: replace
            target_label: nodename
      - job_name: 'kubernetes-kubelet'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          insecure_skip_verify: true
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc.cluster.local:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics
      - job_name: 'kubernetes-service-endpoints'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: pod
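One thing worth double-checking with a config like this: the kubernetes-service-endpoints job only keeps targets whose Service carries the prometheus.io/scrape annotation, so the exporter's Service would need something like the sketch below (service name, namespace, selector, and port are assumptions; port 8000 matches the port-forward example elsewhere in this thread):

apiVersion: v1
kind: Service
metadata:
  name: metrics-server-exporter   # placeholder
  namespace: default              # placeholder
  annotations:
    prometheus.io/scrape: "true"  # required by the 'keep' relabel rule above
    prometheus.io/port: "8000"    # port the exporter listens on (assumed)
spec:
  selector:
    app: metrics-server-exporter  # placeholder
  ports:
    - name: metrics
      port: 8000
      targetPort: 8000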
I am sure this is a configuration issue.
Environment
I am running Kubernetes v1.14.6 (in Azure AKS, if that matters).
YAML changes
I installed all 3 yaml files from the deployment directory. The only change I made from the master branch is in the deployment yaml: under spec/template/spec I added the snippet below, which is required since we have Windows node pools.
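(Presumably something along these lines; a hypothetical sketch, assuming the addition pins the exporter to the Linux node pool:)

# Assumption: schedule the exporter only on Linux nodes in a mixed Linux/Windows cluster.
spec:
  template:
    spec:
      nodeSelector:
        beta.kubernetes.io/os: linux   # kubernetes.io/os also works on newer clusters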
I also updated the image tag from v0.0.5 to v0.0.6.
Output
If I port forward to port 8000 on the pod
kubectl --namespace default port-forward metrics-server-exporter-real-pod-name 8000
Then I see the following output. I know these values are being scraped because I can see python_info in my Prometheus server. There are no messages on the console of the container.
Request
Is there a way to enable debug output so I can troubleshoot the issue?