Open maxout123 opened 2 weeks ago
Job created in config
- job_name: podScrape/mynamespace/target-test/0
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names:
      - mynamespace
  honor_labels: false
  relabel_configs:
  - action: drop
    source_labels:
    - __meta_kubernetes_pod_phase
    regex: (Failed|Succeeded)
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_label_app
    regex: target-test
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_label_app_kubernetes_io_instance
    regex: target-test
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_label_app_kubernetes_io_name
    regex: target-test
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_container_port_name
    regex: metrics
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - target_label: job
    replacement: mynamespace/target-test
  - target_label: endpoint
    replacement: metrics
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: instance
  - target_label: env
    replacement: prod
  metric_relabel_configs:
  - source_labels:
    - __name__
    target_label: __name__
    replacement: custom_$1
Hi, the entries in the Discovered Targets UI don't affect actual scraping (which correctly stops when you delete the podScrape), but they do persist in the UI.
I understand that it doesn't affect actual scraping, but why keep garbage?
Another example is deleted pods in serviceScrape.
For example, if HPA is scaling ingress-nginx-controller, hundreds of pods will have existed in the past. Why are all of them still shown in Discovered Targets?
serviceScrape/monitoring/ingress-nginx-controller/0 (3/370 active)
Nearly all of these 370 ingress pods were deleted by HPA a long time ago, but they are still shown in Discovered Targets.
We have hundreds of serviceScrapes, and each of them remembers dead pods forever. The Discovered Targets page contains thousands of obsolete records and is very slow. It all looks strange and inconvenient.
The message "Dropped targets' list below is incomplete, because the number of dropped targets exceeds -promscrape.maxDroppedTargets=10000." appears on the Discovered Targets page, but 9900 of those targets were deleted a long time ago.
It seems this can be improved. @maxout123, do you still see the relevant scrape config on the /config page after deleting the VMPodScrape or VMServiceScrape?
I see this not as garbage, but as diagnostic information. Retaining the history of dropped targets is important for troubleshooting short-lived targets, as it provides context on why targets were previously dropped. You can adjust the size via the -promscrape.maxDroppedTargets flag. Previously, entries were evicted based on time, but this has changed to a limit-based eviction.
> It seems this can be improved. @maxout123, do you still see the relevant scrape config on the /config page after deleting the VMPodScrape or VMServiceScrape?
No, there is no mention of the deleted scrape configs on the /config page, but they are still shown on the Discovered Targets page. I have no idea why they need to be shown when they no longer exist in the config.
> I see this not as garbage, but as diagnostic information. Retaining the history of dropped targets is important for troubleshooting short-lived targets, as it provides context on why targets were previously dropped. You can adjust the size via the -promscrape.maxDroppedTargets flag. Previously, entries were evicted based on time, but this has changed to a limit-based eviction.
Should I set -promscrape.maxDroppedTargets=0 to hide all deleted pods on the /service-discovery page?
No, the discovered targets page shows historical targets for debugging purposes; it does not differentiate between deleted and active pods.
OK, should I set -promscrape.maxDroppedTargets=0 to hide all dropped targets on the /service-discovery page? Are there any unexpected issues with that setting?
Yes, it's safe to do so
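For readers who run vmagent through the VictoriaMetrics operator (which the VMPodScrape and VMServiceScrape objects in this thread suggest), the flag discussed above could be passed roughly as in the fragment below. This is only a sketch under assumptions: the resource name and namespace are placeholders, the rest of the VMAgent spec is omitted, and it relies on the operator's extraArgs field to forward command-line flags. With a standalone vmagent, the equivalent would be adding -promscrape.maxDroppedTargets=0 to its command line.

```yaml
# Fragment of a VMAgent resource; merge into your existing object.
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAgent
metadata:
  name: example-vmagent   # placeholder, not taken from this thread
  namespace: monitoring   # placeholder
spec:
  extraArgs:
    # Forwarded to vmagent as -promscrape.maxDroppedTargets=0.
    # Values in extraArgs are strings, hence the quotes.
    promscrape.maxDroppedTargets: "0"
```

As noted above, this only hides dropped targets; the page still does not distinguish deleted pods from active ones among the remaining entries.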
Is your question request related to a specific component?
vmagent, kubernetes_sd_discovery
Describe the question in detail
Why does it happen? How to remove obsolete target entries completely?