google / cadvisor

Analyzes resource usage and performance characteristics of running containers.

Multiple Metrics die down in just 5 minutes #3581

Closed by jayjoshi64 2 months ago

jayjoshi64 commented 3 months ago

When I start the container, all the data is collected correctly. Then, one by one, the containers start disappearing. Within 5 minutes, all the containers are usually gone from the graphs.

These are the counters which are affected:

[Screenshot: Aug-20-2024 22-33-42]

My setup:

chengjoey commented 2 months ago

Could you please provide the cAdvisor logs?

jayjoshi64 commented 2 months ago

Looks like the issue was with scrape_interval. Previously I had it set to 30s, but after reducing it to 15s the issue seems to be resolved. I don't know the reason behind it, though.

I also tried the approach for reducing resource usage described in https://github.com/google/cadvisor/issues/2523, which might have helped resolve the issue as well.

Edit: one thing I noticed during my own troubleshooting is that the metrics were still available; they just didn't have the correct container_name attached. Most of the metrics started having empty container names, which was the main reason Grafana stopped showing them on the dashboard.
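For anyone who wants to check for the same symptom, here is a rough sketch (not from the original thread) that counts how many samples in a Prometheus text-format dump have a missing or empty container label. The function name `count_empty_label`, the `SAMPLE` text, and the default label `container_name` are assumptions made for illustration; depending on your setup the label your dashboard filters on may be called something else (for example `name`), so adjust accordingly.

```python
import re
from collections import Counter

# Matches one label pair such as key="value" inside the {...} part of a sample line.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Made-up sample in Prometheus text exposition format, for illustration only.
SAMPLE = """\
# HELP container_cpu_usage_seconds_total Cumulative cpu time consumed.
container_cpu_usage_seconds_total{container_name="web",id="/docker/abc"} 12.5
container_cpu_usage_seconds_total{container_name="",id="/docker/def"} 3.0
container_memory_usage_bytes{id="/"} 1024
"""


def count_empty_label(metrics_text, label="container_name"):
    """Count, per metric name, the samples whose `label` is missing or empty."""
    empty = Counter()
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, brace, rest = line.partition("{")
        if not brace:
            continue  # sample carries no labels at all
        # Everything before the last "}" is the label set.
        labels = dict(LABEL_RE.findall(rest.rsplit("}", 1)[0]))
        if not labels.get(label):
            empty[name] += 1
    return empty


if __name__ == "__main__":
    for metric, n in count_empty_label(SAMPLE).items():
        print(f"{metric}: {n} sample(s) without {'container_name'!r}")
```

To run it against real data, replace `SAMPLE` with the text saved from cAdvisor's `/metrics` endpoint. If the counts grow over the first few minutes after startup, that would match the behaviour described above, where the series still exist but no longer match the dashboard's container filter.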