The Bug
As the title says, ContainerInsights metrics go missing when collection_interval is set to 300s, particularly the CPU metrics.
We followed the instructions in the documentation, which resulted in a ConfigMap similar to the EKS example infra file. Only the ConfigMap is attached here for reference:
Steps to reproduce
Set collection_interval to 60s and record all the metrics for at least 15 minutes.
Set collection_interval to 300s and record all the metrics for at least 15 minutes.
Result
At an interval of 60s, the number of CPU-related metrics is on par with the number of memory-related metrics.
However, when the collection interval is set to 300s, the number of CPU-related metrics drops sharply compared to the memory-related metrics, sometimes to as few as 1 every 15 minutes.
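For a sense of scale, here is a rough sketch of how many datapoints each setting should yield in a 15-minute window. It assumes one datapoint per metric per collection cycle; that assumption and the helper names are illustrative only and are not taken from the receiver's code.

```python
def expected_datapoints(window_s: int, interval_s: int) -> int:
    """Datapoints expected per metric in a window, assuming one per collection cycle."""
    return window_s // interval_s


def missing_ratio(observed: int, window_s: int, interval_s: int) -> float:
    """Fraction of expected datapoints that never arrived (0.0 means none missing)."""
    expected = expected_datapoints(window_s, interval_s)
    if expected == 0:
        return 0.0
    return max(0.0, 1 - observed / expected)


WINDOW_S = 15 * 60  # the 15-minute recording window from the steps above

if __name__ == "__main__":
    for interval_s in (60, 300):
        count = expected_datapoints(WINDOW_S, interval_s)
        print(f"interval={interval_s}s expected={count}")
    # Worst case described above: a single CPU datapoint in 15 minutes at 300s.
    print(f"missing at 300s: {missing_ratio(1, WINDOW_S, 300):.0%}")
```

Under that assumption, a 300s interval should still yield about 3 datapoints per metric in 15 minutes, so seeing only 1 would mean roughly two thirds of the CPU datapoints are missing.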
Additional context
We worked with AWS Premium Support, and their team verified that they see the same issue on their test EKS cluster. We are creating this issue in this GitHub repo on their recommendation.