Open RainofTerra opened 2 months ago
Pinging code owners:
receiver/kubeletstats: @dmitryax @TylerHelmuth @ChrsMark
See Adding Labels via Comments if you do not have permissions to add labels yourself.
If I understand this correctly, the proposal is to emit a metric called system.disk.operations with proper container and k8s metadata as attributes?
My concern here is that we should first come up with a valid data model. At the moment the system.* namespace is supposed to be used for metrics that relate to a system, host, VM, etc. as a whole. Then we have the process.* namespace for per-process metrics. So in that case I assume we should emit per-container/per-pod metrics, right?
On another note, I wonder if this metric could be obtained directly by scraping cAdvisor's Prometheus endpoint: https://github.com/google/cadvisor/blob/master/docs/storage/prometheus.md#prometheus-container-metrics. In that case, wouldn't this already be possible using the prometheus receiver?
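To make the cAdvisor idea concrete, here is a minimal Python sketch of grouping scraped samples by pod and container. It assumes the endpoint exposes container_fs_reads_total with `pod` and `container` labels; the label names depend on how the endpoint is exposed and relabeled, and the sample values below are made up for illustration. It is not how the prometheus receiver works internally, only a picture of the resulting grouping.

```python
import re
from collections import defaultdict

# Matches a labeled sample line such as: name{k="v",...} 123
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def ops_by_container(exposition, metric="container_fs_reads_total"):
    """Sum one metric per (pod, container), across all devices."""
    totals = defaultdict(float)
    for line in exposition.splitlines():
        m = SAMPLE_LINE.match(line)
        if not m or m.group("name") != metric:
            continue  # comments, other metrics, unlabeled samples
        labels = dict(LABEL.findall(m.group("labels")))
        # "pod" and "container" label names are assumptions (see lead-in).
        key = (labels.get("pod", ""), labels.get("container", ""))
        totals[key] += float(m.group("value"))
    return dict(totals)


# Made-up scrape output, for illustration only.
SCRAPE = """\
# TYPE container_fs_reads_total counter
container_fs_reads_total{container="reader",device="/dev/nvme0n1",pod="db-0"} 120
container_fs_reads_total{container="writer",device="/dev/nvme0n1",pod="db-0"} 7
container_fs_writes_total{container="writer",device="/dev/nvme0n1",pod="db-0"} 300
"""

reads = ops_by_container(SCRAPE)
```

With this shape, the reader and writer containers from the issue description would show up as separate series rather than as one node-level total.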
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Component(s)
receiver/hostmetrics, receiver/kubeletstats
Is your feature request related to a problem? Please describe.
In the past we have used something like Telegraf with an iostats plugin to monitor system-wide I/O statistics (IOPS, throughput, etc.) on servers running high-I/O services (like our internal datastore, or Kafka). In Kubernetes (we're using EKS) that data is available at the various cgroup levels via io.stat.
Pod level:
Container level:
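As a rough illustration of what is available in those files: a cgroup v2 io.stat file holds one line per block device, keyed by major:minor number, followed by key=value counters such as rbytes, wbytes, rios and wios. Below is a minimal Python sketch of parsing that format. The sample line is made up, and `parse_io_stat` is a hypothetical helper, not part of any receiver.

```python
def parse_io_stat(text):
    """Parse cgroup v2 io.stat content into {device: {counter: int}}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line, or a device with no counters
        device, pairs = fields[0], fields[1:]
        stats[device] = {
            key: int(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return stats


# Made-up sample in the io.stat format, for illustration only.
SAMPLE = "259:0 rbytes=1048576 wbytes=4096 rios=12 wios=3 dbytes=0 dios=0\n"

io = parse_io_stat(SAMPLE)
read_ops = io["259:0"]["rios"]
write_ops = io["259:0"]["wios"]
```

Reading the same file at the pod cgroup and at each container cgroup would give the per-pod and per-container operation counts described in this request.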
Describe the solution you'd like
It would be useful to be able to take something like system.disk.operations and group it by pod name and container name. Currently we can only get it for the overall node. This would let us monitor the I/O of individual containers (we have both a reader and a writer container, and we'd like to see their I/O separately).
Describe alternatives you've considered
No response
Additional context
No response