tikv / rust-prometheus

Prometheus instrumentation library for Rust applications
Apache License 2.0

Support for multiprocess statistics #483

Open EvanCarroll opened 1 year ago

EvanCarroll commented 1 year ago

Currently, the process metrics are not keyed by PID: the collector assumes a single PID. It would be desirable to have metrics that can report on all PIDs in the same proc namespace.

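use axum::response::{Html, IntoResponse};
use prometheus::{process_collector::ProcessCollector, Encoder, TextEncoder};
// sysinfo releases before 0.30 also need the SystemExt and PidExt traits in scope.
use sysinfo::System;

pub async fn metrics() -> impl IntoResponse {
  let encoder = TextEncoder::new();
  let sys = System::new_all();

  // Try to register one ProcessCollector per process visible to sysinfo.
  for p in sys.processes().keys() {
    let pc = ProcessCollector::new(p.as_u32() as i32, "");
    // The Result of register() is discarded here.
    let _ = prometheus::register(Box::new(pc));
  }
  let metric_families = prometheus::gather();
  let mut buffer = vec![];
  encoder.encode(&metric_families, &mut buffer).unwrap();

  Html(buffer)
}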

Code like the above registers a separate ProcessCollector for each process, but ProcessCollector doesn't support that:

https://github.com/tikv/rust-prometheus/blob/a11df02f2736e31321fe6d71ca170ac3c67d97f4/src/process_collector.rs#L45

It just outputs the same names and descriptions regardless of the process it's collecting on. Worse, it doesn't seem to output the PID at all.

There should be a mode for multi-process collection, so that rather than,

# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1048576
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 167
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 20135936
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1679472456
# HELP process_threads Number of OS threads in the process.
# TYPE process_threads gauge
process_threads 21
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1489252352

you get something like,

pid_2938-process_virtual_memory_bytes 1489252352
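One caveat on the exact shape above: Prometheus metric names cannot contain a hyphen, so a per-process mode would more likely carry the PID as a label than as a name prefix. Below is a minimal sketch of that output shape, using only the standard library. It is not rust-prometheus API; `PidSample` and `render_family` are hypothetical names made up for illustration.

```rust
use std::fmt::Write;

/// One sample for a single process (hypothetical type for this sketch).
pub struct PidSample {
    pub pid: u32,
    pub value: f64,
}

/// Render one metric family in the Prometheus text exposition format,
/// attaching the PID as a `pid` label instead of encoding it in the name.
/// (Hypothetical helper; not part of rust-prometheus.)
pub fn render_family(name: &str, help: &str, kind: &str, samples: &[PidSample]) -> String {
    let mut out = String::new();
    // HELP and TYPE are written once per family, not once per process.
    writeln!(out, "# HELP {} {}", name, help).unwrap();
    writeln!(out, "# TYPE {} {}", name, kind).unwrap();
    for s in samples {
        // `{{` and `}}` are escaped braces, producing name{pid="..."} value
        writeln!(out, "{}{{pid=\"{}\"}} {}", name, s.pid, s.value).unwrap();
    }
    out
}
```

Calling `render_family("process_virtual_memory_bytes", "Virtual memory size in bytes.", "gauge", &samples)` with one sample for PID 2938 and the value from the output above would produce the usual HELP and TYPE lines followed by `process_virtual_memory_bytes{pid="2938"} 1489252352`. Additional processes would add one sample line each under the same family.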

Here is how the upstream Python client handles multi-process mode: https://github.com/prometheus/client_python/blob/master/README.md#multiprocess-mode-eg-gunicorn

If nothing else, we should at least document that the Rust client only supports a single process. Multi-process support would be useful if you're trying to track sidecars in a container.