prometheus-community / systemd_exporter

Exporter for systemd unit metrics
Apache License 2.0

memory usage stats are incorrect #46

Closed: anarcat closed this issue 1 year ago

anarcat commented 2 years ago

I set up this exporter to diagnose OOM conditions on a server, but the output it gives me is inconsistent with the stats I'm getting through other systems. In particular, the memory numbers just don't add up to the actual memory usage on the machine.

I'm not sure, but I think this might be related to #2, except that I don't think this is just a small adjustment to switch to cgroups: the current stats don't reflect reality in any meaningful way, so I think they're simply buggy.

Just to give an example: right now, postgres is taking up 2.3GB of memory according to systemctl:

root@materculae:~# systemctl status postgresql@13-main.service 
● postgresql@13-main.service - PostgreSQL Cluster 13-main
     Loaded: loaded (/lib/systemd/system/postgresql@.service; enabled-runtime; vendor preset: enabled)
     Active: active (running) since Tue 2022-05-10 12:23:23 UTC; 2h 51min ago
    Process: 468675 ExecStart=/usr/bin/pg_ctlcluster --skip-systemctl-redirect 13-main start (code=exited, status=0/SUCCESS)
   Main PID: 468680 (postgres)
      Tasks: 13 (limit: 4675)
     Memory: 2.3G
        CPU: 23min 30.783s
     CGroup: /system.slice/system-postgresql.slice/postgresql@13-main.service
             ├─468680 /usr/lib/postgresql/13/bin/postgres -D /var/lib/postgresql/13/main -c config_file=/etc/postgresql/13/main/postgresql.conf
             ├─468691 postgres: checkpointer
             ├─468692 postgres: background writer
             ├─468693 postgres: walwriter
             ├─468694 postgres: autovacuum launcher
             ├─468695 postgres: archiver last was 000000010000053F0000008F
             ├─468696 postgres: stats collector
             ├─468697 postgres: logical replication launcher
             ├─468734 postgres: prometheus postgres [local] idle
             ├─469022 postgres: exonerator-web exonerator 127.0.0.1(58176) idle
             ├─469071 postgres: exonerator-web exonerator 127.0.0.1(58188) idle
             ├─471129 postgres: exonerator-web exonerator 127.0.0.1(58370) idle
             └─471183 postgres: exonerator exonerator 127.0.0.1(58376) SELECT

[...]

... but the exporter is only reporting 21MB RSS and 560MB VSS, so it's obviously way off:

root@materculae:~# curl -s localhost:9558/metrics | grep postgresql@ | grep memory
systemd_process_resident_memory_bytes{name="postgresql@13-main.service"} 2.0967424e+07
systemd_process_virtual_memory_bytes{name="postgresql@13-main.service"} 5.60779264e+08
systemd_process_virtual_memory_max_bytes{name="postgresql@13-main.service"} -1
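
For comparison, the number systemctl shows comes straight from the unit's cgroup. Here is a minimal sketch that reads the same counter; it assumes cgroups v2 with the unified hierarchy mounted at /sys/fs/cgroup, and the path is copied from the CGroup line in the status output above:

package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

func main() {
	// Path copied from the CGroup: line in the systemctl status output above;
	// assumes a unified (v2) hierarchy mounted at /sys/fs/cgroup.
	path := "/sys/fs/cgroup/system.slice/system-postgresql.slice/postgresql@13-main.service/memory.current"
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	n, err := strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("memory.current: %d bytes (%.2f GiB)\n", n, float64(n)/(1<<30))
}

That counter is what the "Memory: 2.3G" line reports, and it is nowhere near the 21MB the exporter gives me.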

I used this tool to track down an issue we're facing, but it seems that, unfortunately, I'll have to look elsewhere...

Thanks for any clarification.

anarcat commented 2 years ago

To expand on this, look at:

https://github.com/povilasv/systemd_exporter/blob/d4b06488e59ab3e18ea59a5bd9a7d3c86e894356/systemd/systemd.go#L483-L485

It seems the problem is that we're reading stats only for the main PID, which obviously fails for cases like postgresql or apache (which start multiple processes) or cron jobs (which necessarily start a subprocess).

So I guess it's separate from #2, in the sense that it could be fixed by implementing the above TODO and just adding up the memory of all the processes in the slice by hand (see the sketch below), without reimplementing everything with cgroups, which seems to be stalled in #10...
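
Here is a rough sketch of what that could look like. This is a hypothetical helper, not the exporter's actual code; it assumes cgroups v2 and reads RSS from /proc/<pid>/statm. Note that naively summing per-process RSS double-counts pages shared between processes (e.g. postgres shared buffers), so it is an upper bound rather than the cgroup's own accounting:

package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// cgroupRSS sums resident memory over every PID listed in the unit's
// cgroup, instead of looking only at the main PID. Hypothetical helper,
// assuming cgroups v2. Summing per-process RSS double-counts shared
// pages, so this overestimates compared to the cgroup's own counter.
func cgroupRSS(cgroupPath string) (uint64, error) {
	f, err := os.Open(cgroupPath + "/cgroup.procs")
	if err != nil {
		return 0, err
	}
	defer f.Close()

	pageSize := uint64(os.Getpagesize())
	var total uint64
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		pid := strings.TrimSpace(scanner.Text())
		// Field 2 of /proc/<pid>/statm is the resident set size in pages.
		data, err := os.ReadFile("/proc/" + pid + "/statm")
		if err != nil {
			continue // the process may have exited in the meantime
		}
		fields := strings.Fields(string(data))
		if len(fields) < 2 {
			continue
		}
		pages, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			continue
		}
		total += pages * pageSize
	}
	return total, scanner.Err()
}

func main() {
	total, err := cgroupRSS(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%d bytes\n", total)
}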

oseiberts11 commented 1 year ago

The README.md file promises: "If you've chosen to pack 400 threads and 20 processes inside the mysql.service, we will only export metrics on the service unit, not on the individual tasks." This is absolutely not true (if I were less charitable, I would call it a lie).

oseiberts11 commented 1 year ago

I created a merge request #65 to fix the README.md so other people don't rely on information that the exporter does not provide.

SuperQ commented 1 year ago

We should probably fix this collector so it doesn't work the way it does now. IMO, we should just delete it until it works the way users expect.

oseiberts11 commented 1 year ago

#67 is probably doing what's expected here.

SuperQ commented 1 year ago

I've decided that these metrics are not worth maintaining in this exporter. cgroup-based metrics can be gathered using cAdvisor.

evgeni commented 1 year ago

I've opened https://github.com/prometheus-community/systemd_exporter/pull/87, which exposes systemd's own memory metrics, which are a) accurate and b) cheap for us to obtain :)
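
For reference, systemd exposes that accounting as unit properties over D-Bus. A minimal sketch using the coreos/go-systemd v22 bindings, for illustration only and not necessarily how the PR implements it:

package main

import (
	"context"
	"fmt"
	"os"

	"github.com/coreos/go-systemd/v22/dbus"
)

func main() {
	ctx := context.Background()
	conn, err := dbus.NewWithContext(ctx)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer conn.Close()

	// MemoryCurrent is systemd's accounting for the whole unit cgroup and
	// matches the "Memory:" line in systemctl status. It reads as 2^64-1
	// when memory accounting is disabled for the unit.
	prop, err := conn.GetServicePropertyContext(ctx, "postgresql@13-main.service", "MemoryCurrent")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%s = %v bytes\n", prop.Name, prop.Value.Value())
}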