cockpit-project / cockpit

Cockpit is a web-based graphical interface for servers.
http://www.cockpit-project.org/
GNU Lesser General Public License v2.1

Metrics not loading (problem with pmlogger) #18277

Closed fredlarochelle closed 1 year ago

fredlarochelle commented 1 year ago

Explain what happens

image

As you can see, pmlogger.service is reported as not running. When I try to restart pmlogger, I get the following error:

image

Restarting the service with `sudo systemctl restart pmlogger` doesn't work either. I get: Job for pmlogger.service failed because the service did not take the steps required by its unit configuration. See "systemctl status pmlogger.service" and "journalctl -xeu pmlogger.service" for details.

Then, running `systemctl status pmlogger.service`, I get:

image

If it helps: it was working a couple of weeks ago (I can't remember exactly when), and apart from updates, only the Intel oneAPI HPC Toolkit was added to the system. That should not be the source of the problem, though.

Any idea what the problem could be? Thanks!

Version of Cockpit

283

Where is the problem in Cockpit?

Metrics

Server operating system

Fedora

Server operating system version

Fedora 37

What browsers are you using?

Chrome

System log

Here is the output of `journalctl --since -5m`:
Jan 31 17:40:07 dev1 systemd[1]: pmlogger.service: Failed with result 'protocol'.
Jan 31 17:40:07 dev1 systemd[1]: Failed to start pmlogger.service - Performance Metrics Archive Logger.
Jan 31 17:40:07 dev1 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger comm="systemd" exe="/usr/lib/systemd/system>
Jan 31 17:40:07 dev1 systemd[1]: Starting pmlogger_farm.service - pmlogger farm service...
Jan 31 17:40:07 dev1 systemd[1]: Started pmlogger_farm.service - pmlogger farm service.
Jan 31 17:40:07 dev1 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_farm comm="systemd" exe="/usr/lib/systemd/s>
Jan 31 17:40:08 dev1 systemd[1]: pmlogger.service: Scheduled restart job, restart counter is at 18.
Jan 31 17:40:08 dev1 systemd[1]: Stopping pmlogger_farm.service - pmlogger farm service...
Jan 31 17:40:08 dev1 systemd[1]: pmlogger_farm.service: Deactivated successfully.
Jan 31 17:40:08 dev1 systemd[1]: Stopped pmlogger_farm.service - pmlogger farm service.
Jan 31 17:40:08 dev1 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_farm comm="systemd" exe="/usr/lib/systemd/sy>
Jan 31 17:40:08 dev1 systemd[1]: Stopped pmlogger.service - Performance Metrics Archive Logger.
Jan 31 17:40:08 dev1 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger comm="systemd" exe="/usr/lib/systemd/system>
Jan 31 17:40:08 dev1 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger comm="systemd" exe="/usr/lib/systemd/systemd>
Jan 31 17:40:15 dev1 tailscaled[1119]: Accept: TCP{[IP_redacted] > [IP_redacted]} 89 ok out
Jan 31 17:40:25 dev1 tailscaled[1119]: Accept: TCP{[IP_redacted] > [IP_redacted]} 89 ok out
Jan 31 17:40:35 dev1 tailscaled[1119]: Accept: TCP{[IP_redacted] > [IP_redacted]} 52 tcp non-syn
Jan 31 17:40:45 dev1 tailscaled[1119]: Accept: TCP{[IP_redacted] > [IP_redacted]} 52 tcp non-syn
Jan 31 17:40:55 dev1 tailscaled[1119]: Accept: TCP{[IP_redacted] > [IP_redacted]} 52 tcp non-syn
Jan 31 17:41:02 dev1 root[15188]: pmcd_wait failed in /usr/libexec/pcp/lib/pmcd: exit status: 2
Jan 31 17:41:02 dev1 systemd[1]: pmcd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jan 31 17:41:02 dev1 systemd[1]: pmcd.service: Failed with result 'exit-code'.
Jan 31 17:41:02 dev1 systemd[1]: Failed to start pmcd.service - Performance Metrics Collector Daemon.
Jan 31 17:41:02 dev1 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmcd comm="systemd" exe="/usr/lib/systemd/systemd" h>
Jan 31 17:41:02 dev1 systemd[1]: Starting pmlogger.service - Performance Metrics Archive Logger...
Jan 31 17:41:02 dev1 systemd[1]: pmcd.service: Scheduled restart job, restart counter is at 19.
Jan 31 17:41:02 dev1 systemd[1]: Stopped pmcd.service - Performance Metrics Collector Daemon.
Jan 31 17:41:02 dev1 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmcd comm="systemd" exe="/usr/lib/systemd/systemd" h>
Jan 31 17:41:02 dev1 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmcd comm="systemd" exe="/usr/lib/systemd/systemd" ho>
Jan 31 17:41:02 dev1 systemd[1]: Starting pmcd.service - Performance Metrics Collector Daemon...
Jan 31 17:41:05 dev1 tailscaled[1119]: Accept: TCP{[IP_redacted] > [IP_redacted]} 52 tcp non-syn
Jan 31 17:41:08 dev1 systemd[1]: pmlogger.service: Failed with result 'protocol'.
Jan 31 17:41:08 dev1 systemd[1]: Failed to start pmlogger.service - Performance Metrics Archive Logger.
Jan 31 17:41:08 dev1 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger comm="systemd" exe="/usr/lib/systemd/system>
Jan 31 17:41:08 dev1 systemd[1]: Starting pmlogger_farm.service - pmlogger farm service...
Jan 31 17:41:08 dev1 systemd[1]: Started pmlogger_farm.service - pmlogger farm service.
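
To see at a glance which units are failing in a log like the one above, the `Failed with result` lines can be tallied per unit. This is only a sketch: the function name `summarize_unit_failures` is made up for illustration, and it assumes the journal line format shown above.

```shell
# Hypothetical helper: reads journalctl output on stdin and prints
# "<count> <unit> <result>" for every "Failed with result" line.
# Assumes lines look like:
#   ... systemd[1]: pmlogger.service: Failed with result 'protocol'.
summarize_unit_failures() {
    # Extract "<unit> <result>" pairs, then count identical pairs.
    sed -n "s/.*systemd\[[0-9]*]: \([^ :]*\): Failed with result '\([^']*\)'.*/\1 \2/p" |
        sort | uniq -c | awk '{ print $1, $2, $3 }'
}
```

For example: `journalctl --since -5m | summarize_unit_failures`. On the log above this would list both pmcd.service (exit-code) and pmlogger.service (protocol), which suggests looking at pmcd as well as pmlogger.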

marusak commented 1 year ago

I believe this is due to https://bugzilla.redhat.com/show_bug.cgi?id=2013937

Can you check your version of pcp and try to update it?

fredlarochelle commented 1 year ago

Running `dnf info pcp` shows that I have pcp 6.0.1, which, judging by their GitHub repo, seems to be the latest release.

image

Also, running `dnf info cockpit-pcp`, I get the following (Cockpit was also updated to version 284 since yesterday, and the problem is still present):

image

Should I file a new bug report with Red Hat, since this one dates back to October 2021 and was closed in May 2022?

marusak commented 1 year ago

> Should I file a new bug report with Red Hat, since this one dates back to October 2021 and was closed in May 2022?

Yes please. This does not seem related to Cockpit at all, since the reproducer is as simple as `sudo systemctl restart pmlogger`.

bitbull06 commented 1 year ago

Any chance this could be updated or a solution provided? Still having this issue here, on RHEL 8.7 with pcp 5.3.7 and cockpit-pcp 276 ...

fredlarochelle commented 1 year ago

I reinstalled Fedora at some point and haven't had the problem since.

bitbull06 commented 1 year ago

Yes, but reinstalling the OS entirely in an enterprise environment is not really an option (for everybody), I guess. Still hoping for a less drastic outcome.



martinpitt commented 1 year ago

@bitbull06 Well, it was fixed with a PCP update in RHEL 9 as well, almost a year ago? https://access.redhat.com/errata/RHBA-2022:2370

bitbull06 commented 1 year ago

Great for RHEL9, but still using 8(.7) here ...



martinpitt commented 1 year ago

Yeah, sorry, that must be fixed in PCP. Perhaps you could report it against RHEL 8 in Bugzilla?

UbuntuRunner commented 10 months ago

chown -R cockpit-wsinstance /var/log/pcp/pmlogger/

(So exactly the opposite way compared to the suggestion at https://forums.fedoraforum.org/showthread.php?331674-pmlogger-not-starting-after-Fedora-IoT-upgrade-to-39&p=1877886)

FeelTheLemon commented 1 week ago

Faced the same problem with PCP 6.3.1 after a system upgrade. pmlogger is running under the pcp user, and the files in /var/log/pcp/pmlogger were already owned by pcp.

I removed everything there and restarted pmlogger, and it finally started to show metrics.
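
Since the last two comments both revolve around who owns /var/log/pcp/pmlogger, it may help to list anything there that is not owned by the expected user before changing ownership or removing files. A minimal, read-only sketch: `list_foreign_owned` is a hypothetical helper name, and the expected user (pcp in the comment above) may differ on other systems.

```shell
# Hypothetical helper: print every entry under directory $1 that is NOT
# owned by user $2 (a user name or numeric UID). Nothing is modified.
list_foreign_owned() {
    find "$1" ! -user "$2" -print
}
```

For example: `sudo sh -c '. ./helpers.sh; list_foreign_owned /var/log/pcp/pmlogger pcp'`, where `helpers.sh` is an assumed file holding the function. Empty output would mean ownership already matches, as in the PCP 6.3.1 report above, so the cause is likely something else in that directory.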