linux-ras / ServiceReport

ServiceReport
GNU General Public License v2.0

Add feature to enable 'persistent' kernel logging #4

Open mwbringmann opened 4 years ago

mwbringmann commented 4 years ago

Some distros or configurations do not enable persistent kernel event logging, i.e., /var/log/messages content that includes events from prior boot cycles. If the system crashes, the messages logged before the crash are lost. It would be useful to add an option that resets the kernel configuration to enable persistent event logging, so that the maximum amount of information about crashes is retained.

sourabhjains commented 4 years ago

This check should be part of a new kernel logging plugin.

Comment 2 in #5

lucianochavez commented 4 years ago

To be more specific, we keep running into issues on SLES 15 where journal log entries persist only for the current boot, not for previous boots. This is not helpful when the user runs into a problem, reboots, and then attempts to save the log. There seem to be a couple of ways to get persistent logging. According to https://access.redhat.com/solutions/696893 for RHEL, you can run # mkdir -p /var/log/journal if Storage=auto is set in /etc/systemd/journald.conf. For SLES, the guidance is to change Storage=auto in /etc/systemd/journald.conf to Storage=persistent and then restart the journald service with systemctl restart systemd-journald.
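The SLES-style change described above could be sketched in Python roughly as follows. This is only an illustration, not ServiceReport code: the function name `set_storage_persistent` is hypothetical, and it works on the text of journald.conf rather than touching the file, so it can be tried without root.

```python
import re


def set_storage_persistent(conf_text: str) -> str:
    """Return journald.conf text with Storage=persistent in [Journal].

    Hypothetical helper for illustration; not part of ServiceReport.
    """
    out = []
    in_journal = False  # currently inside the [Journal] section?
    done = False        # has Storage=persistent been written yet?
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving [Journal] without a Storage= line: add one first.
            if in_journal and not done:
                out.append("Storage=persistent")
                done = True
            in_journal = stripped == "[Journal]"
        elif in_journal and re.match(r"#?\s*Storage\s*=", stripped):
            # Replace the first Storage= line (active or commented out)
            # and drop any later duplicates.
            if not done:
                out.append("Storage=persistent")
                done = True
            continue
        out.append(line)
    if not done:
        # No Storage= line was found; append one, adding the
        # [Journal] header if the text did not end inside that section.
        if not in_journal:
            out.append("[Journal]")
        out.append("Storage=persistent")
    return "\n".join(out) + "\n"
```

A caller would still have to write the result back to /etc/systemd/journald.conf and run systemctl restart systemd-journald for the change to take effect.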

sourabhjains commented 4 years ago

Thanks @lucianochavez for the detailed explanation. I think it's better to have a separate plugin to handle journald configuration and it should be a mandatory plugin and always run by default.

A quick note on the mandatory/optional plugins: this is an upcoming feature in ServiceReport where workload-specific plugins, for example HTX, will be marked as optional. Optional plugins will not be part of the default run and will require additional arguments to enable them.

ananthmg commented 4 years ago

https://documentation.suse.com/sles/15-SP1/html/SLES-all/cha-journalctl.html#sec-journalctl-persistent is the corresponding SLES documentation.

sourabhjains commented 4 years ago

Thanks @ananthmg for sharing the document.

I did a bit of experimenting and found that we don't need two different approaches for SLES and RHEL to enable persistent journald logs. Setting Storage=persistent and restarting the journald service is enough to enable persistent journald logs.
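For a plugin check, the decision could look something like the sketch below. It assumes the Storage= semantics as I read them in the journald.conf man page: "persistent" always writes to disk, while "auto" writes to disk only if /var/log/journal already exists, which is why the mkdir approach works on RHEL. The function name `journal_is_persistent` is hypothetical, not an existing ServiceReport API.

```python
def journal_is_persistent(storage: str, journal_dir_exists: bool) -> bool:
    """Tell whether journald would keep logs across reboots.

    storage: value of Storage= from journald.conf
    journal_dir_exists: whether /var/log/journal is present

    Hypothetical helper for illustration; not part of ServiceReport.
    """
    storage = storage.strip()
    if storage == "persistent":
        # journald creates /var/log/journal itself if needed.
        return True
    if storage == "auto":
        # Persistent only when the directory already exists.
        return journal_dir_exists
    # "volatile" and "none" never keep logs across reboots.
    return False
```

A real plugin would read the Storage= value from the configuration and pass in os.path.isdir("/var/log/journal") for the second argument.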

sourabhjains commented 4 years ago

Hello @lucianochavez @mwbringmann

Do you use the SystemMaxFileSize or SystemMaxFiles journald config options to limit journald's disk consumption?

If so, please let me know your preferred values.

mwbringmann commented 4 years ago

No, I have not used these options before. Usually, I have bypassed the configuration issue and grabbed the full log files directly.

sourabhjains commented 4 years ago

This feature was dependent on the optional plugin feature in ServiceReport.

I will upstream the journalctl plugin for persistent logging soon.

lucianochavez commented 2 years ago

Hi @sourabhjains

Any outlook on this enhancement? We continue to run across systems left at the default setting where, after a reboot (due to a crash, hang, or reboot test), the error log entries from the prior boot have been lost.