troglobit / sysklogd

BSD syslog daemon with syslog()/syslogp() API replacement for Linux, RFC3164 + RFC5424
https://troglobit.com/sysklogd.html
Other
93 stars 20 forks source link

Some logs are lost in syslog file at the early boot #62

Closed jhan2-windriver closed 1 year ago

jhan2-windriver commented 1 year ago

When I use systemd as init process and sysklogd for logging down logs.


journald.conf(/etc/systemd/journald.conf) as below: 
-----
[Journal]
Storage=volatile
ForwardToSyslog=yes
MaxLevelSyslog=debug

and syslogd.service(/lib/systemd/system/syslogd.service):
...
[Service]
EnvironmentFile=-/etc/default/syslogd
ExecStart=/usr/sbin/syslogd -F $SYSLOGD_OPTS -p /run/systemd/journal/syslog
StandardOutput=null
Restart=on-failure
...

After I reboot board, the there are some differences between "journalctl" command output and "/var/volatile/log/syslog" file; I mean there are some logs lost in syslog file, which are just after "Starting Journal Service..." key word or so.

jhan2-windriver commented 1 year ago

"-p /run/systemd/journal/syslog" means create file "/run/systemd/journal/syslog" as one UNIX domain socket;

When I compare the two logs, output from "journalctl" and /var/volatile/log/syslog, below logs are included in output from "journalctl", while they are missing in /var/volatile/log/syslog file.

Apr 20 07:13:31 $Hostname systemd[1]: modprobe@drm.service: Succeeded.
Apr 20 07:13:31 $Hostname systemd[1]: Finished Load Kernel Module drm.
Apr 20 07:13:31 $Hostname systemd[1]: modprobe@fuse.service: Succeeded.
Apr 20 07:13:31 $Hostname systemd[1]: Finished Load Kernel Module fuse.
Apr 20 07:13:31 $Hostname systemd[1]: Finished Load Kernel Modules.
Apr 20 07:13:31 $Hostname systemd[1]: Finished Remount Root and Kernel File Systems.
Apr 20 07:13:31 $Hostname systemd[1]: Mounting FUSE Control File System...
Apr 20 07:13:31 $Hostname systemd[1]: Mounting Kernel Configuration File System...
Apr 20 07:13:31 $Hostname systemd[1]: Condition check resulted in Rebuild Hardware Database being skipped.
Apr 20 07:13:31 $Hostname systemd[1]: Starting Flush Journal to Persistent Storage...
Apr 20 07:13:31 $Hostname systemd[1]: Condition check resulted in Platform Persistent Storage Archival being skipped.
Apr 20 07:13:31 $Hostname systemd[1]: Starting Apply Kernel Variables...
Apr 20 07:13:31 $Hostname systemd[1]: Condition check resulted in Create System Users being skipped.
Apr 20 07:13:31 $Hostname systemd[1]: Starting Create Static Device Nodes in /dev...
Apr 20 07:13:31 $Hostname systemd[1]: Mounted FUSE Control File System.
...
...
jhan2-windriver commented 1 year ago
Analysis:
The socket /run/systemd/journal/syslog is first created by systemd (syslog.socket unit) , 
then syslogd removes and recreates the socket when it starts, some logs are lost during
this short period between "remove and recreate".

Impact of the problem:
Logs are lost. Makes debugging problems during early bootup more difficult.
jhan2-windriver commented 1 year ago

image From the picture, left logs are from "journalctl" command, and right is from syslog. some logs(in black frame) are missing in syslog file.

jhan2-windriver commented 1 year ago

We have tried to fix it, (https://github.com/jhan2-windriver/sysklogd/commit/74b5db492690a6a2e1804a82e3a2f924a9fd714a) adding one option "-S", which make syslogd accept a syslog socket file descriptor from systemd, meanwhile adding this option is syslogd.service file.

syslogd.service: (/lib/systemd/system/syslogd.service):
...
[Service]
EnvironmentFile=-/etc/default/syslogd
**ExecStart=/usr/sbin/syslogd -F $SYSLOGD_OPTS -p /run/systemd/journal/syslog -S**
StandardOutput=null
Restart=on-failure
...

@troglobit , Could you help to check whether it works or not ?

Best Regards, jhan2

troglobit commented 1 year ago

Great work finding and reporting this! I think we can simplify the fix a bit, I'm not too fond of adding more command line switches, and sysklogd tracks FreeBSD syslogd which use the -S option already (FreeBSD man page).

I propose we instead:

  1. Drop the -p /run/systemd/journal/syslog option
  2. Detect if FD 3 is open then use your proposed code

Something like this:

-       if (!pflag)
-               addpeer(&(struct peer) {
-                               .pe_name = _PATH_LOG,
-                               .pe_mode = 0666,
-                       });
+       if (!pflag) {
+               if (fcntl(3, F_GETFD) != -1) {
+                       if (socket_register(3, NULL, unix_cb, NULL) == -1)
+                               err(1, "failed registering syslog.socket (3)");
+               } else {
+                       addpeer(&(struct peer) {
+                                       .pe_name = _PATH_LOG,
+                                       .pe_mode = 0666,
+                               });
+               }
+       }

I've tested this patch to the best of my abilities, systemd is not my primary init system, but it should be functionally equivalent to your patch.

I have a couple of other outstanding fixes to do before the next release (v2.5.0), so it would be great to get your feedback as soon as possible.

troglobit commented 1 year ago

Proposed fix pushed as 7ec64e5, which automatically closed this issue, but feel free to comment and reopen should the problem still persist in your setup.

I will wait a couple of business days (mon-fri) before starting the release process for v2.5.0.

jhan2-windriver commented 1 year ago

Hi, @troglobit I had verified the fixing https://github.com/troglobit/sysklogd/commit/7ec64e5f9c1bc284792d028647fb36ef3e64dff7, it works fine as expected. 👍

Thanks for your helping.

troglobit commented 1 year ago

@jhan2-windriver awesome, thank you! Then I'll proceed with the release process on my end.