azlux / log2ram

ramlog like for systemd (Put log into a ram folder)
MIT License
2.62k stars 194 forks source link

Log2ram timeout during boot. #139

Open wilwet opened 3 years ago

wilwet commented 3 years ago

When booting the following error appears. under Raspian uname -a Linux p1mon 5.4.72-v7+ #1356 SMP Thu Oct 22 13:56:54 BST 2020 armv7l GNU/Linux

[ 68.459184] systemd[1]: log2ram.service: Start operation timed out. Terminating. [ 68.460383] systemd[1]: log2ram.service: Main process exited, code=killed, status=15/TERM [ 68.460782] systemd[1]: log2ram.service: Failed with result 'timeout'.

df gives the following output: log2ram 102400 51004 51396 50% /var/log

after running root@p1mon:/usr/local/bin# sudo bash -x log2ram start

sent 51,922,694 bytes received 4,479 bytes 34,618,115.33 bytes/sec total size is 51,890,034 speedup is 1.00

df gives: log2ram 102400 51004 51396 50% /var/hdd.log log2ram 102400 51016 51384 50% /var/log

This seems wrong is and is possible the problem version installed with apt log2ram/stable,now 1.6.0 all [installed]

This version on an other Rpi does work log2ram/stable,now 1.5.2 all [installed] and gives only this mount point log2ram 81920 16836 65084 21% /var/log

wilwet commented 3 years ago

Update changed to version 1.5.2 and manual install and it works again.

azlux commented 3 years ago

Wait what ? The 2 changes into the code is the use of findmnt and command -v . How 1.5.2 have fix your issue ? Have you rsync installed ?

wilwet commented 3 years ago

I use rsync (rsync version 3.1.3 protocol version 31). I copied the 1.5.2 script into /usr/local/bin. Did the systemctl commands

sudo systemctl reload daemon sudo systemctl restart log2ram

This works fine. I used the standard log2ram.conf that worked and changed this to my custom version and both work with 1.5.2. version. I also use a custom unit file. But the default unit file does not work under 1.6 it does under 1.5.2.

This is custom log2ram.conf.

# Configuration file for Log2Ram (https://github.com/azlux/log2ram) under MIT license.
# This configuration file is read by the log2ram service

# Size for the ram folder, it defines the size the log folder will reserve into the RAM.
# If it's not enough, log2ram will not be able to use ram. Check you /var/log size folder.
# The default is 40M and is basically enough for a lot of applications.
# You will need to increase it if you have a server and a lot of log for example.
SIZE=100M

# If there are some errors with available RAM space, a system mail will be send
# Change it to false and you will have only a log if there is no place on RAM anymore.
MAIL=true

# Variable for folders to put in RAM. You need to specify the real folder `/path/folder` , the `/path/hdd.folder` will be automatically created. Multiple path can be separeted by `;`. Do not add the final `/` !
# example : PATH_DISK="/var/log;/home/test/FolderInRam"
PATH_DISK="/var/log"

# **************** Zram backing conf  *************************************************

# ZL2R Zram Log 2 Ram enables a zram drive when ZL2R=true ZL2R=false is mem only tmpfs
ZL2R=false
# COMP_ALG this is any compression algorithm listed in /proc/crypto
# lz4 is fastest with lightest load but deflate (zlib) and Zstandard (zstd) give far better compression ratios
# lzo is very close to lz4 and may with some binaries have better optimisation
# COMP_ALG=lz4 for speed or Zstd for compression, lzo or zlib if optimisation or availabilty is a problem
COMP_ALG=lz4
# LOG_DISK_SIZE is the uncompressed disk size. Note zram uses about 0.1% of the size of the disk when not in use
# LOG_DISK_SIZE is expected compression ratio of alg chosen multiplied by log SIZE
# lzo/lz4=2.1:1 compression ratio zlib=2.7:1 zstandard=2.9:1
# Really a guestimate of a bit bigger than compression ratio whilst minimising 0.1% mem usage of disk size
LOG_DISK_SIZE=100M

custom unit file:

[Unit]
Description=Log2Ram
DefaultDependencies=no
Before=basic.target rsyslog.service syslog-ng.service syslog.target systemd-journald.service sysinit.target shutdown.target zram-swap-conf.service apache2.service lighttpd.service nginx.service
After=local-fs.target
Conflicts=shutdown.target reboot.target halt.target
RequiresMountsFor=/var/log /var/hdd.log
IgnoreOnIsolate=yes

[Service]
Type=oneshot
ExecStartPre=-/bin/mkdir -p /var/log/nginx /var/log/samba /var/log/exim4
ExecStart= /usr/local/bin/log2ram start
ExecStop= /usr/local/bin/log2ram stop
ExecReload= /usr/local/bin/log2ram write
TimeoutStartSec=60
RemainAfterExit=yes

[Install]
WantedBy=sysinit.target
ejonesnospam commented 3 years ago

I've experienced the same issue, with a timeout at boot. But only at boot, and only if I add an additional path element to the default 'PATH_DISK' item in logram.conf, such as: PATH_DISK="/var/log;/home/bla/bla/bla" The default PATH_DISK element works fine at boot.

Confirmed size allocations are acceptable (300M alloc, 32M used). After boot, I can restart the log2ram service and multiple paths (3+) work fine.

In my investigation, I did notice a discrepancy in the apt installed log2ram version and the log2ram.conf currently published on github. This relates to a field listed as "USE_RSYNC" in log2ram.conf. This value is not actually used in the log2ram script thru apt or github, but was essential for my fix. Once I corrected the log2ram script to use this value, such as:

if [ "$USE_RSYNC" = true ]; then

which would replace: if [ -x "$(command -v rsync)" ]; then

I could then adequately force the script to use 'cp' over 'rsync', particularly at boot. This then allowed log2ram to start successfully at boot

Unfortunately, for my use case, this means I must live the cp command overhead and inefficiency, or adopt the script to only use cp at startup

Notes: -- tried adding delays at the start of log2ram with no success. -- based on adding a log file to the rsync command >> it does just stop copying and never returns -- OS: Raspbian GNU/Linux 10 (buster), rsync version 3.1.3 protocol version 31 -- also built latest rsync version, Rsync version 3.2.3, same issue

azlux commented 3 years ago

@ejonesnospam look like your issue is a different one. @wilwet have no additionnal path to the configuration file. For the $USE_RSYNC, it's indded not used anymore. What is your "fix" you're speaking about ?

@wilwet : I see /var/hdd.log is into your df command. It's should not. Can you dump the output of mount ?

wilwet commented 3 years ago

As requested, this output from the manual installation, the apt version failed.

Linux p1mon 5.4.72-v7+ #1356 SMP Thu Oct 22 13:56:54 BST 2020 armv7l GNU/Linux Description: Raspbian GNU/Linux 10 (buster) Release: 10 Codename: buster

/dev/mmcblk0p2 on / type ext4 (rw,noatime) devtmpfs on /dev type devtmpfs (rw,relatime,size=465596k,nr_inodes=116399,mode=755) sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) proc on /proc type proc (rw,relatime) securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev) devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755) tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k) tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755) cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate) cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd) none on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700) cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio) cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct) cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids) mqueue on /dev/mqueue type mqueue (rw,relatime) debugfs on /sys/kernel/debug type debugfs (rw,relatime) sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime) systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=34,pgrp=1,timeout=0,minproto=5,maxproto=5,direct) configfs on /sys/kernel/config type configfs (rw,relatime) tmpfs on /p1mon/mnt/ramdisk type tmpfs (rw,nosuid,noatime,size=102400k,mode=777) /dev/mmcblk0p1 on /boot type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro) /dev/mmcblk0p2 on /var/hdd.log type ext4 (rw,noatime) log2ram on /var/log type tmpfs (rw,nosuid,nodev,noexec,noatime,size=102400k,mode=755) tmpfs on /run/user/1001 type tmpfs (rw,nosuid,nodev,relatime,size=99772k,mode=700,uid=1001,gid=1004)

pojlFDlxCOvZ4Kg8y1l4 commented 3 years ago

The apt version also fails on boot with a timeout over here.

dmesg | grep log2ram [ 144.676953] systemd[1]: log2ram.service: Start operation timed out. Terminating. [ 144.677289] systemd[1]: log2ram.service: Main process exited, code=killed, status=15/TERM [ 145.178520] systemd[1]: log2ram.service: Failed with result 'timeout'.

A simple "systemctl start log2ram" fixes it.
A reinstall of the newest version (1.6.0) shows the same behaviour, and I only installed this a few days ago.

I'm running Proxmox, here's my system info:

Linux pve1 5.4.106-1-pve #1 SMP PVE 5.4.106-1 (Fri, 19 Mar 2021 11:08:47 +0100) x86_64 GNU/Linux
pve-manager/6.4-6/be2fa32c (running kernel: 5.4.106-1-pve)

If I would need to take a guess where it's going wrong, then I'd say it's starting too early in the boot process.
Would it matter that much if you'd just say to start it when bootup is complete, since it rsyncs the directory it's pointed at?

Additional note:

A simple systemctl restart systemd-journald did not work - it would automatically go to /run/log even though storage was set to persistent. Persistent uses /var/log/, the default option (auto) does not. After a bit of searching, the commandos mkdir -p /var/log/journal systemd-tmpfiles --create --prefix /var/log/journal systemctl restart systemd-journald
worked, systemd-journald honored the wish to log to /var/log/.
Im pretty sure worked it before, but I do not have solid proof it actually wrote into /var/log/.

Got the instructions from this page