EOA-Blockchain-Labs / ethereumonarm

Tools and scripts to build images that turn ARM devices into full Ethereum nodes
GNU General Public License v3.0

Missing journalctl logs #42

Open yahgwai opened 2 years ago

yahgwai commented 2 years ago

Sync has been running on my Raspberry Pi 4 8GB with a Samsung 2TB T7 for almost 4 days now. Geth fell over after about 6 hours of syncing; I noticed in the morning and restarted it. It has since completed the head and state download and has been in the state healing phase for about 32 hours.

The journalctl logs seem to be truncated regularly; they only hold about 9 hours of geth logs. Looking in the log.hdd directory I found more logs, but these have large gaps in them, so I was unable to find any logs from around the time geth crashed:

Aug 26 21:22:46 ethereumonarm-2c6005e5d geth[974]: INFO [08-26|19:22:46.298] Imported new block headers               count=0    elapsed=66.770ms  number=14,681,951 hash=088fd1..5830d8 age=3mo4w20h   ignored=192
Aug 26 21:22:46 ethereumonarm-2c6005e5d geth[974]: INFO [08-26|19:22:46.588] Imported new block headers               count=0    elapsed=35.808ms  number=14,682,143 hash=9cb765..80dbed age=3mo4w19h   ignored=192
Aug 26 21:22:46 ethereumonarm-2c6005e5d geth[974]: INFO [08-26|19:22:46.592] Imported new block receipts              count=24   elapsed=358.446ms number=14,678,445 hash=770680..3ea87a age=3mo4w1d    size=1.31MiB
Aug 26 21:22:47 ethereumonarm-2c6005e5d geth[974]: INFO [08-26|19:22:47.533] Imported new block receipts              count=23   elapsed=939.160ms number=14,678,468 hash=f40945..6939d6 age=3mo4w1d    size=1.21MiB
Aug 26 21:22:47 ethereumonarm-2c6005e5d geth[974]: INFO [08-26|19:22:47.656] Imported new block headers               count=0    elapsed=729.754ms number=14,682,335 hash=d3e2cf..1d1605 age=3mo4w19h   ignored=192
Aug 27 17:00:20 ethereumonarm-2c6005e5d geth[24835]: INFO [08-27|15:00:20.514] Imported new block receipts              count=19   elapsed=131.671ms   number=15,420,229 hash=bc9e30..0f194e age=7h39m52s  size=2.22MiB
Aug 27 17:00:21 ethereumonarm-2c6005e5d geth[24835]: INFO [08-27|15:00:21.468] Imported new block headers               count=192  elapsed=70.297ms    number=15,421,720 hash=e79203..d934dc age=1h55m36s
Aug 27 17:00:21 ethereumonarm-2c6005e5d geth[24835]: INFO [08-27|15:00:21.667] Imported new block receipts              count=17   elapsed=274.702ms   number=15,420,246 hash=01683f..74c5dd age=7h36m51s  size=3.30MiB
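
One way to locate gaps like the one above (between Aug 26 21:22:47 and Aug 27 17:00:20) is to scan the timestamps at the start of each line. This is only a rough sketch: the function name `find_gaps`, the one-hour threshold, and the hard-coded year are my own assumptions (syslog-style lines carry no year), and the sample lines are shortened copies of the entries above.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year=2022, threshold=timedelta(hours=1)):
    """Return (previous, current) timestamp pairs that are more than `threshold` apart."""
    gaps = []
    previous = None
    for line in lines:
        # Syslog-style prefix such as "Aug 26 21:22:46" occupies the first 15 characters.
        try:
            current = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps


# Shortened copies of two adjacent lines from the excerpt above.
sample = [
    "Aug 26 21:22:47 ethereumonarm-2c6005e5d geth[974]: INFO ...",
    "Aug 27 17:00:20 ethereumonarm-2c6005e5d geth[24835]: INFO ...",
]

for start, end in find_gaps(sample):
    print(f"gap of {end - start} between {start} and {end}")
```

In practice the lines would come from the files in log.hdd rather than from a hard-coded list.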

Other logs suggest that the log filesystem sometimes runs out of space. Any idea why that might be? Could a space issue have caused geth to crash and fail to complete syncing?

Aug 27 02:00:04 ethereumonarm-2c6005e5d systemd[1]: Starting Rotate log files...
Aug 27 02:00:04 ethereumonarm-2c6005e5d systemd[1]: Starting Daily man-db regeneration...
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20874]: Sat Aug 27 00:00:04 UTC 2022: Syncing logs to storage
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: sending incremental file list
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: ./
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: alternatives.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: armbian-hardware-monitor.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: auth.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: boot.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: btmp
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: cloud-init-output.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: cloud-init.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: dpkg.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: kern.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: lastlog
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: syslog
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: ubuntu-advantage-timer.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: wtmp
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: apt/
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: apt/eipp.log.xz
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: apt/history.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: apt/term.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: grafana/
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: grafana/grafana.log
Aug 27 02:00:04 ethereumonarm-2c6005e5d armbian-ramlog[20878]: grafana/grafana.log.2022-08-27.001
Aug 27 02:00:07 ethereumonarm-2c6005e5d armbian-ramlog[20916]: cat: write error: No space left on device
Aug 27 02:00:07 ethereumonarm-2c6005e5d armbian-ramlog[20917]: cat: write error: No space left on device
Aug 27 02:00:07 ethereumonarm-2c6005e5d armbian-ramlog[20922]: cat: write error: No space left on device
Aug 27 02:00:07 ethereumonarm-2c6005e5d armbian-ramlog[20923]: cat: write error: No space left on device
Aug 27 02:00:07 ethereumonarm-2c6005e5d armbian-ramlog[20924]: cat: write error: No space left on device
Aug 27 02:00:07 ethereumonarm-2c6005e5d armbian-ramlog[20925]: cat: write error: No space left on device
Aug 27 02:00:07 ethereumonarm-2c6005e5d armbian-ramlog[20927]: cat: write error: No space left on device
Aug 27 02:00:07 ethereumonarm-2c6005e5d armbian-ramlog[20928]: cat: write error: No space left on device
Aug 27 02:00:07 ethereumonarm-2c6005e5d armbian-ramlog[20936]: cat: write error: No space left on device
Aug 27 02:00:07 ethereumonarm-2c6005e5d rsyslogd[900]:  message repeated 3 times: [[origin software="rsyslogd" swVersion="8.2001.0" x-pid="900" x-info="https://www.rsyslog.com"] rsyslogd was HUPed]
Aug 27 02:00:07 ethereumonarm-2c6005e5d rsyslogd[900]: file '/var/log/syslog'[7] write error - see https://www.rsyslog.com/solving-rsyslog-write-errors/ for help OS error: No space left on device [v8.2001.0 try>
Aug 27 02:00:07 ethereumonarm-2c6005e5d rsyslogd[900]: action 'action-3-builtin:omfile' (module 'builtin:omfile') message lost, could not be processed. Check for additional error messages before this one. [v8.2>
Aug 27 02:00:07 ethereumonarm-2c6005e5d rsyslogd[900]: file '/var/log/syslog'[7] write error - see https://www.rsyslog.com/solving-rsyslog-write-errors/ for help OS error: No space left on device [v8.2001.0 try>

Here's the output from df:

Filesystem      Size  Used Avail Use% Mounted on
udev            3.8G     0  3.8G   0% /dev
tmpfs           782M  3.1M  779M   1% /run
/dev/mmcblk0p2   29G  6.7G   21G  25% /
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/loop2       61M   61M     0 100% /snap/lxd/21843
/dev/sda1       1.8T  582G  1.2T  34% /home
/dev/loop1       58M   58M     0 100% /snap/core20/1614
/dev/loop0       58M   58M     0 100% /snap/core20/1332
/dev/loop5       41M   41M     0 100% /snap/snapd/16299
/dev/loop3       38M   38M     0 100% /snap/snapd/14982
/dev/loop4       62M   62M     0 100% /snap/lxd/22761
/dev/mmcblk0p1  253M   61M  192M  24% /boot/firmware
/dev/zram2       73M  104K   68M   1% /tmp
/dev/zram1       49M   33M   12M  74% /var/log
tmpfs           782M     0  782M   0% /run/user/1001
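
To pick out which writable mount is closest to full, the `df` output can be parsed and filtered. This is a minimal sketch, not a monitoring tool: the `parse_df` helper, the 70% threshold, and the choice to skip `/snap/` mounts (read-only images that always show 100%) are my own assumptions, and the sample is a subset of the rows above.

```python
def parse_df(text):
    """Parse `df -h` style output into a list of dicts, one per filesystem."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        filesystem, size, used, avail, use_pct, mount = line.split(None, 5)
        rows.append({
            "filesystem": filesystem,
            "size": size,
            "used": used,
            "avail": avail,
            "use_pct": int(use_pct.rstrip("%")),
            "mount": mount,
        })
    return rows


# Subset of the df output above.
df_output = """
Filesystem      Size  Used Avail Use% Mounted on
/dev/mmcblk0p2   29G  6.7G   21G  25% /
/dev/loop2       61M   61M     0 100% /snap/lxd/21843
/dev/sda1       1.8T  582G  1.2T  34% /home
/dev/zram1       49M   33M   12M  74% /var/log
"""

# Flag non-snap mounts above an (assumed) 70% usage threshold.
nearly_full = [
    row for row in parse_df(df_output)
    if row["use_pct"] >= 70 and not row["mount"].startswith("/snap/")
]
for row in nearly_full:
    print(f'{row["mount"]}: {row["use_pct"]}% of {row["size"]} used on {row["filesystem"]}')
```

On the rows above this flags only the 49M `/dev/zram1` mounted on `/var/log`.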

Separately, I also noticed quite a few grafana warnings:

Aug 25 21:31:00 ethereumonarm-2c6005e5d grafana-server[975]: logger=ngalert t=2022-08-25T19:31:00.012582311Z level=warn msg="rule declares one or many reserved labels. Those rules labels will be ignored" labels="alertname=VALIDATOR: Hourly earning <= 0"
Aug 25 21:31:01 ethereumonarm-2c6005e5d grafana-server[975]: logger=ngalert t=2022-08-25T19:31:01.45294029Z level=warn msg="rule declares one or many reserved labels. Those rules labels will be ignored" labels="alertname=VALIDATOR: Validator has been slashed"
Aug 25 21:31:02 ethereumonarm-2c6005e5d grafana-server[975]: logger=ngalert t=2022-08-25T19:31:02.890755983Z level=warn msg="rule declares one or many reserved labels. Those rules labels will be ignored" labels="alertname=WARN NODE/VALIDATOR: The process just restarted"
Aug 25 21:31:04 ethereumonarm-2c6005e5d grafana-server[975]: logger=ngalert t=2022-08-25T19:31:04.31123422Z level=warn msg="rule declares one or many reserved labels. Those rules labels will be ignored" labels="alertname=NETWORK: Participation rate below 66%"
Aug 25 21:31:05 ethereumonarm-2c6005e5d grafana-server[975]: logger=ngalert t=2022-08-25T19:31:05.739122193Z level=warn msg="rule declares one or many reserved labels. Those rules labels will be ignored" labels="alertname=NODE/VALIDATOR: Process down"