raspiblitz / raspiblitz

Get your own Bitcoin & Lightning Node running - on a RaspberryPi with a nice LCD
MIT License

Fatpack image hangs on first setup reboot #4469

Closed rootzoll closed 7 months ago

rootzoll commented 7 months ago

Got the rc4 fatpack from 11-March-24 that I created manually on a real Raspberry Pi ... and after creating an img from it (disk writer, pishrink, gz), when I make an SD card from it, it hangs on the first setup reboot with this screen:

[Screenshot: signal-2024-03-12-163518]

After several hours it continues, but that's a showstopper. It was also reproducible when I repeated the build process.

This is what ChatGPT says about the screen:

The image shows a Raspberry Pi's screen with a system log that appears to be from a shutdown or reboot process. The system has stopped various services and removed targets as part of the shutdown sequence, which is normal. However, the last message states "Job redis-server.service/save the DB, can't exit."

This message suggests that the Redis server, which is an in-memory data structure store used as a database, cache, and message broker, is having trouble saving the database to disk. When Redis can't exit properly, it typically means that it's unable to write its in-memory data to the disk, possibly due to permission issues, disk space issues, or problems with the Redis configuration.

Here are some potential causes and corresponding solutions:

  1. Disk Space: Ensure there is enough disk space on the device for Redis to write its persistence files.

  2. Permissions: Check that the Redis process has the necessary permissions to write to the directory where it stores its persistence files.

  3. Redis Configuration: Verify that the Redis configuration file (redis.conf) is properly set up, especially the dir directive, which specifies the working directory for the database dump files.

  4. Faulty Shutdown Script: If there is a custom shutdown script for Redis, ensure it is functioning correctly and giving Redis enough time to save the database before shutting down.

  5. File System Errors: There could be file system errors preventing Redis from saving its database. Check the system logs for any file system related errors.

  6. External Storage Issues: If Redis is configured to save its data to an external storage device, make sure the device is connected and functioning properly.

  7. Corruption: In rare cases, the Redis database files might be corrupt. Look at Redis logs for any messages about corruption.
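Points 1 and 2 above can be checked from a shell. The following is a minimal sketch, assuming a POSIX shell; `check_redis_dir` is a hypothetical helper name, and the directory to pass in would be the one reported by `redis-cli config get dir`. The `-w` test applies to the calling user, so it only says something about Redis when run as the user the Redis service runs under.

```shell
# Hypothetical helper: verify that a directory exists, is writable by the
# current user, and has at least min_kb kilobytes free.
check_redis_dir() {
  dir=$1
  min_kb=${2:-10240}   # default threshold of 10 MB is an arbitrary assumption

  [ -d "$dir" ] || { echo "missing: $dir"; return 1; }
  [ -w "$dir" ] || { echo "not writable: $dir"; return 1; }

  # df -P guarantees one line per filesystem; column 4 is the available space in KB
  free_kb=$(df -Pk "$dir" | awk 'NR==2 { print $4 }')
  [ "$free_kb" -ge "$min_kb" ] || { echo "low space: ${free_kb} KB"; return 1; }

  echo "ok"
}
```

It could be called as `check_redis_dir /var/lib/redis`, where `/var/lib/redis` is only an assumed example path.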

To prevent this from happening in the future, you can:

  • Set up proper monitoring for disk space and system health.
  • Use Redis's built-in BGSAVE command periodically to ensure that the database is saved in the background without waiting for a shutdown or reboot.
  • Regularly check file system integrity and permissions.
  • Review and test the Redis configuration and any associated scripts for proper operation.

If you have access to the command line while the system is running, you can also try to manually save the Redis database with the command redis-cli bgsave to see if there are any errors during the operation.
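To see whether the background save actually succeeded, the `rdb_last_bgsave_status` field in the output of `redis-cli info persistence` can be read afterwards. A small sketch, assuming a POSIX shell with `awk`; `bgsave_status` is a hypothetical helper name that parses the INFO text from stdin.

```shell
# Hypothetical helper: print the value of rdb_last_bgsave_status
# from `redis-cli info persistence` output supplied on stdin.
bgsave_status() {
  # INFO lines are terminated with CRLF, so strip carriage returns first
  tr -d '\r' | awk -F: '$1 == "rdb_last_bgsave_status" { print $2 }'
}
```

Typical use would be `redis-cli bgsave`, a short wait, then `redis-cli info persistence | bgsave_status`; a value other than `ok` would point at one of the causes listed above.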

openoms commented 7 months ago

Poking in the dark, but can you try using another SDcard?

rootzoll commented 7 months ago

I tested with another SD card - that's not it. I also tried explicitly setting "stop-writes-on-bgsave-error no" in the redis config - that does not change it.
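How the config change was made is not shown in this thread. One way to set such a directive idempotently is sketched below, assuming a POSIX shell; `set_redis_directive` is a hypothetical helper, and the config path in the usage note is the Debian package default, which is an assumption here.

```shell
# Hypothetical helper: set "key value" in a redis.conf-style file,
# replacing an existing line for that key or appending a new one.
set_redis_directive() {
  conf=$1
  key=$2
  value=$3

  if grep -q "^${key}[[:space:]]" "$conf"; then
    # Rewrite via a temp file, then copy back so ownership and mode are kept
    sed "s|^${key}[[:space:]].*|${key} ${value}|" "$conf" > "${conf}.tmp" &&
      cat "${conf}.tmp" > "$conf" &&
      rm -f "${conf}.tmp"
  else
    printf '%s %s\n' "$key" "$value" >> "$conf"
  fi
}
```

For example: `set_redis_directive /etc/redis/redis.conf stop-writes-on-bgsave-error no`, followed by a restart of the redis service. On a running instance the same setting can be changed with `redis-cli config set stop-writes-on-bgsave-error no`.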

Next ideas:

rootzoll commented 7 months ago

Interesting: the reboot that hangs is not the normal bootstrap setup reboot ... it gets triggered unplanned during the repair of the SSH certs at the beginning of the bootstrap. Investigating that deeper.

rootzoll commented 7 months ago

Working on branch 4469-hang ... the first reboot happened on the restart of the sshd service. Now it happens on dpkg-reconfigure openssh-server, see https://github.com/raspiblitz/raspiblitz/blob/3d652d838ae84bf02ca28825ffcfb557ac3b281c/home.admin/config.scripts/blitz.ssh.sh#L45

***********************************************
Running RaspiBlitz Bootstrap 1.11.0rc4
Wed 13 Mar 22:04:39 GMT 2024
***********************************************
# init SSH KEYS fresh for new user
# *** /home/admin/config.scripts/blitz.ssh.sh init
# generate new keys
ssh-keygen: generating new host keys: RSA ECDSA ED25519
# reconfigure

rootzoll commented 7 months ago

This is getting silly. I now disable redis in the release command and reactivate it after the SSH certs are initialized and reloaded on first boot. The first boot now gets through the SSH stuff, but it reboots later, after redis is activated again, and still hangs on redis shutting down with the initial error above.

***********************************************
Running RaspiBlitz Bootstrap 1.11.0rc4
Thu 14 Mar 09:49:53 GMT 2024
***********************************************
# init SSH KEYS fresh for new user
# *** /home/admin/config.scripts/blitz.ssh.sh init
# generate new keys
ssh-keygen: generating new host keys: RSA ECDSA ED25519
# reconfigure
# remove flag
# restart sshd
# make sure redis is running
## prepare raspiblitz temp
## INIT raspiblitz.info
baseimage=raspios_arm64
cpu=aarch64
blitzapi=on
displayClass=lcd
displayType=
setupPhase=boot
setupStep=0
fsexpanded=0
state=starting
btc_mainnet_sync_initial_done=0
btc_testnet_sync_initial_done=0
btc_signet_sync_initial_done=0
ln_lnd_mainnet_sync_initial_done=0
ln_lnd_testnet_sync_initial_done=0
ln_lnd_signet_sync_initial_done=0
ln_cl_mainnet_sync_initial_done=0
ln_cl_testnet_sync_initial_done=0
ln_cl_signet_sync_initial_done=0
## INIT RaspiBlitz Cache ... wait background.scan.service to finish first scan loop
- waiting for background.scan.service --> systemscan_runtime(1)
No Wifi config by file on sd card.

So either it's redis itself triggering a reboot during its startup, or it's an outside process (maybe background_scan, etc). This is a time burner to debug, but I'll keep digging.

rootzoll commented 7 months ago

OK, looks like I got it - hope so. Will do a clean test tomorrow.

rootzoll commented 7 months ago

Runs through now. Closing issue.