framps / raspiBackup

Create and keep multiple backup versions of your running Raspberries
https://raspibackup.linux-tips-and-tricks.de
GNU General Public License v3.0
794 stars 75 forks source link

Restored image doesn't start #637

Closed FJustinian closed 1 year ago

FJustinian commented 1 year ago

Hi,

since update from version 0.67 to 0.68 I receive a second mail from Cron Demon with following content:


256+0 Datensätze ein 256+0 Datensätze aus 1+0 Datensätze ein 1+0 Datensätze aus tar: Entferne führende „/“ von Elementnamen tar: /var/lib/samba/private/msg.sock/17860: Socket ignoriert tar: /var/lib/samba/private/msg.sock/596: Socket ignoriert tar: /var/lib/samba/private/msg.sock/17861: Socket ignoriert tar: Entferne führende „/“ von Zielen harter Verknüpfungen


The same content appears in the regular backupmail btw. A restore of the backup did not start and since I am using the raspi headless I am not able to see the starting screen right now.

If you need more information just tell me :-).

Kind regards

framps commented 1 year ago

I'm sorry - I just deleted by error your log files :cry: Would you please add them again?

FJustinian commented 1 year ago

I'm sorry - I just deleted by error your log files 😢 Would you please add them again?

Of course! raspiBackup_restore.log raspiBackup.log

framps commented 1 year ago

I just detected I can view the edit history and I was able to recover the logs :wink:

framps commented 1 year ago

I detected the root cause for the additional mail (See new issue #638 I created). A workaround is to configure your Raspberry to run in English instead of German.

But this shouldn't be the root cause for the boot failure of the restored system.

I checked both logs and there I don't see anything special there.

You told me you detected you can restore and boot an older backup created with 0.6.7 . Would you please open the debug log of this backup and let me know the exact code level. I need following line right at the top of the debuglog:

20230319-050002 DBG 2079: --- raspiBackup.sh 0.6.8, 2023-02-09/14:05:24 - 4b9c22a

FJustinian commented 1 year ago

Eh voila!

20221204-050002 DBG 2004: --- raspiBackup.sh 0.6.7, 2022-05-26/19:34:12 - 9edf641

Am 19.03.2023 um 13:41 schrieb framp @.***>:

I detected the root cause for the additional mail (See new issue #638 https://github.com/framps/raspiBackup/issues/638 I created). A workaround is to configure your Raspberry to run in English instead of German.

But this shouldn't be the root cause for the boot failure of the restored system.

I checked both logs and there I don't see nothing special there.

You told me you detected you can restore and boot an older backup created with 0.6.7 . Would you please open the debug log of this backup and let me know the exact code level. I need following line right at the top of the debuglog:

20230319-050002 DBG 2079: --- raspiBackup.sh 0.6.8, 2023-02-09/14:05:24 - 4b9c22a

— Reply to this email directly, view it on GitHub https://github.com/framps/raspiBackup/issues/637#issuecomment-1475242332, or unsubscribe https://github.com/notifications/unsubscribe-auth/AWDFSNBDBHE72MQ5M6BJFTDW435H3ANCNFSM6AAAAAAWAAWHTY. You are receiving this because you authored the thread.

FJustinian commented 1 year ago

Btw. to rule out a sd-card error I used a new one but without an improvement.

Right now I changed my current system back to 0.67 and restored a backup made two weeks ago also with 0.67. The restored backup started normal.

framps commented 1 year ago

Right now I changed my current system back to 0.67 and restored a backup made two weeks ago also with 0.67. The restored backup started normal.

That's why raspiBackup has a restore feature :wink:

I checked the logs for some common issues which cause the system to fail during startup. No findings :disappointed: . The only way to locate the root cause is to get the boot messages.

Whenever I have an issue with my headless Raspberries I plug in the SD card on another Raspberry I have and connect a screen.

Could you check on the SD card of the restored system /var/log/kern.log? As far as I understand you may be able to get the bootlog there (according this thread) You have to use a Linux system in order to access the root partition. Windows will not be able to access the partition. Just use a Raspberry :wink:

If this doesn't work I see no other way to locate the root cause with a screen connected to your Raspberry :cry:

I will setup now a buster in German and create a tar backup and restore this backup to double check this issue is special for you.

FJustinian commented 1 year ago

Could you check on the SD card of the restored system /var/log/kern.log? As far as I understand you may be > able to get the bootlog there (according this thread) You have to use a Linux system in order to access > the root partition. Windows will not be able to access the partition. Just use a Raspberry 😉

Did you mean from the current working system or from the issued system? I attach you from the current working system two files: kern.log and kern.log.1. kern.log.1.log kern.log

I assume the logs from the issued system won't do the job since you needed them after a boot.

If this doesn't work I see no other way to locate the root cause with a screen connected to your Raspberry 😢

Yes I thought about that too, but right now I haven't a micro HDMI cable, but I will order one.

I will setup now a buster in German and create a tar backup and restore this backup to double check this issue is special for you.

Thanks a lot. If there is anything I can do tell me. I thought about to change the current system to english -although this seems not the problem- and update to 0.6.8 and start a backup/restore.

Best

framps commented 1 year ago

I tried to recreate your issue on a German buster with tar and option --createUUIDs and the restored system boots successfully. I don't know why you use option --createUUIDs to force a modification of the existing UUIDs in the backup. It's only required in some special situations when you want to mount the original system together with the restored system because then you will have the identical UUIDs used twice which causes boot issues.

I assume the logs from the issued system won't do the job since you needed them after a boot.

I just forced a boot error and yes - there is nothing in there :-(

If there is anything I can do tell me.

Please buy a HDMI cable and check the boot messages so we find the root cause of your issue. I don't sleep very well if I know there may be a serious issue in raspiBackup :wink:

FJustinian commented 1 year ago

I am using the --updateUUIDs argument since you recommend me this after an issue 😉. I am not sure anymore what the issue was. Maybe I tried to establish an usb emergency stick in case the pi won't boot from the sd card?

Yes I already ordered a cable and will check the boot messages. Thank you very much so far!!!

framps commented 1 year ago

Maybe I tried to establish an usb emergency stick in case the pi won't boot from the sd card?

Ah ... I remember there was an issue because the duplicate UUIDs somewhere in the past. You still can use this option. But in a strict sense your restored system is no longer identical to the backup because different UUIDs are used.

Yes I already ordered a cable and will check the boot messages.

:+1:

FJustinian commented 1 year ago

Yes I already ordered a cable and will check the boot messages.

Here are two pictures with the results. Unfortunately I don't have a keyboard to press enter.

I forgot to test it but I assume at this stage I can't login via my computer but I hope you'll get some hints regarding the issue.

IMG_0449 IMG_0450

framps commented 1 year ago

Thank you very much for the screenshots. I now have an idea where I have to check the restore debug log more carefully :wink:

framps commented 1 year ago

This error usually happens if there is something wrong with /etc/fstab. I checked the logs carefully but as far as I can see everything was updated correctly. Actually there were also no changes in the UUID update code between 0.6.7 and 0.6.8.

Would you please plug in the restored flash drive into a Raspberry and execute following steps?

1) Show me the output of blkid 2) Mount the boot partition of the flash drive and show me the contents of cmdline.txt 3) Mount the root partition of the flash drive and show me the contents of /etc/fstab

FJustinian commented 1 year ago

Good evening framp!

#637 outputs.txt

framps commented 1 year ago

Thank you very much for the file. Unfortunately this doesn't match to what I found in the restore debug log:

20230319-103958 DBG 7916:                          *** blkid
20230319-103958 DBG 7916:                              /dev/sdb1: LABEL_FATBOOT="boot" LABEL="boot" UUID="81D2-2FB9" TYPE="vfat" PARTUUID="8fd58299-01"
20230319-103958 DBG 7916:                              /dev/sdb2: UUID="576e8d76-9bbc-44dd-8fd8-07eec11510d1" TYPE="ext4" PARTUUID="8fd58299-02"
20230319-103958 DBG 7916:                              /dev/sda1: LABEL_FATBOOT="EFI" LABEL="EFI" UUID="67E3-17ED" TYPE="vfat" PARTLABEL="EFI System Partition" PARTUUID="229c0f47-9f10-490f-a6d2-b9c18b2a4954"
20230319-103958 DBG 7916:                              /dev/sda2: LABEL="PlayDaMowie" UUID="5FE1-CA87" TYPE="exfat" PARTLABEL="Elements" PARTUUID="485a7e03-1dd7-4220-a07b-6e845d4938f7"
20230319-103958 DBG 7916:                              /dev/sdc1: LABEL_FATBOOT="boot" LABEL="boot" UUID="C0D8-EE6C" TYPE="vfat" PARTUUID="92de8469-01"
20230319-103958 DBG 7916:                              /dev/sdc2: UUID="1ae90d38-7dbc-478c-a0e5-794f698ca431" TYPE="ext4" PARTUUID="92de8469-02"

According the restore debug log during restore the UUIDs 92de8469-02 and 92de8469-01 were generated. But I don't see this UUID in your blkid output :thinking: .

In addition you listed /etc/fstab and /boot/cmdline from your running system :wink: . I need them from the restored system. You have to insert the restored SD card/flash drive in your Raspberry and execute

blkid; sudo mount /dev/sdx1 /mnt; cat /mnt/cmdline.txt; sudo umount /mnt; sudo mount /dev/sdx2 /mnt; cat /mnt/etc/fstab; sudo umount /mnt

where x is the drive the restored system was discovered on your raspberry.

FJustinian commented 1 year ago

My first thought was that you meant the broken system, but I didn't ask 😅 .

#637 outputs.txt

framps commented 1 year ago

Thank you very much. These outputs proof the debug statements I checked are correct. I frankly have no idea any more where your issue comes from :cry:

The only way I see right now to make progress is to connect a keyboard and check the journal for error messages.

FJustinian commented 1 year ago

I`ll borrow a USB keyboard and get back to you in a few days.

Thanks for looking into the issue!!

framps commented 1 year ago
UUID=5FE1-CA87 /media/usbdrive exfat defaults,noauto,x-systemd.automount,umask=000,users,rw 0 0

Did you plug in this USB disk when you started your restored system?

FJustinian commented 1 year ago

Did you plug in this USB disk when you started your restored system?

Yes, this is my external media and backup drive. The drive is always attached btw. Has it to do with the power consumption? I thought about that too, but since booting with the drive attached worked flawlessly I didn't give it much attention.

Best!!

framps commented 1 year ago

Has it to do with the power consumption?

I don't think so. I'm still thinking why you have this trouble. Let's wait for the keyboard and check the boot log.

FJustinian commented 1 year ago

Hi framp,

finally I've got a keyboard! Not easy these days :-).

So after pressing enter the boot continued till the prompt, but I saw a failed message. I logged in and rebootet the system but now the raspi booted correctly.

I restored the problematic backup again and was able to make a picture as attached.
There is a problem with the attached USB Drive and it seems you are on the right track. But again after hitting enter the raspi continued till the prompt and the reboot was without a problem.

Hope this is helpfull!!

IMG_0500

framps commented 1 year ago

finally I've got a keyboard!

:+1:

I logged in and rebootet the system but now the raspi booted correctly.

Can you now start the system (power on) from the restored image or do you have to press enter every time you start the system (power on).

Looks like it's related to following line in your /etc/fstab where automount is used.

#usbplatte 2TB
UUID=5FE1-CA87 /media/usbdrive exfat defaults,noauto,x-systemd.automount,umask=000,users,rw 0 0

Did you execute systemctl status media-usbdrive.automount ? What's the output?

I think the boot failure will be gone if you add nofail to the /etc/fstab definition above.

FJustinian commented 1 year ago

No, it's just a 'one time press the enter button' during the first start. I guess the OS corrected the problem and started normal. I added the nofail argument and checked the minidlna.conf but here's nothing unusual.

I'll update raspiBackup to 0.6.8 run a test and let you know how it went.

Here is the output from systemctl status media-usbdrive.automount:

IMG_0501

framps commented 1 year ago

Thank you very much for the screenshot. Sounds to me there is some issue with systemd automount.

It's also interesting the system boots successfully when you pressed enter when the system boots the first time and then there is no issue any more :thinking:

Do you mind if I create a thread in the German Raspberry forum and refer to this issue? I'm not familiar with systemd and automount. Maybe some of the very experienced members in the forum can help :wink:

FJustinian commented 1 year ago

Good Morning.

as planed I updated raspiBackup from 0.6.7 to 0.6.8 with the nofail argument in fstab. I backuped the updated version and restored it again. The restored raspi booted without problems.

I'm running the restored version a few days, carry out a second backup/restore test and get back to you. I also have now two snapshots 0.6.7 and 0.6.8. If you'll get a hint from the raspberry forum we can run a further test.

Thanks so far!!!

FJustinian commented 1 year ago

Hi framp,

the restored version was backuped regulary with just one cron message afterwards: tar: /var/lib/samba/private/msg.sock/610: socket ignored

I restored the newest backup again ant the system booted without any problems.

I now reordered the stopping and starting services and put cron and samba to the end but you already mentioned that just changing the software language to English would solve the cron message mail. Just in case ;-).

A bit unsatisfying right now I know. If I notice something else I will contact you.

Kind regards!

framps commented 1 year ago

Did you upgrade to the latest code level with sudo raspiBackup -U -S ? I already fixed the language issue. You now should be able to use German again.

FJustinian commented 1 year ago

Did you upgrade to the latest code level with sudo raspiBackup -U -S ?

I upgraded to 0.6.8 last friday just with the -U argument but will do a new upgrade!

Edit I'm getting an error message: ??? RBK0239E: https://www.linux-tips-and-tricks.de/raspiBackup/raspiBackup0613.properties kann nicht aus dem Netz geladen werden. HTTP code: 000. RC: 28

framps commented 1 year ago

Don't forget to use option -S

FJustinian commented 1 year ago

Don't forget to use option -S

I did it with the -S

pi@himberrypi:~ $ sudo raspiBackup -U -S --- RBK0031I: Prüfe ob eine neue Version von raspiBackup.sh verfügbar ist. ??? RBK0239E: https://www.linux-tips-and-tricks.de/raspiBackup/raspiBackup0613.properties kann nicht aus dem Netz geladen werden. HTTP code: 000. RC: 28 --- RBK0026I: Debug Logdatei wurde in /home/pi/raspiBackup.log gesichert.

framps commented 1 year ago

man curl

28     Operation timeout. The specified time-out period was reached according to the conditions.

Please check your internet connection. Everything works fine for me.

FJustinian commented 1 year ago

man curl

28     Operation timeout. The specified time-out period was reached according to the conditions.

Please check your internet connection. Everything works fine for me.

Hmmh! Internet is running. I already pinged your homepage, got a response. I have pihole running but this shouldn't be the problem.

Edit Tested it with curl: pi@himberrypi:~ $ curl https://www.linux-tips-and-tricks.de > test % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- 0:00:05 --:--:-- 0 Strange!

edit edit:

pi@himberrypi:~ $ curl https://www.linux-tips-and-tricks.de/raspiBackup/raspiBackup0613.properties > test.properties % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 1536 100 1536 0 0 298 0 0:00:05 0:00:05 --:--:-- 327

Ok I'm getting a response.

framps commented 1 year ago

Weird.

Please try

curl https://www.linux-tips-and-tricks.de/raspiBackup/raspiBackup0613.properties
FJustinian commented 1 year ago

curl https://www.linux-tips-and-tricks.de/raspiBackup/raspiBackup0613.properties

Ok got the property file, but still no response from the update.

framps commented 1 year ago

Hm ... maybe you have a slow internetconnection.

Please change DOWNLOAD_TIMEOUT=5 # seconds

in raspiBackup into

DOWNLOAD_TIMEOUT=60 # seconds

FJustinian commented 1 year ago

Hm ... maybe you have a slow internetconnection.

Please change DOWNLOAD_TIMEOUT=5 # seconds

in raspiBackup into

DOWNLOAD_TIMEOUT=60 # seconds

No my Internet speed is fine. I was just streaming a movie. I have to ask.

DOWNLOAD_TIMEOUT=60 # seconds

is this to edit in the raspibackup.conf ?

framps commented 1 year ago

No. That's a line of code in /usr/local/bin/raspiBackup.sh

FJustinian commented 1 year ago

Ok. That did it!! Speedtest was good though.

Thanks!

framps commented 1 year ago

I increased the timeout again to 60s. Was no good idea to reduce it to 5s :-(

github-actions[bot] commented 1 year ago

This issue is considered stale now and will be closed in 1 week if there is no activity any more

FJustinian commented 1 year ago

Hi framp,

after updating to the latest version (sudo raspiBackup -U -S) the regular backup worked fine, but with a lot of

tar: /var/lib/samba/private/msg.sock/706: socket ignored

messaages within the mail and also in a additional cron mail. Restore worked fine though.

I changed the language to English and today I got the same messages. Just for you to know.

Thank you very much!

Best

github-actions[bot] commented 1 year ago

This issue is considered stale now and will be closed in 1 week if there is no activity any more