MichaIng / DietPi

Lightweight justice for your single-board computer!
https://dietpi.com/
GNU General Public License v2.0
4.86k stars 496 forks

After 5 days of work Pi locked up and couldn't read/write from boot stick #5587

Closed AAS-Crypt closed 2 years ago

AAS-Crypt commented 2 years ago

Creating a bug report/issue

Required Information

Steps to reproduce

  1. I can't name specific steps. Today I reduced the workload to a single task that runs every 15 minutes, which is not heavy compared to my last report.
  2. Leave the system running for a long uptime.

Expected behaviour

The system keeps working normally.

Actual behaviour

The Raspberry Pi locked up and I could no longer log in. When I try to log in to the Pi locally, it prints the lines below. Interestingly, it began behaving like this after 5 days of continuous operation.

Extra details

These are the log lines printed on screen, in a ratio of 1:4 (1 proftpd line to 4 zerotier lines):

[495401.244418] EXT4-fs error (device sda2): ext4_find_entry:1614: inode #3161: comm proftpd: reading directory iblock 0
[495401.056099] EXT4-fs error (device sda2): ext4_find_entry:1614: inode #16: comm zerotier-one: reading directory iblock 0

After rebooting the Pi, I get this on my screen:

WhatsApp Image 2022-07-02 at 16 20 42
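The per-process ratio of the EXT4-fs errors quoted in this comment can be tallied from the kernel log rather than estimated by eye. The following is a minimal sketch, assuming the messages keep the `comm <name>:` layout shown in the report; `count_ext4_errors` is a hypothetical helper name, not part of DietPi.

```shell
#!/bin/sh
# Count EXT4-fs error lines per reporting process (the "comm" field).
# Reads kernel log text on stdin and prints "<count> <process>" lines,
# most frequent first. Hypothetical helper, for illustration only.
count_ext4_errors() {
    awk '
        /EXT4-fs error/ && match($0, /comm [^:]+:/) {
            # Drop the leading "comm " (5 chars) and the trailing ":".
            name = substr($0, RSTART + 5, RLENGTH - 6)
            count[name]++
        }
        END { for (name in count) print count[name], name }
    ' | sort -k1,1nr -k2
}

# Example (on the affected system, as root):
#   dmesg | count_ext4_errors
```

Seeing several unrelated processes in the tally suggests the problem lies with the filesystem or the drive rather than with any single service.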

AAS-Crypt commented 2 years ago

I recorded a video of booting the Pi without the USB stick plugged in, then connecting the boot stick after the boot image is shown.

It appears to me that some software wiped the volumes on my boot drive. https://user-images.githubusercontent.com/77435348/176996836-aac6790d-7fd4-474c-aaba-9fd6e9b1f3e0.mp4

Now I will inspect my USB stick.

AAS-Crypt commented 2 years ago

After inspecting it, I have come to the conclusion that the USB stick is read-only. I would appreciate any help on this matter.

MichaIng commented 2 years ago

You have filesystem errors. Please do the following to try fixing them:

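# remount the root filesystem read-write so the flag file can be created
mount -o remount,rw /
# create an empty /forcefsck file to request a filesystem check on next boot
> /forcefsck
reboot
# after reboot: show the fsck results
journalctl -t systemd-fsck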

EDIT: Ah, is sda2 the root/system drive or an external drive, e.g. for DietPi userdata? And if it's not the DataTraveler, is it a 2.5" drive, and if so, how is it powered?

AAS-Crypt commented 2 years ago

I only have the DataTraveler. Will try right now. Maybe the keyboard is locked only on the first tty.

MichaIng commented 2 years ago

Ah, right, it's sda, so first/only USB drive, I had a little mind outage 😅.

Saw the video just now; it looks like it doesn't even reach a point where fsck can run. I think you need to attach the stick to another Linux system and fix it there. Also check the content of the boot partition (the first partition, with FAT filesystem), which you can of course also do on Windows: it probably contains FSCK*-named files, fsck backups which indicate that some corruption has been found before.

While SD cards are known to wear fast, USB drives should actually be more robust 🤔. Is that a new stick or an older one?

AAS-Crypt commented 2 years ago

The Pi prints the same output as in the photo. The keyboard is locked. I already tried to unlock the USB drive in Windows and will try in Linux later on.

AAS-Crypt commented 2 years ago

Saw the video just now; it looks like it doesn't even reach a point where fsck can run. I think you need to attach the stick to another Linux system and fix it there. Also check the content of the boot partition (the first partition, with FAT filesystem), which you can of course also do on Windows: it probably contains FSCK*-named files, fsck backups which indicate that some corruption has been found before.

Yes, I found FSCK0000.REC.

While SD cards are known to wear fast, USB drives should actually be more robust 🤔. Is that a new stick or an older one?

Pretty new, it was bought in 2020.

AAS-Crypt commented 2 years ago

Here are the contents of the file:

G_HW_MODEL=4
G_HW_MODEL_NAME='RPi 4 Model B (aarch64)'
G_HW_ARCH=3
G_HW_ARCH_NAME='aarch64'
G_HW_CPUID=0
G_HW_CPU_CORES=4
G_DISTRO=6
G_DISTRO_NAME='bullseye'
G_ROOTFS_DEV='/dev/sda2'
G_HW_UUID='3dd5552d-fa1e-47a7-a519-c05dd9bc1238'
G_RASPBIAN=0
G_HW_ONBOARD_WIFI=1
G_HW_REVISION='d03114'
G_HW_PCB_REVISION=4
G_HW_MEMORY_SIZE=8192
G_HW_MANUFACTURER='Sony UK'

MichaIng commented 2 years ago

Here are the contents of the file:

Okay, that's /boot/dietpi/.hw_model, which isn't critical and is recreated automatically at reboot, so you can remove that one.

Pretty new, it was bought in 2020.

Was there a sudden power outage or unclean shutdown?

AAS-Crypt commented 2 years ago

Okay, that's /boot/dietpi/.hw_model, which isn't critical and is recreated automatically at reboot, so you can remove that one.

I will try on a Linux system, as Windows doesn't allow me to remove files from the drive.

Was there a sudden power outage or unclean shutdown?

Neither. Rather, after roughly 100 hours of uptime I could no longer interact with the Pi via either SSH or keyboard, and after a reboot it kept showing the image from above.

MichaIng commented 2 years ago

Okay, let's hope another Linux system can repair it. Then let's see whether it happens a second time, and if so, I suggest enabling persistent system logs, so that we can see what happened before the crash and when the first filesystem or I/O errors occurred.
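One generic way to get persistent system logs on a systemd-based system is a journald drop-in with `Storage=persistent`. The sketch below is an assumption about how this could be done, not the procedure the thread settled on: `enable_persistent_journal`, its optional root-prefix argument and the file name `persistent.conf` are made up for illustration. DietPi's own log system setting (chosen in dietpi-software) may keep /var/log in RAM, in which case that setting would need changing as well, which is not shown here.

```shell
#!/bin/sh
# Write a journald drop-in that requests on-disk log storage.
# $1 is an optional root prefix (hypothetical parameter) so the function
# can be tried against a scratch directory before touching the real system.
enable_persistent_journal() {
    root=${1:-}
    dir="$root/etc/systemd/journald.conf.d"
    # journald stores persistent logs below /var/log/journal.
    mkdir -p "$dir" "$root/var/log/journal" || return 1
    printf '[Journal]\nStorage=persistent\n' > "$dir/persistent.conf"
}

# On the real system (as root), followed by a journald restart:
#   enable_persistent_journal && systemctl restart systemd-journald
# After the next crash and reboot, show errors from the previous boot:
#   journalctl -b -1 -p err
```

Note that writing logs to a drive that is itself failing adds wear and may not capture the last moments before a lockup, so the result is a best-effort record.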

AAS-Crypt commented 2 years ago

Okay, will try to implement it and keep updating information.

AAS-Crypt commented 2 years ago

Just to be sure: by "repair", do you mean overriding the read-only option and removing "FSCK0000.REC"?

AAS-Crypt commented 2 years ago

My system mounts: image

I mounted both partitions on a Linux distro. The first partition was mounted automatically, but to make it writable I used this command:

mount -o remount,rw /dev/sdb1 /media/sdb1

I was able to mount the second partition successfully, but only in read-only mode:

mount -o noload,ro /dev/sdb2 /media/sdb2

When I then remount it read-write, I get the lines shown in the image below:

mount -o remount,rw /dev/sdb2 /media/sdb2

fsck runs successfully on sdb1 but aborts while running on sdb2.

image

MichaIng commented 2 years ago

Just to be sure: by "repair", do you mean overriding the read-only option and removing "FSCK0000.REC"?

Specifically, I mean running fsck on both partitions from an external Linux system. For fsck to run on the second partition, it must not be mounted. What error message does fsck return?
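The "must not be mounted" condition can be checked before fsck is started. This is a minimal sketch assuming the partition appears under its /dev name in /proc/mounts; `is_mounted` and `check_partition` are hypothetical helper names.

```shell
#!/bin/sh
# Return 0 if device $1 is listed as a mount source in the mounts table.
# $2 is the table to read and defaults to /proc/mounts.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Refuse to check a mounted partition; otherwise run fsck in preen mode.
check_partition() {
    if is_mounted "$1"; then
        echo "$1 is mounted; unmount it first: umount $1" >&2
        return 1
    fi
    fsck -p "$1"
}

# Example (as root, on the external Linux system):
#   check_partition /dev/sdb1
#   check_partition /dev/sdb2
```

Running fsck on a mounted filesystem can make corruption worse, which is why the helper refuses instead of unmounting silently.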

AAS-Crypt commented 2 years ago

While they are mounted:
sdb1: image
sdb2: image

While they are unmounted:
sdb1: image
sdb2: image

Joulinar commented 2 years ago

If possible, try to avoid screenshots. You should be able to copy the content directly from the SSH session.

MichaIng commented 2 years ago

Please try:

hdparm -r0 /dev/sdb2
hdparm -r /dev/sdb2 # check back
fsck -p /dev/sdb2

AAS-Crypt commented 2 years ago

Strangely, the clipboard is not working, so I cannot paste text from my VM here. I see this problem as urgent, so I will send the text once I figure out why it's not working. image

Joulinar commented 2 years ago

Is the drive mounted? I guess this is not necessary

//offtopic: Did you try using SSH? There you should be able to copy the stuff 😃

AAS-Crypt commented 2 years ago

I'm currently using Host-to-Machine networking. Later on I will change that and try to SSH in.

The drive has to be unmounted for fsck -p to be usable on it. While it is mounted, fsck cannot repair it.

MichaIng commented 2 years ago

Is it the same (no write protection flag) for the whole disk?

hdparm -r /dev/sdb
hdparm -r0 /dev/sdb

AAS-Crypt commented 2 years ago

root@dhcppc10:/home/demo# hdparm -r /dev/sdb

/dev/sdb:
  readonly          =   1 (on)
root@dhcppc10:/home/demo# hdparm -r0 /dev/sdb

/dev/sdb:
  setting readonly to 0 (off)
  readonly          =   0 (off)
root@dhcppc10:/home/demo# fsck -p /dev/sdb
fsck from util-linux 2.36.1
fsck.ext2: Read-only file system while trying to open /dev/sdb
Disk write-protected; use the -n option to do a read-only
check of the device.

That's the output

MichaIng commented 2 years ago

fsck -p /dev/sdb2

A drive cannot be checked, only a filesystem 😉.

But the error message indicates it jumped back to readonly:

hdparm -r /dev/sdb
hdparm -r0 /dev/sdb
sleep 1
hdparm -r /dev/sdb

If this is the case, it looks like the damage is so large that the kernel or controller kicks the drive into read-only mode automatically. Not sure how best to resolve this. Recreating the partition table might work.
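The "does the flag jump back" check can be scripted so that the result is a plain yes or no. This is a sketch under the assumption that `hdparm -r` prints a `readonly = N (on|off)` line as in the output pasted earlier in the thread; `parse_readonly` and `readonly_stays_cleared` are hypothetical helper names.

```shell
#!/bin/sh
# Print the read-only flag (0 or 1) found in `hdparm -r` output on stdin.
# The "setting readonly to 0 (off)" line is ignored because it does not
# start with "readonly".
parse_readonly() {
    sed -n 's/^[[:space:]]*readonly[[:space:]]*=[[:space:]]*\([01]\).*/\1/p' |
        tail -n 1
}

# Clear the flag on device $1, wait a second, then report whether it is
# still cleared (exit 0) or has jumped back to read-only (exit 1).
readonly_stays_cleared() {
    hdparm -r0 "$1" > /dev/null || return 2
    sleep 1
    [ "$(hdparm -r "$1" | parse_readonly)" = "0" ]
}

# Example (as root):
#   readonly_stays_cleared /dev/sdb && echo "writable" || echo "read-only again"
```

If the flag reverts on its own, further write attempts such as fsck -p will keep failing with the same "Read-only file system" error.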

AAS-Crypt commented 2 years ago

At this point I have no way of knowing whether recovery is even possible. I ran the commands, and the output says that the filesystem is read-only. Right now I have two options: first, I give you access to the VM; second, I proceed with a fresh install of DietPi.

MichaIng commented 2 years ago

One more attempt:

sfdisk --no-reread --no-tell-kernel -fN2 /dev/sdb <<< ',+'
partprobe /dev/sdb
partx -u /dev/sdb
sleep 1
hdparm -r0 /dev/sdb
fsck -p /dev/sdb2

There are probably other ways to try repairing the drive, but even if that succeeds, I guess some files are missing, which may not be obvious but can cause subtle issues later. So if the above doesn't work either, I'd go with a fresh install on a new/another USB stick and copy data/configs back from the old one. A USB stick should survive longer, but you probably got an unlucky piece. After all data has been copied off it, format it cleanly, after which it can probably be used again. That depends on whether there are really dirty bits (hardware damage) or it was just some deeper data damage.

AAS-Crypt commented 2 years ago

Yes, I tried these commands and still got no result. Hopefully the problem is limited to my memory stick. Thanks for your efforts in teaching me how to repair a broken drive. With this, I'll wipe the USB drive and install the freshly released DietPi.