IceWhaleTech / CasaOS

CasaOS - A simple, easy-to-use, elegant open-source Personal Cloud system.
https://casaos.io
Apache License 2.0

[Bug][local-storage] constantly have to do FSCK to check my os drive on reboots. #1104

Closed PeterRichardsWA closed 1 year ago

PeterRichardsWA commented 1 year ago

Describe the bug: I constantly have to run fsck on my OS drive on reboots. Even when a normal reboot is submitted and it's not a power-outage scenario, I have to do a manual disk check or the system hangs waiting for it and won't restart. Doing a remote reboot is a crapshoot as to whether the system will come back up, and when it doesn't, I have to go to my desk and do the disk check locally.

To Reproduce: It happens randomly but often; that's the issue. The only server service I have running is the plexmediaserver Docker container, and it will work for a while. Then CasaOS stops accepting network connections (a separate issue altogether) and I have to go to the web interface and do a reboot. I can't even do the reboot from the GUI on the local desktop because it doesn't respond. It's odd.

Expected behavior: Just reboot and not have disk issues like lost blocks, which is what I get about 9 out of 10 times I reboot.

Screenshots N/A

Desktop (please complete the following information): macOS on iMac Pro / MacBook Pro 2021 M1 Max

Logs

run following command to collect corresponding logs: The logs were a jumbled mess and very large. If you need them as a follow-up, I will provide them.


LinkLeong commented 1 year ago

Can you provide your CasaOS version and the Linux version your system is running?

kolunda commented 1 year ago

I am pretty sure I am having this same error. I don't know whether this is a bug in CasaOS or in the ZimaBoard it's installed on. It happens to me on CasaOS 0.4.3 running Debian 11 (Linux CasaOS 5.10.0-10-amd64 #1 SMP Debian 5.10.84-1 (2021-12-08) x86_64 GNU/Linux).

I can run journalctl -k and see that there is an mmc0: cache flush error -110 followed by an I/O error and the filesystem getting mounted read-only.
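For anyone checking for the same symptom, a minimal sketch of filtering the kernel log for these messages might look like the following. The function name `filter_storage_errors` and the exact patterns are assumptions for illustration; in particular, the wording of the read-only remount message depends on the filesystem and is not taken from the attached log.

```shell
#!/bin/sh
# Keep only kernel log lines that look like the symptoms described above:
# an mmc controller error, an I/O error, or a read-only remount.
# Reads log lines on stdin. The patterns are illustrative guesses.
filter_storage_errors() {
    grep -E 'mmc[0-9]+: .*error|I/O error|[Rr]emounting filesystem read-only'
}

# Usage on a live system (may need sudo to read the kernel journal):
#   journalctl -k --no-pager | filter_storage_errors
```

If the filter prints nothing, the read-only remount probably has a different cause than the one described here.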

If I run sudo fsck -fy / before rebooting, the filesystem will repair and come back up functioning for an indeterminate amount of time before going read-only again with filesystem corruption. If I don't and I reboot the system, it gets stuck at the grub bootloader until I run fsck again on the volume.
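Before rebooting remotely, it may help to confirm whether the root filesystem has already gone read-only. A rough sketch, assuming a Linux system that exposes `/proc/mounts`; the helper name `mount_mode` is made up for this example.

```shell
#!/bin/sh
# Print "ro" or "rw" for the given mount point.
# Reads /proc/mounts-style lines on stdin:
#   device mountpoint fstype options dump pass
# The first mount option is the ro/rw flag. If the mount point is listed
# more than once, the last entry wins. Prints nothing if it is not listed.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp { split($4, opts, ","); mode = opts[1] }
        END { if (mode != "") print mode }
    '
}

# Usage on a live system:
#   mount_mode / < /proc/mounts
# If this prints "ro", the filesystem has already been remounted read-only
# and the next reboot is likely to need the manual fsck described above.
```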

Attaching output of journalctl -k and journalctl -f. Attaching screenshot of iotop showing total writes to disk corresponding with running journalctl -f in other tab.

5.25.2023_journalctl.txt iotop

kolunda commented 1 year ago

After doing an apt-get dist-upgrade to upgrade the kernel and the docker packages 3 days ago, I have not seen this issue happen again. I am now running Linux CasaOS 5.10.0-22-amd64 #1 SMP Debian 5.10.178-3 (2023-04-22) x86_64 GNU/Linux.

zhanghengxin commented 1 year ago

@zhanghengxin

LinkLeong commented 1 year ago

Thank you for your feedback. I will close the issue; if in doubt, please re-open it.

PeterRichardsWA commented 10 months ago

Just as a follow-up, I ran some more tests. Sorry for the delayed response, but surgeries happen. Anyway, I found that my primary use for the ZimaBoard and CasaOS setup, running a family Plex server, is severely taxing on this setup. The Zima just does not come with enough boot drive space to make it workable as a solution with Plex installed vanilla through Docker if you have a large library of movies. Not only have I put all of our owned DVDs on there, but also all of our family videos of the kids, etc. This amounts to thousands of movies. Because of the way Plex manages encoding and on-the-fly transcoding, along with the metadata for the movies, it instantly ate the available space on the boot/install drive. That is a massive problem, as CasaOS is highly sensitive to not having the space it needs for its own management as well. However, I do blame CasaOS for not making this more foolproof and showing the appropriate warning messages before it literally crashes into read-only mode and makes Plex unusable.
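Until there is a built-in warning, a low-space check can be run by hand or from cron. This is only a sketch: the helper names `usage_percent` and `check_usage` and the 90% default threshold are arbitrary choices for this example, not CasaOS behavior.

```shell
#!/bin/sh
# Extract the use percentage (digits only) from `df -P <path>` output on stdin.
# With -P, df prints one header line and one data line per filesystem, and
# the fifth column of the data line is the capacity, e.g. "97%".
usage_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print a warning and return 1 when usage is at or above the limit.
# $1 = used percentage, $2 = limit (defaults to 90, an arbitrary value).
check_usage() {
    used="$1"
    limit="${2:-90}"
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: ${used}% used (limit ${limit}%)"
        return 1
    fi
    echo "OK: ${used}% used"
}

# Usage on a live system, checking the boot/install drive:
#   check_usage "$(df -P / | usage_percent)" 90
```

Running the same check against whichever directory holds the Plex data would show how fast the library metadata is growing.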

To fix this, I had to use a separate USB-attached SSD that holds the Plex Docker container and lets Plex do its magic on it. It doesn't seem to have slowed anything down, because both the USB connection and the SSD are fast enough. I had to reinstall the entire system, add the drive, and enter the patches needed to see the external 18TB exFAT drive that holds my movies and to let it run in read/write mode so Plex can add and delete movies on it. Which brings up the second issue: adding an external exFAT drive is a pain in the ass. Why? It's a very widely used format. I am not going to spend 3 days moving movies off to another drive, reformatting to CasaOS's preferred format, and then copying everything back. That's an expense I do not want to take on.

Since then, the system has been running pretty smoothly, but it has still locked up at times, forcing me to manually go down, plug in a monitor, and run fsck. Just today I did the update to the distribution. Hopefully the lockups will end after this. The only other issue is the lack of updates to the Docker container for Plex, as Plex has a fairly robust release schedule. I feel like the Docker container is always about 2-3 releases behind the actual app release. I know that is not a CasaOS team issue, or is it? I don't know where one ends and the other picks up for this.

So, in short:
Hardware: Needs a larger boot drive.
Software: The disk check at boot is hopefully fixed by moving the large app to a separate drive and updating the distro.
Docker: Needs to be updated more regularly.

Other than that, things seem to be back on track and I am not having to constantly babysit a server that is advertised as a server... Hopefully the Zima box, or whatever that thing is, has addressed this as well, because having a small board with cables spaghettied out all over the place is a hassle and a half too, but that's a different thread.

thanks

PeterRichardsWA commented 10 months ago

and spoke too soon.

Installed the update to the distro. For some reason it literally killed my Plex install inside CasaOS when I rebooted. I re-added it, and now the file system has gone into read-only mode, despite having the working dirs on a separate drive.

this is maddening.