1000001101000 / Debian_on_Buffalo

Tools for Installing/Running Debian on Buffalo ARM based Linkstation/Terastation/Kurobox/Cloudstor devices.

Boot Failed: Tried installing fancontrol onto a new LS220D Bullseye installation. It didn't go well. #154

Closed. mandeldebugger closed this issue 2 years ago

mandeldebugger commented 2 years ago

After noticing the fan running noisily at a constant 100%, I tried installing fancontrol with apt-get install fancontrol, following the instructions here. After the package downloaded, the installation drove the CPU to 100% and kept it there for two hours or so. I rebooted, but it never recovered. I powered it down and up again a few times: no ping responses, no OMV landing page. It appears to have left this world.

Edit: Tried some of the recovery methods listed here and here:

On hard reset the LED lights red; after I release the function button, the top LED flashes white for a minute or so, then stays off. The partition table must be borked.

I am left wondering - how on earth does apt-get install fancontrol do that to the OS? I've got USB <-> SATA cables arriving tomorrow to autopsy the disk and try to find out what happened.

1000001101000 commented 2 years ago

Good to hear from you!

I don’t have a specific theory about what happened. My limited experience with OMV involved it messing with network and disk configuration stuff in ways that broke the underlying OS on the next boot.

If the system were still online there are some things we could do to investigate the 100% CPU issue. I wouldn’t expect fancontrol to be the cause, but I also can’t be sure.

You could always connect the drives to a PC and look at logs/etc to try to figure out the startup issue. Failing that you could load the installer files again and start over with a fresh install.

mandeldebugger commented 2 years ago

I post-mortemed the disk and noticed that the files in the /boot partition were replaced, presumably by apt-get dist-upgrade or by something updating the boot image. Whatever it was made the drive unbootable so I won't be running that again!

Restoration steps were:

  1. Format the new drive(s) with FAT32.
  2. Load drives into both bays. If you load a drive into only one bay, the power LED will give 7 x red (fatal HDD controller error).
  3. Start the LS220D device.
  4. Run Nas Navigator 2.exe to check the presence of the device.
  5. Run LSUpdater.exe to update the firmware to v1.78 (or latest). This appears to be needed to correctly ...
  6. Reformat the drive(s) from step 1 to XFS.
  7. Run acp_commander.jar as above.

Nas Navigator 2.exe sees it again now, but only with both drives. I think you've highlighted this before @1000001101000 - the LS220D has to have both drives present in order to initiate the firmware restoration from first boot and the LS220D needs to have up to date .

mandeldebugger commented 2 years ago

My limited experience with OMV involved it messing with network and disk configuration stuff in ways that broke the underlying OS on the next boot.

Do you remember what triggered it?

That has to be the issue - if any OMV update process replaces files in the /boot directory (as I think I've observed), it's worth pointing out the culprit. I'm trying to work out whether it is the fancontrol OMV add-on or merely the OMV update process that makes it unbootable.

1000001101000 commented 2 years ago

I post-mortemed the disk and noticed that the files in the /boot partition were replaced,

What changed about them?

Do you remember what triggered it?

I do not; it was about 5 years ago.

I would expect any reformatting/repartitioning via OMV to risk messing up the install.

mandeldebugger commented 2 years ago

The boot images had been moved to .bak extensions (like uImage.bak etc) and newer versions had been put in.

I might need to go back and re-run some update things and see what breaks.

1000001101000 commented 2 years ago

That is normal behavior for kernel updates or anything that generates a new initrd. On its own it doesn't indicate an issue.
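To see which boot files an update replaced, a minimal sketch like the following lists every .bak file in a directory next to its current counterpart. The function name list_boot_backups is hypothetical, and the sketch assumes the backups use a plain .bak suffix as in uImage.bak; it only reads the directory and changes nothing.

```shell
# List each *.bak file in a boot directory and say whether a newer
# file with the same base name now sits beside it.
list_boot_backups() {
    boot_dir=${1:-/boot}
    for bak in "$boot_dir"/*.bak; do
        [ -e "$bak" ] || continue        # glob matched nothing
        current=${bak%.bak}              # strip the .bak suffix
        if [ -e "$current" ]; then
            echo "replaced: $current (previous copy kept as $bak)"
        else
            echo "missing: $current (only $bak exists)"
        fi
    done
}

list_boot_backups /boot
```

When the disk is attached to a PC instead, pass the mounted boot partition as the argument. A "missing" line would be more suspicious than a "replaced" one, since a replaced file is what a normal kernel or initrd update leaves behind.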

What version are you dist-upgrade-ing from/to?

mandeldebugger commented 2 years ago

What version are you dist-upgrade-ing from/to?

Yes, agreed, it should be normal behaviour, but it wasn't in this case. I'm reinstalling the OS now, then I'm going to do a vanilla apt update and apt dist-upgrade and reboot to see if I can replicate it.
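Before repeating the upgrade for real, it may help to preview which packages would touch the boot images. The sketch below filters the output of a simulated run (apt-get -s changes nothing on the system). The function name boot_related_changes and the choice of package prefixes (linux-image, initramfs-tools) are assumptions for illustration, not a complete list of what can regenerate an initrd.

```shell
# Read `apt-get -s dist-upgrade` output on stdin and print the names of
# packages that would be installed or upgraded and that are likely to
# rewrite the kernel or initrd.
boot_related_changes() {
    awk '$1 == "Inst" && $2 ~ /^(linux-image|initramfs-tools)/ { print $2 }'
}
```

On the device this would be used as apt-get -s dist-upgrade | boot_related_changes, after a normal apt update. An empty result suggests the upgrade would not replace the boot images on its own.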

1000001101000 commented 2 years ago

What version of Debian are you running?

What version are you trying to upgrade to?

mandeldebugger commented 2 years ago

What version of Debian are you running?

What version are you trying to upgrade to?

It is indeed a versioning problem.

I'm running your Debian 5.10.149-1 (2022-10-17) armv7l GNU/Linux image, but the earliest one OMV supports is Debian 6.

I note the safe method to update the kernel is apt-get upgrade linux-image-armmp. Is there a safe way to upgrade between your Debian images?

1000001101000 commented 2 years ago

Linux Kernel 5.10.149 is the current version for Debian 11 (Bullseye). I think you'll find that linux-image-armmp is the kernel that you have installed (that's the only one my installer uses for that device).
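The kernel release and the Debian release are separate numbers, and they can be checked side by side. The following is a rough sketch; show_versions is a hypothetical helper name, and the dpkg-query step assumes a Debian-based system (it is skipped elsewhere).

```shell
show_versions() {
    # Kernel release as reported by the running kernel.
    echo "Kernel: $(uname -r)"

    # Distribution name and version; sourced in a subshell so the
    # variables from os-release do not leak into the caller.
    if [ -r /etc/os-release ]; then
        ( . /etc/os-release; echo "Distribution: ${PRETTY_NAME:-unknown}" )
    fi

    # Kernel packages known to dpkg, with their status (ii = installed).
    if command -v dpkg-query >/dev/null 2>&1; then
        dpkg-query -W -f='${db:Status-Abbrev} ${Package} ${Version}\n' \
            'linux-image-*' 2>/dev/null || true
    fi
}

show_versions
```

On a Bullseye install the first line should show a 5.10.x kernel while the second names Debian 11, which is the distinction being made here.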

mandeldebugger commented 2 years ago

Linux Kernel 5.10.149 is the current version for Debian 11 (Bullseye). I think you'll find that linux-image-armmp is the kernel that you have installed (that's the only one my installer uses for that device).

Ah, I see. Yes, it is. This is a bit of a mystery because I installed the OMV version using these instructions and it was fine for a while until (I suspect) something happened while I did an 'apt-get install fancontrol'.

Reading around, it appears that some have had issues with OMV4 and OMV5 on the LS220D because of the low memory & CPU. I don't doubt it - it maxes out at 100% CPU quite a lot so it might not be very suitable for the device. If anyone's interested I might try again but for the time being I'm giving webmin a go instead.

Thanks for your help.

1000001101000 commented 2 years ago

Fun fact: I fixed a typo with those instructions a few years ago. https://github.com/openmediavault/openmediavault-docs/commit/57696085834e45aec2dddba07aaabe18bdbbe7bb

I know webmin can max out the CPU on older generation devices at strange times, but I think folks have used it on the LS220 successfully. I played with cockpit a while back and liked it, though it has far fewer features than the others.

My main advice is to learn how to do this stuff using command line tools but I also know that takes time.

mandeldebugger commented 2 years ago

Actually, I've changed my mind and decided to soldier on with OMV once more. I'm just finishing the installation. I'm also carefully documenting outputs I get.

I also did apt-get install fancontrol with no issues before installing OMV and I agree - I think they're probably unrelated.

I've already noted an issue with the OMV installer script when running omv-confdbadm populate, because /usr/sbin is not in the PATH, which is easily fixed by export PATH=$PATH:/usr/sbin. I'm about to reboot ...
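A slightly more defensive form of that PATH fix, as a sketch, appends the directory only when it is not already present, so running it repeatedly does not grow PATH. The helper name add_to_path is hypothetical.

```shell
# Append a directory to PATH only if it is not already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already on PATH, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

add_to_path /usr/sbin
export PATH
```

This only affects the current shell session; the plain export PATH=$PATH:/usr/sbin from above has the same effect for a one-off run of the installer.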