dnschneid / crouton

Chromium OS Universal Chroot Environment
https://goo.gl/fd3zc?si=1
BSD 3-Clause "New" or "Revised" License
8.56k stars 1.24k forks

Suspend/resume breaks chroot when running off external SDC card #288

Closed stsquad closed 2 years ago

stsquad commented 11 years ago

On my Chromebook Pixel it looks like a suspend/resume cycle leaves the chroot in an odd state.

04:14 alex@localhost/x86_64 [~] >echo "before suspend"
before suspend
04:14 alex@localhost/x86_64 [~] >echo "after suspend"
after suspend
-su: history: /home/alex/.bash_history: cannot create: Input/output error
11:15 alex@localhost/x86_64 [~] >

Looking at dmesg, it looks like there are ext4 errors on the SDC card, which force a remount and break the chroot (or possibly the SDC is just unmounted as part of the suspend cycle).

ss Generic STORAGE DEVICE 0207 PQ: 0 ANSI: 0
[15015.434666] sd 12:0:0:0: [sdf] 125827072 512-byte logical blocks: (64.4 GB/59.9 GiB)
[15015.435935] sd 12:0:0:0: [sdf] Write Protect is off
[15015.435946] sd 12:0:0:0: [sdf] Mode Sense: 0b 00 00 08
[15015.437146] sd 12:0:0:0: [sdf] No Caching mode page present
[15015.437156] sd 12:0:0:0: [sdf] Assuming drive cache: write through
[15015.441164] sd 12:0:0:0: [sdf] No Caching mode page present
[15015.441174] sd 12:0:0:0: [sdf] Assuming drive cache: write through
[15015.448779] sdf: sdf1
[15015.452132] sd 12:0:0:0: [sdf] No Caching mode page present
[15015.452142] sd 12:0:0:0: [sdf] Assuming drive cache: write through
[15015.452151] sd 12:0:0:0: [sdf] Attached SCSI removable disk
[15016.731447] EXT4-fs (sdf1): recovery complete
[15016.736860] EXT4-fs (sdf1): mounted filesystem with ordered data mode. Opts:
[15024.898066] wlan0: no IPv6 routers present
[15029.857907] EXT4-fs error (device sde1): ext4_find_entry:935: inode #3020490: comm bash: reading directory lblock 0
[15029.857941] EXT4-fs error (device sde1): ext4_read_inode_bitmap:171: comm bash: Cannot read inode bitmap - block_group = 368, inode_bitmap = 12058640
[15029.857954] EXT4-fs error (device sde1) in ext4_new_inode:895: IO failure
[15029.857993] EXT4-fs error (device sde1): ext4_find_entry:935: inode #2884078: comm bash: reading directory lblock 0
[15029.858071] EXT4-fs error (device sde1): ext4_find_entry:935: inode #2883586: comm bash: reading directory lblock 0
[15029.858090] EXT4-fs error (device sde1): ext4_find_entry:935: inode #2883586: comm bash: reading directory lblock 0
[15035.044833] Aborting journal on device sde1-8.
[15035.044845] Buffer I/O error on device sde1, logical block 7372800
[15035.044851] lost page write due to I/O error on sde1
[15035.044857] JBD2: Error -5 detected when updating journal superblock for sde1-8.
[15092.013341] EXT4-fs error (device sde1): ext4_find_entry:935: inode #3020490: comm bash: reading directory lblock 0
[15092.013363] EXT4-fs error (device sde1): ext4_find_entry:935: inode #2884078: comm bash: reading directory lblock 0
[15092.013395] EXT4-fs error (device sde1): ext4_find_entry:935: inode #2883587: comm bash: reading directory lblock 0
[15092.013410] EXT4-fs error (device sde1): ext4_find_entry:935: inode #2884078: comm bash: reading directory lblock 0
[15092.014147] EXT4-fs error (device sde1): ext4_journal_start_sb:328: Detected aborted journal
[15092.014160] EXT4-fs (sde1): Remounting filesystem read-only
[15094.359077] EXT4-fs error (device sde1): ext4_find_entry:935: inode #3020490: comm bash: reading directory lblock 0

stsquad commented 11 years ago

Looking at enter-chroot I see it does attempt to prevent powerd from disabling persistence of USB mounts. I wonder if it's the same sort of thing?

dnschneid commented 11 years ago

Yeah, enter-chroot tries to prevent powerd from disabling usb mount persistence, but it only seems to work part of the time for me.

Is it failing if you let the system idle suspend, or when you close the lid, or both?

stsquad commented 11 years ago

Both as far as I can tell - certainly the file-browser keeps popping up after both. Looking at the sed statement I don't think anything is being patched now, possibly the powerd script has been updated?

stsquad commented 11 years ago

@dnschneid yeah I reckon this upstream commit broke the crouton hack:

http://git.chromium.org/gitweb/?p=chromiumos/platform/power_manager.git;a=commitdiff;h=0007a546853a529064772b1b96bd9164dece8c46

dnschneid commented 11 years ago

Cool, then that should be fixable by adding the line to the script in a reasonable location.

stsquad commented 11 years ago

Even simpler I think we can just get away with enabling USB persist as we enter the chroot and skip the copy/mount powerd stuff as it's not going to attempt to turn it off anyway. I'll have a play.

dnschneid commented 11 years ago

Good call. We just need to make sure that patch has landed in the stable channel for all platforms before removing the copy/mount hack stuff.

dnschneid commented 11 years ago

Yep, looks like that patch has landed in stable.

dnschneid commented 10 years ago

This fix may have stopped working.

stsquad commented 10 years ago

I've noticed the file manager popping up (a sign the remount has been detected). I've yet to notice a crouton session getting killed.

mark0978 commented 10 years ago

On my Acer C710, it's happening all the time right now. The only way my chroot lives across sleep/resume is on the internal storage. This effectively renders the SD card useless.

phoenix00a commented 10 years ago

So, I've got to forget about using my SD Card and install crouton internally? Yikes. I've only got 16GB on my HP Pavilion 14 Chromebook..

mark0978 commented 10 years ago

Yeah, it sucks. Same on the Acer C710.

dnschneid commented 10 years ago

Yeah, major suck. I've looked through kconfig commit logs to see if anything other than the default persistence setting was changed, but I didn't find much, assuming this isn't buggy. I'll need to ask one of the kernel guys what's stopping USB persist from working.

Of course, I've done this checking without actually confirming that the crouton code is still doing what it's supposed to be doing. If anyone wants to check it before me, make sure the echo 1 line in enter-chroot is getting called, and the echo 0 line in unmount-chroot isn't being called at an incorrect time.
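A minimal sketch for checking the first half of that by hand, assuming the persist knobs live under /sys/bus/usb/devices/*/power/persist (the same sysfs path used in the loop posted later in this thread; verify it on your build). The function name and the optional directory argument are made up for illustration.

```shell
#!/bin/sh
# Print the current USB persist setting for every device that exposes one.
# The base directory is a parameter only so the function can be tried on a
# scratch directory; on a real system leave it at the default.
show_usb_persist() {
    base="${1:-/sys/bus/usb/devices}"
    for usbp in "$base"/*/power/persist; do
        # The glob stays literal if nothing matches, so test for existence.
        [ -e "$usbp" ] || continue
        printf '%s %s\n' "$usbp" "$(cat "$usbp")"
    done
}
```

Running show_usb_persist once after entering the chroot and again after a suspend/resume cycle should show whether the value 1 written by enter-chroot is still in place.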

dnschneid commented 10 years ago

Try the latest crouton; I restructured the mount system to fix another issue, and I think it might have resolved this as well.

dnschneid commented 10 years ago

Hmm, it looks like it's persisting, but there are journal errors upon resume so it gets remounted RO...unless that's just my SD card.

dnschneid commented 10 years ago

Okay, something's wrong with superblock access which is causing the device to be error-mounted RO when replaying the journal. I'll investigate what is causing this in Chromium OS. In the meantime, you have two options:

  1. Set the partition to ignore errors and mount anyway. This is such a terrible idea that I'm not even going to post the command here.
  2. Disable journaling. This significantly increases the risk of data loss when things go poorly, so it's still a pretty bad idea. Run sudo umount /dev/sdb1 && sudo tune2fs -O ^has_journal /dev/sdb1 and then unplug and re-plug the SD card. You can re-enable journaling at a future date.

dnschneid commented 10 years ago

Relevant bug. Star it if you're affected.

tocker commented 10 years ago

Is it possible to manually remount the sdcard as RW? I tried a few variations of mount but none worked.

mark0978 commented 10 years ago

Yes, you can remount it. But the suspend/remount destroys any hope of recovery from sleep.

NullVoxPopuli commented 10 years ago

I am experiencing this issue on the HP11.

Other issues I've noticed that I think may be related:

my uname -a from chrome OS: Linux localhost 3.4.0 #1 SMP Tue Feb 18 23:28:43 PST 2014 armv7l ARMv7 Processor rev 4 (v7l) SAMSUNG EXYNOS5 (Flattened Device Tree) GNU/Linux

dnschneid commented 10 years ago

The relevant crbug has a request for someone to reproduce and provide an event log. Any takers?

scosol commented 10 years ago

I will try to reproduce after I can install my own chroot again: https://github.com/dnschneid/crouton/issues/711

dnschneid commented 10 years ago

Thanks for volunteering; #712 should be merged tomorrow. Autotest went down while I was away, and I can't merge without running them through the tests.

scosol commented 10 years ago

Is this still relevant? Reading through the months-long CR comments I can't tell exactly what the next step is-

Upon Suspend, everything should sync and flush, because even in a maintained-memory state there's no way of knowing if/when things will come back on, and the FS needs to be clean and consistent outside of that- the only caveat to that model is if there's a persistent heavy I/O job that makes it not actually ever Suspend. Thankfully in Chromeland we don't have to deal with HW RAID devices that like to lie about their state. I'm re-croutoning my USB stick right now and can then test, but is there anything else that would be useful beyond dmesg output?

mark0978 commented 10 years ago

I think it is still relevant. You expect a machine with an SD card in use to behave the same as the internal drive behaves. I can suspend and resume into my crouton with no issue, unless that crouton is on an external SD card. There is NO good reason for this behavior. My guess is that it was too hard to deal with suspend/resume on an external device and someone took the shortcut of unmount/remount, something they DO NOT DO with the internal drive. So, yes, it is relevant: it impacts my ability to use a device that has 16GB internally and 32GB externally.

tedm commented 10 years ago

Since the chroot can't run independent of the Chromium OS host, I can't understand why a chroot running under the Chromium OS host should have independent power, input, display, i/o settings, if it risked the stability of the host.

xscreensaver, for example, causes all kinds of problems, trying to put the chroot asleep or awake, without knowing the state of the host.

DennisLfromGA commented 10 years ago

Not a total solution by any means, but you can uninstall xscreensaver. I've had to do it myself on a chroot with multiple DEs, each having its own screensaver.

mark0978 commented 10 years ago

Maybe the problem doesn't belong to Crouton, but I bet the guys doing Crouton know the guys doing ChromeOS a lot better than the ChromeOS users do. Maybe this kind of feedback needs to go upstream. It does impact the ability to use ChromeOS and Crouton.

Without Crouton, I have no need for ChromeOS, I'd just use android. Crouton gives me a workable solution for travel in a form factor I'm willing to risk losing/having destroyed. Not going to take my $2K+ laptop places I would take my $250 Chromebook.

scosol commented 10 years ago

Let's remember that by definition, crouton is userland, which provides both the benefits and the deficiencies of relying upon the G-approved Kernel. This is Linux on Desktop finally actually playing a hand, and I'm here to make sure it finally sticks.

TedM- my suggestions around an rc-local kind of mindscape aren't meant to disable the competency of the kernel at all; instead it's more meant to achieve a user-level "known environment" that can become quickly familiar, regardless of whatever Mother-G forces on the next update. Your "send it upstream" sentiment is I think precise- engineers build things according to their own mind-model, which is often in conflict with the actual userbase and UI/UX, and we the actual users are here to keep things in check- so how far upstream do you want to go? Everything and anything can be forked on to a different path with a few keystrokes and a driven mind, but I sincerely believe that's not the best course forward for now-

dnschneid commented 10 years ago

I suggest looking at my comment in the crbug and see if you see the same thing upon resume.

dnschneid commented 10 years ago

Here's a quick hack to try (as noted in the crbug):

  1. Copy /usr/bin/powerd_suspend to /usr/local/bin/powerd_suspend.
  2. Edit /usr/local/bin/powerd_suspend to add the following on the line immediately after the call to sync: for m in /media/removable/*; do [ -d "$m" ] && mount -o remount,ro "$m"; done
  3. Add the following after the initctl call: for m in /media/removable/*; do [ -d "$m" ] && mount -o remount,rw "$m"; done
  4. Finally, in crosh, bind mount over /usr/bin/powerd_suspend: sudo mount --bind /usr/local/bin/powerd_suspend /usr/bin/powerd_suspend
  5. Do your suspend and resume and see if things don't break.
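The two loops above differ only in the mount option, so they can be factored into one helper. This is a minimal sketch, assuming POSIX sh; the function name remount_removable and the MOUNT and MEDIA_DIR overrides are hypothetical and exist only so the loop can be dry-run before it is pasted into a copy of powerd_suspend.

```shell
#!/bin/sh
# Remount every directory under /media/removable with the given mode.
# MOUNT and MEDIA_DIR are illustrative overrides (not part of powerd_suspend):
# set MOUNT=echo to print the mount arguments instead of running mount.
remount_removable() {
    mode="$1"   # "ro" before suspend, "rw" after resume
    for m in "${MEDIA_DIR:-/media/removable}"/*; do
        [ -d "$m" ] && ${MOUNT:-mount} -o "remount,$mode" "$m"
    done
    return 0
}
```

With MOUNT=echo set, remount_removable ro prints one "-o remount,ro <directory>" line per directory, which makes it easy to confirm the loop sees the card before trying a real suspend.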

mark0978 commented 10 years ago

I will give it a try this evening.

tocker commented 10 years ago

David, I gave your hack a try and so far it seems to work!

* Update * Nope, still failing...

(If you need logs, etc. let me know and I'll post it here.)

Thanks!

yisheng-on-linux commented 10 years ago

How's the progress on this? I just posted a long list of my experiences at https://code.google.com/p/chromium/issues/detail?id=208380 outlining what happens when, so I won't repeat it here unless asked. I think it's a powerd issue, as you can see. If you can manage the power drain (I can go a day like this, so it's not too bad, but it does get warm in a bag), do this before starting crouton: sudo stop powerd. As you will see, the usb drive keeps the drain up even when closed, so you aren't really losing much more than now, but it will save all those pesky fs errors...

forgot to mention another side benefit - this disables the power button too, so (at least for me) I don't have to worry about picking it up while open and laying my thumb on the power button and shutting it down while I blink.... for the hundredth time...

faddah commented 9 years ago

@dnschneid -

i have seen the same thing, as recently as today, 11-Oct.2014. i have ubuntu linux running off an sd card (using your -p option/switch in crouton) which i had named "linux." this morning on a reboot, it's now called SD\ Card - and i never renamed it.

i also posted this in the related issue #445.

best,

-- faddah wolf portland, oregon, u.s.a. github.com/faddah

Jackhford commented 9 years ago

I don't know if this is relevant to the topic, but I had to do a reinstall. I'm trying to finish the virtualbox setup, but I can't create rc.local in /etc, because I get the filesystem is read-only error. I hope someone has an easy solution. Thanks

hsharrison commented 9 years ago

I think the only workaround is to reboot ChromeOS, then run sudo stop powerd in crosh to prevent sleep. It's the only thing that's worked for me.

It's too bad, I just "upgraded" to a newer Chromebook from one where you could swap out the SSD for something bigger. I thought I would be fine with the SD card. I was wrong.

ee7klt commented 9 years ago

/usr/bin/powerd_suspend doesn't exist for me. do i just create one and add the hack to it?

dylanPowers commented 9 years ago

Is there a way to stop the system from sleeping without killing the ability to change brightness?

dnschneid commented 9 years ago

If the lid is left open, you can use croutonpowerd -i to inhibit going to sleep. To avoid the lid close, you have to disable the lid action which I believe requires sending a protobuf via dbus to powerd. It may be possible to pregenerate the protobuf data and then send the blob, but I haven't looked into it.

grobbie commented 9 years ago

I was able to get suspend / resume working without having the sd card go readonly on resume.

I moved the chroot user home directory onto the sd card using rsync in order to save space and moved the chroot back onto the ssd.

sudo rsync -aXS --exclude='/*/.gvfs' /home/. /media/removable/<sdcard>/chroot/home/.

I edited the chroot's /etc/passwd and set my home directory to the path on the sd card. I removed the home directory in the chroot and migrated the chroot back to the ssd drive in the chromebook.

NB. I had a problem running IntelliJ from the sd card, since ChromeOS mounts it with noexec. Remounting got rid of it.

sudo mount -o remount,exec /dev/sdb1 /var/host/media/removable/<sdcard>

I then followed the instruction dnschneid gave already:

Add the following after the initctl call: for m in /media/removable/*; do [ -d "$m" ] && mount -o remount,rw "$m"; done. Finally, in crosh, bind mount over the /usr/bin/powerd_suspend: sudo mount --bind /usr/local/bin/powerd_suspend /usr/bin/powerd_suspend. Do your suspend and resume and see if things don't break.

Seems to suspend/resume predictably. Certainly not perfect, but since the ChromeOS kernel now prevents usb-persist "in order to save 500ms", no other choice.
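For the /etc/passwd step described above, a minimal sketch of one way to change a user's home directory field without editing the file by hand. It assumes a standard colon-separated passwd file where the home directory is field 6; the function name and its arguments are made up for illustration, and the path to use depends on where the card is visible from inside your chroot.

```shell
#!/bin/sh
# Print a passwd-format file with one user's home directory (field 6) replaced.
# Nothing is modified in place; redirect the output to a new file and review it
# before copying it over the chroot's /etc/passwd.
# Note: awk -v interprets backslash escapes, so avoid backslashes in the path.
set_passwd_home() {
    passwd_file="$1"; user="$2"; new_home="$3"
    awk -F: -v OFS=: -v u="$user" -v h="$new_home" \
        '$1 == u { $6 = h } { print }' "$passwd_file"
}
```

For example, set_passwd_home /etc/passwd youruser "<home path on the sd card>" > /tmp/passwd.new writes a candidate file that can be compared against the original with diff before it replaces anything.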

erinzm commented 9 years ago

Could I recompile my ChromeOS kernel with usb-persist enabled?

ee7klt commented 9 years ago

hi grobbie, can you please elaborate on "I edited the chroot's /etc/passwd and set my home directory to the path on the sd card. I removed the home directory in the chroot and migrated the chroot back to the ssd drive in the chromebook."? In particular, can you provide sample code for each step of the way? thanks.

yisheng-on-linux commented 9 years ago

I've been running with powerd off so I've been away from this a while, but I went back and tried a few things and the latest update(s) seem to have changed things!

I will keep testing, but I did add the following in my own startup script before doing enter-chroot. Testing on another machine without that seems to say it isn't necessary though... I have done none of the ...bin/powerd_suspend changes or moved any files to the internal drive either.

I hope I'm not missing something here.....

added before doing enter-chroot, but may remove it soon:

for usbp in /sys/bus/usb/devices/*/power/persist
do
  if [ -e "$usbp" ]
  then
    echo 1 > "$usbp"
  fi
done

erinzm commented 9 years ago

@dnschneid How is usb-persist being disabled affecting the MMC/SD card reader? The SD card reader, to my knowledge, isn't USB-based.

dnschneid commented 9 years ago

@ArchimedesPi depends on the platform. It's USB-based on most Intel platforms. On ARM platforms it's usually implemented by an integrated SDIO controller.
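One way to check which case applies on a given machine is to look at where the card's block device hangs in sysfs. This is a rough sketch, assuming the usual Linux conventions that USB-attached storage resolves to a sysfs path containing a usbN bus component and that native SD/MMC controllers expose mmcblkN devices; the function name and the classification strings are made up for illustration.

```shell
#!/bin/sh
# Classify a block device by its resolved sysfs path.
# Pass the output of: readlink -f /sys/block/<device>
classify_card_reader() {
    case "$1" in
        */usb[0-9]*) echo "usb" ;;       # reader sits behind a USB host controller
        */mmcblk[0-9]*) echo "mmc" ;;    # native SD/MMC host controller
        *) echo "unknown" ;;
    esac
}
```

For example, classify_card_reader "$(readlink -f /sys/block/sdb)" would report whether the usb-persist setting is even relevant to that card slot; the device name sdb is only a placeholder.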

erinzm commented 9 years ago

OK, so what's the best way to get around this bug for SD cards? I tried some other solutions, but they all seem to work only on USB sticks.

Any ideas @dnschneid?

jjg commented 9 years ago

What's the current work-around for this? I read through these comments and the posts on the Chromium bug tracker but it's not clear to me what can be done about it at this point.

When this happens to me (for example, when I close the lid) I get the "Whoa!" warning when I open the lid back up, but worse I can't see the SD Card again until I reboot. My old Chromebook (HP 11, ARM, the one with the recalled power supply) did this but when I opened the lid the flash storage (a USB drive) would come back and could kill off my Crouton terminals and restart them, but on the new machine (Lenovo Thinkpad 11e Yoga Chromebook, Intel) I can't get any removable storage that was plugged in during suspend to come back (SD Card or USB).

For now I'm running Crouton off internal storage and looking into ways I can increase that storage, but ideally I'd be able to go back to running Crouton off removable flash. I can't help but feel like there should be a way to treat an SD Card the same way the internal flash storage is treated during suspend, but perhaps it's just a hard reality of how the removable media bus is set up.

Is disabling suspend the only solution at this time?

jjg commented 9 years ago

Looks like there's another bug report that might help resolve this:

https://code.google.com/p/chromium/issues/detail?id=434372