divx118 / crouton-packages

Kernel-headers packages to use with crouton

Change-kernel-flags trashes ChromeOS, resulting in a boot loop #69

Open hypevhs opened 6 years ago

hypevhs commented 6 years ago

Note: My Chromebook ID is CAVE.

I use VirtualBox and CIFS modules, so I disable module locking using change-kernel-flags. Until now the script worked fine, although I had to re-run it every time my device got an update on the beta channel. A recent update, however, caused the script to cripple the Chromebook instead. Even without any crouton environment installed, and from a fresh install of ChromeOS, the following steps are enough to cause a boot loop. From there, you must use the Chromebook Recovery Utility, which wipes the device.

  1. Get a fresh install of the Beta channel
  2. Get sudo working
  3. Download the script and run sudo sh ./change-kernel-flags
  4. Reboot normally
  5. Get the "OS verification is OFF, press space to re-enable" warning screen
  6. Either press CTRL+D or wait 30s
  7. Return to step 5!

Any suggestions?

divx118 commented 6 years ago

@libjared Thanks for reporting. I just tried to reproduce this on my work Acer Chromebook (LARS) on the dev channel: Version 64.0.3270.0 (Official Build) dev (64-bit)

chronos@localhost / $ uname -a
Linux localhost 3.18.0-16327-g65e8f0196248 #1 SMP PREEMPT Thu Nov 16 03:24:10 PST 2017 x86_64 Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz GenuineIntel GNU/Linux

However, I followed the newly updated https://github.com/divx118/crouton-packages/blob/master/README.md, since wget is gone and executing a script with sh will not be allowed anymore. For me it still worked with no problems. Can you please post your ChromeOS version and the output of uname -a? I would also like to see the output of sudo change-kernel-flags -i; this does nothing to your kernel and only prints information about it. Thanks.

hypevhs commented 6 years ago

I'm on stable right now; virtualization and module loading are OK. I will post again when the beta channel is installed. In the meantime:

Cave, Stable Channel

chronos@localhost / $ uname -a
Linux localhost 3.18.0-16037-gf59ef0b48a68 #1 SMP PREEMPT Mon Nov 13 16:33:49 PST 2017 x86_64 Intel(R) Core(TM) m3-6Y30 CPU @ 0.90GHz GenuineIntel GNU/Linux
chronos@localhost ~/Downloads $ cat /proc/cmdline 
cros_secure console= loglevel=7 init=/sbin/init cros_secure oops=panic panic=-1 root=/dev/dm-0 rootwait ro dm_verity.error_behavior=3 dm_verity.max_bios=-1 dm_verity.dev_wait=1 dm="1 vroot none ro 1,0 3584000 verity payload=PARTUUID=e7a62fb3-f0c0-d74a-8de8-b60482f442dc/PARTNROFF=1 hashtree=PARTUUID=e7a62fb3-f0c0-d74a-8de8-b60482f442dc/PARTNROFF=1 hashstart=3584000 alg=sha1 root_hexdigest=b88364180551536f6fb39be949c71e323b989902 salt=254478eebd08e24b94c7205d2f5b54b561d95691fa489aa1fad577dbaf8086fc" noinitrd vt.global_cursor_default=0 kern_guid=e7a62fb3-f0c0-d74a-8de8-b60482f442dc add_efi_memmap boot=local noresume noswap i915.modeset=1 tpm_tis.force=1 tpm_tis.interrupts=0 nmi_watchdog=panic,lapic intel_idle.max_cstate=7   lsm.module_locking=0 disablevmx=off
chronos@localhost ~/Downloads $ sudo change-kernel-flags -i
Check if the kernel partitions are signed with vbpubk:

Saving verbose log as /tmp/debug_vboot_1nZguiiB9/noisy.log
Extracting BIOS components...
Pulling root and recovery keys from GBB...
Verify firmware A with root key: OK
  TPM=0x00010001, this=0x00010001
Verify firmware B with root key: OK
  TPM=0x00010001, this=0x00010001
Examining kernels...
Kernel /dev/mmcblk0p2: OK
  Verify /dev/mmcblk0p2 with kern_subkey_A.vbpubk: FAILED
  Verify /dev/mmcblk0p2 with kern_subkey_B.vbpubk: FAILED
  Verify /dev/mmcblk0p2 with recoverykey.vbpubk: FAILED
Kernel /dev/mmcblk0p4: OK
  Verify /dev/mmcblk0p4 with kern_subkey_A.vbpubk: FAILED
  Verify /dev/mmcblk0p4 with kern_subkey_B.vbpubk: FAILED
  Verify /dev/mmcblk0p4 with recoverykey.vbpubk: FAILED
Kernel /dev/mmcblk0p6: FAILED

Show kernel details about the installed kernels.

ERROR: Unable to stat /dev/mmcblk02: No such file or directory
ERROR: Unable to stat /dev/mmcblk04: No such file or directory
ERROR: Unable to stat /dev/mmcblk06: No such file or directory

crosssystem dev_boot* flags settings:

dev_boot_usb           = 1                              # Enable developer mode boot from USB/SD (writable)
dev_boot_legacy        = 1                              # Enable developer mode boot Legacy OSes (writable)
dev_boot_signed_only   = 0                              # Enable developer mode boot only from official kernels (writable)
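
A quick way to confirm whether the two flags are active on a given boot is to look for them at the end of /proc/cmdline, as in the stable-channel output above. The following is a minimal sketch, assuming a POSIX shell; `has_flag` is a hypothetical helper written for illustration and is not part of change-kernel-flags.

```shell
#!/bin/sh
# has_flag CMDLINE FLAG
# Hypothetical helper: succeeds when FLAG appears as a whole
# space-separated word in CMDLINE.
has_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Read the running kernel's command line (empty if /proc/cmdline is unreadable).
cmdline=$(cat /proc/cmdline 2>/dev/null || true)

# Report on the two flags discussed in this thread.
for flag in lsm.module_locking=0 disablevmx=off; do
    if has_flag "$cmdline" "$flag"; then
        echo "$flag: present"
    else
        echo "$flag: absent"
    fi
done
```

Matching on whole words avoids a false positive when one flag is a prefix of another, for example lsm.module_locking=0 against a longer value.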

hypevhs commented 6 years ago

I was worried it was going to ask me to powerwash when moving to beta; I guess not.

Cave, beta channel

chronos@localhost / $ uname -a
Linux localhost 3.18.0-16283-gde9f7f8596fa #1 SMP PREEMPT Tue Nov 14 22:13:47 PST 2017 x86_64 Intel(R) Core(TM) m3-6Y30 CPU @ 0.90GHz GenuineIntel GNU/Linux
chronos@localhost / $ cat /proc/cmdline 
cros_secure console= loglevel=7 init=/sbin/init cros_secure oops=panic panic=-1 root=/dev/dm-0 rootwait ro dm_verity.error_behavior=3 dm_verity.max_bios=-1 dm_verity.dev_wait=1 dm="1 vroot none ro 1,0 3584000 verity payload=PARTUUID=d8d07334-bf53-d249-92f4-834bf865aa2b/PARTNROFF=1 hashtree=PARTUUID=d8d07334-bf53-d249-92f4-834bf865aa2b/PARTNROFF=1 hashstart=3584000 alg=sha1 root_hexdigest=0235172128521cfe3845798c7687819f88beac24 salt=79cdd933ba5cee3dce2d256e722df3e3867481f81d8aa94b9f054eaa7113bfd3" noinitrd vt.global_cursor_default=0 kern_guid=d8d07334-bf53-d249-92f4-834bf865aa2b add_efi_memmap boot=local noresume noswap i915.modeset=1 tpm_tis.force=1 tpm_tis.interrupts=0 nmi_watchdog=panic,lapic intel_idle.max_cstate=7  
chronos@localhost / $ sudo change-kernel-flags -i
Check if the kernel partitions are signed with vbpubk:

Saving verbose log as /tmp/debug_vboot_1d7QnnQiB/noisy.log
Extracting BIOS components...
Pulling root and recovery keys from GBB...
Verify firmware A with root key: OK
  TPM=0x00010001, this=0x00010001
Verify firmware B with root key: OK
  TPM=0x00010001, this=0x00010001
Examining kernels...
Kernel /dev/mmcblk0p2: OK
  Verify /dev/mmcblk0p2 with kern_subkey_A.vbpubk: OK
    TPM=0x00010001 this=0x00010001
  Verify /dev/mmcblk0p2 with kern_subkey_B.vbpubk: OK
    TPM=0x00010001 this=0x00010001
  Verify /dev/mmcblk0p2 with recoverykey.vbpubk: FAILED
Kernel /dev/mmcblk0p4: OK
  Verify /dev/mmcblk0p4 with kern_subkey_A.vbpubk: FAILED
  Verify /dev/mmcblk0p4 with kern_subkey_B.vbpubk: FAILED
  Verify /dev/mmcblk0p4 with recoverykey.vbpubk: FAILED
Kernel /dev/mmcblk0p6: FAILED

Show kernel details about the installed kernels.

ERROR: Unable to stat /dev/mmcblk02: No such file or directory
ERROR: Unable to stat /dev/mmcblk04: No such file or directory
ERROR: Unable to stat /dev/mmcblk06: No such file or directory

crosssystem dev_boot* flags settings:

dev_boot_usb           = 1                              # Enable developer mode boot from USB/SD (writable)
dev_boot_legacy        = 1                              # Enable developer mode boot Legacy OSes (writable)
dev_boot_signed_only   = 0                              # Enable developer mode boot only from official kernels (writable)

divx118 commented 6 years ago

@libjared Going from stable to beta to dev requires no powerwash; a powerwash is only needed when going back from dev to beta to stable.

divx118 commented 6 years ago

@libjared Something is indeed wrong and I need to investigate further. I had to restore the backup kernel that the script creates, from a Linux distro, to make my ChromeOS bootable again. Yesterday it restarted fine after running the script, but I must admit I didn't check whether it booted the right kernel; ChromeOS is smart enough to boot the second one when the first one fails. This morning, however, I got a black screen. After putting the kernel back, ChromeOS starts fine again.

divx118 commented 6 years ago

OK, I think I narrowed it down. The script adds two flags: lsm.module_locking=0 and disablevmx=off. The first reboot goes fine for me, but when I close the lid ChromeOS goes to standby, and my display stays black after opening the lid or on the next reboot. The only way back is to dd the backup kernel from a Linux distro. I then tried sudo change-kernel-flags -f "lsm.module_locking=0", which adds only lsm.module_locking=0, and that works fine. For now I will remove the disablevmx=off flag from the script and see whether anything in the latest ChromeOS kernel commits could be causing this. I will also add a small guide to the README.md on how to dd the kernel back from a Linux distro. @libjared Let me know if sudo change-kernel-flags -f "lsm.module_locking=0" works for you on the dev channel.
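
The one-flag-at-a-time approach described above can be written down as a small dry run. This is only a sketch: `print_isolation_plan` is a hypothetical helper that prints the commands instead of running them, because each one needs a reboot (and a usable backup kernel) before the next is tried. The -f option is used exactly as it appears in this thread; nothing else about the script's interface is assumed.

```shell
#!/bin/sh
# Hypothetical helper: print one change-kernel-flags invocation per flag
# so that each flag can be applied and tested on its own.
print_isolation_plan() {
    step=1
    for flag in "$@"; do
        printf 'Step %d: sudo change-kernel-flags -f "%s"\n' "$step" "$flag"
        step=$((step + 1))
    done
}

# The two flags the script adds by default, according to this thread.
print_isolation_plan lsm.module_locking=0 disablevmx=off
```

After each step, reboot and also test suspend and resume by closing the lid, since that is where the black screen showed up.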

hypevhs commented 6 years ago

Cave, developer channel

Saving verbose log as /tmp/debug_vboot_4yAj11IFy/noisy.log
Extracting BIOS components...
Pulling root and recovery keys from GBB...
Verify firmware A with root key: OK
  TPM=0x00010001, this=0x00010001
Verify firmware B with root key: OK
  TPM=0x00010001, this=0x00010001
Examining kernels...
Kernel /dev/mmcblk0p2: OK
  Verify /dev/mmcblk0p2 with kern_subkey_A.vbpubk: OK
    TPM=0x00010001 this=0x00010001
  Verify /dev/mmcblk0p2 with kern_subkey_B.vbpubk: OK
    TPM=0x00010001 this=0x00010001
  Verify /dev/mmcblk0p2 with recoverykey.vbpubk: FAILED
Kernel /dev/mmcblk0p4: OK
  Verify /dev/mmcblk0p4 with kern_subkey_A.vbpubk: OK
    TPM=0x00010001 this=0x00010001
  Verify /dev/mmcblk0p4 with kern_subkey_B.vbpubk: OK
    TPM=0x00010001 this=0x00010001
  Verify /dev/mmcblk0p4 with recoverykey.vbpubk: FAILED
Kernel /dev/mmcblk0p6: FAILED

Show kernel details about the installed kernels.

ERROR: Unable to stat /dev/mmcblk02: No such file or directory
ERROR: Unable to stat /dev/mmcblk04: No such file or directory
ERROR: Unable to stat /dev/mmcblk06: No such file or directory

crosssystem dev_boot* flags settings:

dev_boot_usb           = 1                              # Enable developer mode boot from USB/SD (writable)
dev_boot_legacy        = 1                              # Enable developer mode boot Legacy OSes (writable)
dev_boot_signed_only   = 0                              # Enable developer mode boot only from official kernels (writable)

Will now attempt to add the `lsm.module_locking` argument only.

hypevhs commented 6 years ago

It boots fine after running this command:

chronos@localhost / $ sudo change-kernel-flags -f "lsm.module_locking=0"
Password: 
/tmp/change-kernel-flags.xtT.4
make_dev_ssd.sh: INFO: Saving Kernel B config to /tmp/change-kernel-flags.xtT.4
make_dev_ssd.sh: INFO: (Kernels have not been resigned.)

Kernel flags added or changed are:
" lsm.module_locking=0"

Full cmdline is:
console= loglevel=7 init=/sbin/init cros_secure oops=panic panic=-1 root=/dev/dm-0 rootwait ro dm_verity.error_behavior=3 dm_verity.max_bios=-1 dm_verity.dev_wait=1 dm="1 vroot none ro 1,0 3584000 verity payload=PARTUUID=%U/PARTNROFF=1 hashtree=PARTUUID=%U/PARTNROFF=1 hashstart=3584000 alg=sha1 root_hexdigest=33295c8564db6832d6f896016f80afce3f4bfaba salt=a22a16089864b6d5dd580949db6db5393ddd8dd79e1cefffb00309ae44dbf516" noinitrd vt.global_cursor_default=0 kern_guid=%U add_efi_memmap boot=local noresume noswap i915.modeset=1 tpm_tis.force=1 tpm_tis.interrupts=0 nmi_watchdog=panic,lapic intel_idle.max_cstate=7   lsm.module_locking=0

Do you want to apply those changes (y/N)?y
 make_dev_ssd.sh: INFO: Kernel B: Replaced config from /tmp/change-kernel-flags.xtT.4
make_dev_ssd.sh: INFO: Backup of Kernel B is stored in: /mnt/stateful_partition/backups/kernel_B_20171126_120921.bin
make_dev_ssd.sh: INFO: Kernel B: Re-signed with developer keys successfully.
make_dev_ssd.sh: INFO: Successfully re-signed 1 of 1 kernel(s)  on device /dev/mmcblk0.
dev_boot_usb           = 1                              # Enable developer mode boot from USB/SD (writable)
dev_boot_legacy        = 1                              # Enable developer mode boot Legacy OSes (writable)
dev_boot_signed_only   = 0                              # Enable developer mode boot only from official kernels (writable)

Reboot to make the changes take effect.

I might try to add the vmx argument soon, just to narrow it down further.

hypevhs commented 6 years ago

Just in case you needed additional proof, I ran sudo change-kernel-flags -f "disablevmx=off", rebooted, and immediately got stuck in the boot loop again. Great work isolating the issue!

If there is a way to load arbitrary kernel versions, git bisect seems like the best way to proceed.

divx118 commented 6 years ago

@libjared Yeah, I looked at the latest kernel commits but couldn't find anything that stood out at first sight. Maybe @drinkcat has an idea why enabling vmx is currently causing a reboot/crash.

divx118 commented 6 years ago

I found this crbug. It seems to be the same issue, since I noticed that they also backported KAISER to 3.18.

hypevhs commented 6 years ago

The bug seems to have been fixed and merged into 4.14. Now I have to wait until they cherry-pick it onto 3.18 and it makes it into a dev-channel update.

hypevhs commented 6 years ago

Dev channel recently updated from

3.18.0-16503-ge33b03ba1f58-dirty

to

3.18.0-16977-gc20001a59640

It's a huge update and mentions "vmx" in git show. I'd love to test it (and the kernel parameter that disables KAISER), but if it doesn't fix anything and I get hit with the boot loop, I'd have to powerwash yet again. How do you recover from the boot loop without wiping local data?
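
The two version strings above follow the usual git-describe pattern, where the part after "-g" is an abbreviated commit hash. A minimal sketch for pulling that hash out so it can be passed to git show or git log; `kernel_commit` is a hypothetical helper, and the sketch assumes the version string always contains a "-g" followed by at least seven hex digits.

```shell
#!/bin/sh
# Hypothetical helper: print the abbreviated commit hash embedded in a
# kernel release string such as the ones reported by uname -r.
# Prints nothing if no "-g<hash>" component is found.
kernel_commit() {
    printf '%s\n' "$1" | sed -n 's/.*-g\([0-9a-f]\{7,\}\).*/\1/p'
}

# The two dev-channel kernels mentioned in this comment.
old=$(kernel_commit "3.18.0-16503-ge33b03ba1f58-dirty")
new=$(kernel_commit "3.18.0-16977-gc20001a59640")

echo "old: $old"
echo "new: $new"
```

With both hashes in hand, a range such as old..new can be inspected in a checkout of the kernel tree, assuming both commits are reachable there.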

divx118 commented 6 years ago

@libjared Well, you can, but you will need to boot a Linux distro on your Chromebook and dd the backup kernel back manually. I'll see if I have some time tomorrow or over the weekend to write a short step-by-step guide.
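
Until such a guide exists, the restore step can be sketched as a dry run that only prints the dd command. Everything here is an assumption to verify on the actual device: `restore_cmd` is a hypothetical helper, /mnt/stateful is a made-up mount point for the ChromeOS stateful partition inside the Linux distro, and /dev/mmcblk0p4 is assumed to be Kernel B. The backup file name is the one make_dev_ssd.sh reported earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the dd command that would write
# a kernel backup image back to a kernel partition.
restore_cmd() {
    backup=$1
    target=$2
    printf 'sudo dd if=%s of=%s\n' "$backup" "$target"
}

# Assumptions, not confirmed in this thread:
#  - the stateful partition is mounted at /mnt/stateful in the Linux distro
#  - Kernel B lives on /dev/mmcblk0p4
restore_cmd /mnt/stateful/backups/kernel_B_20171126_120921.bin /dev/mmcblk0p4
```

Writing to the wrong partition with dd destroys its contents, so check the partition layout and the backup path before running the printed command by hand.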

divx118 commented 6 years ago

I tried it with

chronos@localhost / $ uname -a
Linux localhost 3.18.0-17049-gd99a5b37a7d2 #1 SMP PREEMPT Thu Mar 22 14:17:10 PDT 2018 x86_64 Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz GenuineIntel GNU/Linux

I'm still getting the boot loop with disablevmx=off.