bedrocklinux / bedrocklinux-userland

This tracks development of things such as scripts and (defaults for) config files for Bedrock Linux
https://bedrocklinux.org
GNU General Public License v2.0

strat: unable chroot() to /bedrock/strata/arch-arm #268

Open firejoust opened 2 years ago

firejoust commented 2 years ago

Hi, I have installed Bedrock Linux onto my Nintendo Switch running Ubuntu 18.04.6. I ran the aarch64 script, changed some switchroot bootloader configuration, and was eventually able to boot into Bedrock without any errors.

The problem arises after executing brl strat arch-arm bash. I get the following error: strat: unable chroot() to /bedrock/strata/arch-arm

This also terminates /bedrock/run/profile after boot, leaving the bedrock stratum in a "broken" state until brl repair bedrock is executed. That changes it back to "enabled" in brl status, but throws ERROR: unexpected error occured (debug log here: https://pastebin.com/AidtX1uq)

I can run commands located in the arch-arm rootfs using chroot /bedrock/strata/arch-arm/ bash, but commands such as su will instead return strat: unable chroot() to /bedrock/strata/ubuntu.

Any help would be greatly appreciated!

paradigm commented 2 years ago

> Hi, I have installed bedrock Linux onto my Nintendo Switch running Ubuntu 18.04.6. I ran the aarch64 script, changed some switchroot bootloader configuration and was eventually able to boot into bedrock without any errors.

Very cool!

> The problem arises after executing brl strat arch-arm bash. I get the following error: strat: unable chroot() to /bedrock/strata/arch-arm

I've never seen that error actually trigger before, which is probably why the typo that dropped the word "to" from that error message has been able to stick around for so long.

> This also terminates /bedrock/run/profile after boot, leaving the bedrock strata in a "broken" state until brl repair bedrock is executed. This changes it back to "enabled" on brl status, but throws ERROR: unexpected error occured (debug log here: https://pastebin.com/AidtX1uq)

This broadly makes sense. strat is called under the hood quite a bit; if it is broken, other things will be as well. From the debug log, brl status is bailing because of this line:

+ /bedrock/bin/strat -r arch-arm /bin/sh -c '. /etc/profile ; env'

Various distro packages add things like $PATH entries via /etc/profile.d/* files. This bit of Bedrock code is launching a shell in each stratum to try and collect those entries so it can ensure the resulting actual shell will have them set properly. Since strat arch-arm is broken, so is this. I think if we solve the first issue, this will also be resolved.
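To make the step above concrete, here is a minimal Python sketch of the idea: run the same command line that appears in the debug log inside a stratum and parse the resulting `env` output. This is an illustration only, not Bedrock's actual implementation; the function names `parse_env` and `collect_profile_env` are hypothetical, and the parser assumes one NAME=value pair per line (multi-line values are not handled).

```python
import subprocess


def parse_env(output):
    """Parse the text printed by `env` into a dict (one NAME=value per line)."""
    env = {}
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:  # skip lines that are not NAME=value
            env[name] = value
    return env


def collect_profile_env(stratum, strat="/bedrock/bin/strat"):
    """Source /etc/profile inside `stratum` via strat and return its environment.

    Returns None when strat cannot be run or exits non-zero, which is the
    situation described in this issue.
    """
    cmd = [strat, "-r", stratum, "/bin/sh", "-c", ". /etc/profile ; env"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except OSError:
        return None
    if result.returncode != 0:
        return None
    return parse_env(result.stdout)
```

With strat failing for a stratum, a collector like this has nothing to return for it, which is consistent with brl status bailing at that line.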

FWIW, I plan to completely rework this subsystem in Bedrock Linux 0.8 Naga so that it'll be both faster and more robust to this kind of failure mode.

> I can run commands located in the arch-arm rootfs using chroot /bedrock/strata/arch-arm/ bash, but commands such as su will instead return strat: unable chroot() to /bedrock/strata/ubuntu.

While strat does much more than this, the actual error message you provided occurs when it tries to do the rough equivalent of chroot /bedrock/strata/arch-arm. The fact that chroot works but strat doesn't makes this a very weird issue.

su failing like that makes sense. The arch-arm stratum lacks su locally, and so the Bedrock infrastructure tries to automatically launch another stratum's instance (in this case ubuntu's) via strat. Since strat is failing, this failing as well isn't surprising.

firejoust commented 2 years ago

Thanks for responding. This issue happens both with and without root. Here is the strace log: log.txt

firejoust commented 2 years ago

14T13:01:49.301999990+1000 */, st_ctime_nsec=301999990}, AT_SYMLINK_NOFOLLOW) = 0
19866 13:10:22 newfstatat(AT_FDCWD, "/bedrock", {st_dev=makedev(179, 2), st_ino=6967297, st_mode=S_IFDIR|0755, st_nlink=12, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=4096, st_atime=1652326374 /* 2022-05-12T13:32:54.173343643+1000 */, st_atime_nsec=173343643, st_mtime=1652251624 /* 2022-05-11T16:47:04.711999976+1000 */, st_mtime_nsec=711999976, st_ctime=1652497295 /* 2022-05-14T13:01:35.575999995+1000 */, st_ctime_nsec=575999995}, AT_SYMLINK_NOFOLLOW) = 0
19866 13:10:22 chdir("/")               = 0
19866 13:10:22 chroot("/bedrock")       = 0
19866 13:10:22 chdir("..")              = 0
19866 13:10:22 newfstatat(AT_FDCWD, ".", {st_dev=makedev(0, 1), st_ino=1, st_mode=S_IFDIR|0755, st_nlink=14, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=0, st_size=0, st_atime=1622885558 /* 2021-06-05T19:32:38+1000 */, st_atime_nsec=0, st_mtime=1622885558 /* 2021-06-05T19:32:38+1000 */, st_mtime_nsec=0, st_ctime=1 /* 1970-01-01T10:00:01.536999999+1000 */, st_ctime_nsec=536999999}, AT_SYMLINK_NOFOLLOW) = 0
19866 13:10:22 newfstatat(AT_FDCWD, "..", {st_dev=makedev(0, 1), st_ino=1, st_mode=S_IFDIR|0755, st_nlink=14, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=0, st_size=0, st_atime=1622885558 /* 2021-06-05T19:32:38+1000 */, st_atime_nsec=0, st_mtime=1622885558 /* 2021-06-05T19:32:38+1000 */, st_mtime_nsec=0, st_ctime=1 /* 1970-01-01T10:00:01.536999999+1000 */, st_ctime_nsec=536999999}, AT_SYMLINK_NOFOLLOW) = 0
19866 13:10:22 chroot(".")              = 0
19866 13:10:22 newfstatat(AT_FDCWD, "/", {st_dev=makedev(0, 1), st_ino=1, st_mode=S_IFDIR|0755, st_nlink=14, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=0, st_size=0, st_atime=1622885558 /* 2021-06-05T19:32:38+1000 */, st_atime_nsec=0, st_mtime=1622885558 /* 2021-06-05T19:32:38+1000 */, st_mtime_nsec=0, st_ctime=1 /* 1970-01-01T10:00:01.536999999+1000 */, st_ctime_nsec=536999999}, 0) = 0

Not sure if this is helpful, but the final line reports st_ino=1 for /, whereas the inode of my root directory is 6967440. Here is the output of stat /:

  File: /
  Size: 4096        Blocks: 16         IO Block: 4096   directory
Device: b302h/45826d    Inode: 6967440     Links: 23
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2022-05-11 17:03:52.620551274 +1000
Modify: 2022-05-11 16:47:05.966999976 +1000
Change: 2022-05-11 16:47:05.966999976 +1000
 Birth: -
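
The comparison above can also be done mechanically. The following Python sketch pulls the device and inode numbers out of a strace `newfstatat` line and decodes the hexadecimal device number that `stat` prints, so the two can be compared directly. The helper names are hypothetical, the regular expression only covers the line shape seen in this log, and the device-number split assumes the Linux/glibc encoding.

```python
import re

# Matches the start of a strace newfstatat() line as it appears in the log above.
_STAT_RE = re.compile(
    r'newfstatat\(AT_FDCWD, "(?P<path>[^"]+)", '
    r'\{st_dev=makedev\((?P<major>\d+), (?P<minor>\d+)\), st_ino=(?P<ino>\d+)'
)


def parse_newfstatat(line):
    """Return (path, (major, minor), inode) from a strace line, or None."""
    m = _STAT_RE.search(line)
    if m is None:
        return None
    return m["path"], (int(m["major"]), int(m["minor"])), int(m["ino"])


def split_dev(dev):
    """Split a Linux dev_t value into (major, minor), following glibc's layout."""
    major = ((dev >> 32) & 0xFFFFF000) | ((dev >> 8) & 0xFFF)
    minor = ((dev >> 12) & 0xFFFFFF00) | (dev & 0xFF)
    return major, minor
```

Applied to this thread: `split_dev(0xb302)` gives (179, 2), which matches the makedev(179, 2) reported for /bedrock in the strace log, while the / seen after the final chroot(".") is on device (0, 1) with inode 1.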

EDIT: inode 1 points to my FAT32 partition, /dev/mmcblk0p1, which holds the binaries required for booting from the Switch's RCM as well as the initramfs (tree below). However, my root is on a secondary ext4 partition, /dev/mmcblk0p2.

/media/mezza/ADB5-5FB9
├── bootloader
│   ├── hekate_ipl.ini
│   ├── ini
│   │   └── L4T-bionic.ini
│   ├── nyx.ini
│   ├── payloads
│   ├── res
│   │   ├── icon_payload.bmp
│   │   └── icon_switch.bmp
│   ├── sys
│   │   ├── emummc.kipm
│   │   ├── libsys_lp0.bso
│   │   ├── libsys_minerva.bso
│   │   ├── nyx.bin
│   │   ├── res.pak
│   │   └── thk.bin
│   └── update.bin
└── switchroot
    ├── install
    │   ├── l4t.00
    │   └── l4t.01
    └── ubuntu
        ├── bootlogo_ubuntu.bmp
        ├── boot.scr
        ├── coreboot.rom
        ├── icon_ubuntu_hue.bmp
        ├── Image
        ├── initramfs
        ├── overlays
        │   ├── nfs.txt
        │   ├── tegra210-icosa_emmc-overlay.dtbo
        │   └── tegra210-icosa-UART-B-overlay.dtbo
        ├── tegra210-icosa.dtb
        ├── uenv_readme.txt
        └── uenv.txt
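
For readers following the strace excerpt earlier in this comment: the chdir("/"), chroot("/bedrock"), chdir(".."), chroot(".") sequence resembles the well-known idiom for climbing from inside a chroot to the top of the filesystem tree. The Python sketch below reproduces that sequence as an illustration only. It is not strat's source code; the function name and the injectable `chdir`/`chroot`/`stat` parameters are hypothetical (they exist so the call order can be checked without root), and the loop that climbs until "." and ".." are the same directory is an assumption based on the two newfstatat calls in the log.

```python
import os


def escape_to_real_root(pivot="/bedrock", chdir=os.chdir, chroot=os.chroot,
                        stat=os.stat, max_hops=64):
    """Climb to the top of the filesystem tree and chroot there.

    Returns (st_dev, st_ino) of the resulting "/". Needs root when used
    with the real os functions.
    """
    chdir("/")
    chroot(pivot)  # the working directory is now outside the new root
    for _ in range(max_hops):
        chdir("..")
        here, parent = stat("."), stat("..")
        # At the top of the tree, "." and ".." are the same directory.
        if (here.st_dev, here.st_ino) == (parent.st_dev, parent.st_ino):
            break
    else:
        raise RuntimeError("did not reach the top of the filesystem tree")
    chroot(".")
    root = stat("/")
    return root.st_dev, root.st_ino
```

In the log, the directory reached this way reports st_dev=makedev(0, 1) and st_ino=1, while /bedrock itself is on makedev(179, 2), so the climb ended on a different filesystem than the one holding /bedrock.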

paradigm commented 2 years ago

The way 0.7 Poki normally works:

(For anyone from the future, note just about all of this is expected to change in 0.8 Naga)

My only guess is your effort to get Bedrock on the Switch resulted in breaking expectations around the first two bullet points above. Maybe bedrock isn't on the root of the offline filesystem tree, or maybe you tweaked Bedrock's boot stuff such that it doesn't do the pivot_root to the selected init stratum. If one of those is the case, hopefully the above explanation is sufficient for you to see how to rework things closer to expectations. If neither of those guesses is the case, I'm at a loss on how to debug this remotely. You may have to poke a bit more on your end to uncover another lead.

If we don't figure this out and you throw in the towel, do note 0.8 plans include (re)introducing a non-hijack option that may be easier to get onto hardware that isn't amenable to hijacking. Consider revisiting this effort then.

firejoust commented 2 years ago

This was an issue with the initramfs; it seems to be working perfectly fine now after this commit: https://gitlab.azka.li/l4t-community/kernel/l4t-initramfs/-/commit/f4b8c60972bf8f9b5b25748a4cb98e29e0d20d86

I appreciate your help. I'll let you know if anything else goes wrong in the meantime!

paradigm commented 2 years ago

Nice, happy to hear it's now working :)