Closed iluminae closed 2 years ago
Can you provide the full logs of the failing boot?
Install appears fine, triggers reboot, boots with no output and reboots again, then boots into an error
The double reboot here is expected. During the first reboot, no logs are available because the console kargs haven't been added yet. Very early, Ignition applies them and immediately reboots again. You can avoid the double reboot by having coreos-installer itself perform the karg modifications (using pxe customize, for example). (It'd make sense to have it automatically apply kargs from the target Ignition config as an optimization. I think this was discussed $somewhere but I can't find it right now; will look for it and otherwise file an RFE. Edit: https://github.com/coreos/coreos-installer/issues/797)
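A rough sketch of what that customization step might look like, using the console karg from this issue. The image file names are placeholders, and the helper only prints the command (a dry run) so it can be checked before running it for real; verify the flags against coreos-installer pxe customize --help for your installer version.

```shell
#!/bin/sh
# Dry-run sketch: print a coreos-installer invocation that bakes a
# destination karg into a customized PXE initramfs.
# customize_cmd is a hypothetical helper; the file names are placeholders.
#   $1 = karg to append on the installed system
#   $2 = input live initramfs image
#   $3 = output (customized) initramfs image
customize_cmd() {
    echo coreos-installer pxe customize \
        --dest-karg-append "$1" \
        -o "$3" "$2"
}

# Print the command for the console karg used in this report.
customize_cmd console=ttyS1,115200n8 \
    fedora-coreos-live-initramfs.x86_64.img \
    custom-initramfs.img
```

Removing the echo runs the command; the resulting image would then be served over PXE in place of the stock initramfs.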
Expected one vendor dir
This error comes from the first boot trying to "bind" the bootloader with the bootfs. The error message here is not super helpful (improvement in https://github.com/coreos/coreos-installer/pull/796), but it's saying that there are multiple directories in the EFI partition. This is not the case in the FCOS EFI partition.
Are there any other disks connected to the systems with a boot filesystem label? E.g. a previous OS installation. Usually, the error message in that condition should've been clearer, so it's possible there's something more subtle going on here.
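For anyone hitting the same error, here is a minimal sketch for listing what the check is probably counting. It assumes the check looks at directories under EFI/ other than BOOT, which is inferred from the output in this thread rather than taken from the coreos-installer source; list_vendor_dirs is a made-up helper name.

```shell
#!/bin/sh
# List directories under <mountpoint>/EFI other than BOOT.
# Hypothetical helper for illustration; not the coreos-installer code.
#   $1 = mount point of the EFI system partition
list_vendor_dirs() {
    for d in "$1"/EFI/*/; do
        # An unmatched glob stays literal; skip anything that is not a directory.
        [ -d "$d" ] || continue
        name=$(basename "$d")
        case "$name" in
            [Bb][Oo][Oo][Tt]) continue ;;  # fallback loader directory
        esac
        echo "$name"
    done
}

# Example, with the EFI system partition mounted at ./mnt:
#   list_vendor_dirs mnt
```

More than one line of output would match the "got 2" in the error message.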
Yea I figured the double boot was not the issue. There are 3 disks I have attached to the system, but I have also tried completely disconnecting the other 2 before boot, same result.
I will try to get the logs, I am on an iDRAC with serial redirection and they look pretty rough.
From the emergency shell in the initrd, I see only the labels made by the installer:
:/root# ls /dev/disk/by-label/ -l
total 0
lrwxrwxrwx 1 root root 10 Mar 8 02:43 EFI-SYSTEM -> ../../sda2
lrwxrwxrwx 1 root root 10 Mar 8 02:43 boot -> ../../sda3
lrwxrwxrwx 1 root root 10 Mar 8 02:43 root -> ../../sda4
:/root# ls /dev/disk/by-partlabel/ -l
total 0
lrwxrwxrwx 1 root root 10 Mar 8 02:43 BIOS-BOOT -> ../../sda1
lrwxrwxrwx 1 root root 10 Mar 8 02:43 EFI-SYSTEM -> ../../sda2
lrwxrwxrwx 1 root root 10 Mar 8 02:43 boot -> ../../sda3
lrwxrwxrwx 1 root root 10 Mar 8 02:43 root -> ../../sda4
We have 3 partitions there related to booting, and the error is referring only to /dev/sda2. I have tried booting the system both with EFI and with BIOS booting, same outcome.
Yea I figured the double boot was not the issue. There are 3 disks I have attached to the system, but I have also tried completely disconnecting the other 2 before boot, same result.
Interesting, thanks for testing that.
From the emergency shell, can you mount /dev/sda2 somewhere and show the output of ls $mnt/EFI?
Here you go:
:/root# find mnt
mnt
mnt/EFI
mnt/EFI/BOOT
mnt/EFI/BOOT/BOOTX64.EFI
mnt/EFI/BOOT/fbx64.efi
mnt/EFI/fedora
mnt/EFI/fedora/BOOTX64.CSV
mnt/EFI/fedora/shim.efi
mnt/EFI/fedora/shimx64.efi
mnt/EFI/fedora/grubx64.efi
mnt/EFI/fedora/mmx64.efi
mnt/EFI/fedora/grub.cfg
mnt/EFI/Dell
mnt/EFI/Dell/BootOptionCache
mnt/EFI/Dell/BootOptionCache/BootOptionCache.dat
Sorry I've had a silly time getting the boot log off the box, I just need to find a USB stick or something.
EDIT:
I checked the same on the 2 servers that did work and yes, they are missing the Dell directory.
# on _working_ boxes
# find mnt/
mnt/
mnt/EFI
mnt/EFI/BOOT
mnt/EFI/BOOT/BOOTX64.EFI
mnt/EFI/BOOT/fbx64.efi
mnt/EFI/fedora
mnt/EFI/fedora/BOOTX64.CSV
mnt/EFI/fedora/shim.efi
mnt/EFI/fedora/shimx64.efi
mnt/EFI/fedora/grubx64.efi
mnt/EFI/fedora/mmx64.efi
mnt/EFI/fedora/grub.cfg
mnt/EFI/fedora/bootuuid.cfg
So - what adds that?
I have tried to just delete that EFI/Dell directory from /dev/sda2 but it comes back every time it reboots. This is not some magic dell thing is it?
mnt/EFI/Dell
mnt/EFI/Dell/BootOptionCache
mnt/EFI/Dell/BootOptionCache/BootOptionCache.dat
Ahh nice. Yup, this is the issue. We currently don't expect that.
This is not some magic dell thing is it?
Information on this is really scarce, so I can't say for sure, but yes it does smell a lot like a magic Dell thing.
Anyway, we should be more lax on our side here given this information, so I'll look at tweaking our heuristics.
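To illustrate what a laxer heuristic could look like, here is a sketch that only counts directories under EFI/ that contain a grub.cfg. This rule is an assumption made for illustration, not necessarily what the eventual patch does; list_grub_vendor_dirs is a made-up name.

```shell
#!/bin/sh
# List directories under <mountpoint>/EFI that contain a grub.cfg.
# Assumed heuristic for illustration only; the real fix may differ.
#   $1 = mount point of the EFI system partition
list_grub_vendor_dirs() {
    for d in "$1"/EFI/*/; do
        # $d ends with a slash, so this tests <dir>/grub.cfg.
        [ -f "${d}grub.cfg" ] || continue
        basename "$d"
    done
}

# Example, with the EFI system partition mounted at ./mnt:
#   list_grub_vendor_dirs mnt
```

With the layout posted above, only fedora would be listed, since neither EFI/BOOT nor EFI/Dell contains a grub.cfg.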
This issue is also happening on Dell PowerEdge R630 and R640.
hey @jlebon any word on when coreos/coreos-installer#802 is getting in? I had to fall back to a different OS for this class of host and I would like to get them all homogenous.
Hi @iluminae, apologies for the delay. We had some CI issues but they should be fixed now. Once the patch is in, I'll make sure it ends up in next week's releases.
The fix for this went into testing stream release 35.20220327.2.0. Please try out the new release and report issues.
The issue is occurring on FC640 and MX740c as well, but I tested the testing release mentioned above on an FC640 and it seems to be fixed.
Thanks for reporting the info and success @log1cb0mb!
The fix for this went into stable stream release 35.20220327.3.0.
Describe the bug
After install via PXE on some host models, encountering:
Error: Expected one vendor dir on /dev/sda2, got 2
after reboot. I have 4 hosts, 2 poweredge r415 and 2 poweredge r210 II. This issue is happening on both r210s, but the same process works on the r415s.
Reproduction steps
Steps to reproduce the behavior:
console=ttyS1,115200n8 coreos.live.rootfs_url=%s coreos.inst.install_dev=%s coreos.inst.ignition_url=%s
Expected behavior
After reboot, I expect it to boot.
Actual behavior
Install appears fine, triggers reboot, boots with no output and reboots again, then boots into an error:
coreos-boot-edit[900]: Error: Expected one vendor dir on /dev/sda2, got 2
System details
Ignition config