CCI-MOC / esi

Elastic Secure Infrastructure project

MOC-R8PAC23U31 will no longer provision any image other than rhelai #629

Closed tzumainn closed 1 month ago

tzumainn commented 1 month ago

After provisioning this node with moc-rhelai-nvidia-1.1, it will no longer successfully provision any other image. The console shows messages like "couldn't find an EFI compatible partition on /dev/sda" before giving up. However, using the rhelai image still works.

tzumainn commented 1 month ago

MOC-R8PAC23U26 and MOC-R8PAC23U40 also have this behavior; I think I may have tested the latter with a rhelai image, but I am not positive.

larsks commented 1 month ago

@tzumainn it looks like the moc-rhelai-nvidia-1.1 image includes an EFI system partition, while most of the other images do not.


Here's how I am inspecting the images:

  1. Download the image:

    openstack image save --file moc-rhelai-nvidia-1.1.img moc-rhelai-nvidia-1.1

  2. These are mostly qcow2 images, so we can't operate on them directly. We need to present them as block devices, which we can do by attaching them to an nbd device. You may first need to load the nbd module:

    modprobe nbd

    Then attach the image:

    qemu-nbd -c /dev/nbd0 moc-rhelai-nvidia-1.1.img

    We now have a block device /dev/nbd0 and associated partition devices, like /dev/nbd0p1.

  3. We can inspect the partition map of these images using fdisk:

    fdisk -l /dev/nbd0

    Which for the moc-rhelai-nvidia-1.1.img image produces:

    Disk /dev/nbd0: 54.54 GiB, 58566115328 bytes, 114386944 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: D209C89E-EA5E-4FBD-B161-B461CCE297E0
    
    Device        Start       End   Sectors  Size Type
    /dev/nbd0p1    2048      4095      2048    1M BIOS boot
    /dev/nbd0p2    4096   1030143   1026048  501M EFI System
    /dev/nbd0p3 1030144   3127295   2097152    1G Linux filesystem
    /dev/nbd0p4 3127296 114386910 111259615 53.1G Linux filesystem

  4. When you're done, detach the image:

    qemu-nbd -d /dev/nbd0
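
The inspection steps above can be collected into one small script. This is a minimal sketch, assuming it runs as root, that /dev/nbd0 is free, and that the image has already been saved locally with `openstack image save`. The names `run`, `inspect_image`, and `DRY_RUN` are illustrative and do not come from the thread.

```shell
#!/bin/sh
# Sketch: list the partition table of a downloaded image via qemu-nbd.
# Assumptions (not from the thread): run as root, the nbd device is free,
# and qemu-nbd and fdisk are installed.

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

inspect_image() {
    image=$1
    dev=${2:-/dev/nbd0}   # hypothetical default; pass another device if busy

    run modprobe nbd || return 1
    run qemu-nbd -c "$dev" "$image" || return 1
    run fdisk -l "$dev"
    status=$?
    # Always detach, even if fdisk failed.
    run qemu-nbd -d "$dev"
    return $status
}
```

Calling `DRY_RUN=1 inspect_image moc-rhelai-nvidia-1.1.img` prints the four commands without touching the system; dropping `DRY_RUN=1` runs them for real. Detaching on every path avoids leaving the image bound to the nbd device if fdisk fails.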

tzumainn commented 1 month ago

Looks like we were just using images that are incompatible with UEFI boot, as an older centos-image worked. Closing this; I'll be opening issues for the ESI team about uploading new ubuntu and centos9-stream images.
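
Whether an image carries an EFI system partition can be checked from the same `fdisk -l` listing shown earlier in the thread. The following is a minimal sketch, assuming a GPT disk label, where fdisk prints the partition type as "EFI System". The function name `has_esp` is illustrative. An EFI system partition alone does not guarantee that the image boots under UEFI, so treat this only as a first filter.

```shell
# Sketch: succeed if `fdisk -l` output on stdin lists an EFI System partition.
# Only matches the type text fdisk prints for GPT disks (an assumption);
# MBR-labelled images would need a different pattern.
has_esp() {
    grep -q 'EFI System'
}
```

For example, with an image attached to /dev/nbd0, `fdisk -l /dev/nbd0 | has_esp && echo "ESP present"` prints the message only when such a partition is listed.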