OSInside / kiwi

KIWI - Appliance Builder Next Generation
https://osinside.github.io/kiwi
GNU General Public License v3.0

While building in an nspawn container loop device partitions are not found #2293

Open PhilipSnell opened 1 year ago

PhilipSnell commented 1 year ago

Problem description

I am trying to build in an nspawn container, and the build fails because it cannot find the loop device partitions:

[ DEBUG   ]: 22:40:03 | Initialize gpt disk
[ DEBUG   ]: 22:40:03 | EXEC: [sgdisk --zap-all /dev/loop0]
[ INFO    ]: 22:40:04 | --> creating EFI CSM(legacy bios) partition
[ DEBUG   ]: 22:40:04 | EXEC: [sgdisk -n 1:2048:+2M -c 1:p.legacy /dev/loop0]
[ DEBUG   ]: 22:40:05 | EXEC: [sgdisk -t 1:EF02 /dev/loop0]
[ INFO    ]: 22:40:07 | --> creating EFI partition
[ DEBUG   ]: 22:40:07 | EXEC: [sgdisk -n 2:0:+20M -c 2:p.UEFI /dev/loop0]
[ DEBUG   ]: 22:40:08 | EXEC: [sgdisk -t 2:EF00 /dev/loop0]
[ INFO    ]: 22:40:09 | --> creating SWAP partition
[ DEBUG   ]: 22:40:09 | EXEC: [sgdisk -n 3:0:+128M -c 3:p.swap /dev/loop0]
[ DEBUG   ]: 22:40:10 | EXEC: [sgdisk -t 3:8200 /dev/loop0]
[ INFO    ]: 22:40:11 | --> using all_freeMB for the root(rw) partition if present
[ INFO    ]: 22:40:11 | --> creating root partition [with 0 clone(s)]
[ DEBUG   ]: 22:40:11 | EXEC: [sgdisk -n 4:0:0 -c 4:p.lxroot /dev/loop0]
[ DEBUG   ]: 22:40:12 | EXEC: [sgdisk -t 4:8300 /dev/loop0]
[ DEBUG   ]: 22:40:13 | EXEC: [partx --add /dev/loop0]
[ ERROR   ]: 22:40:13 | KiwiMappedDeviceError: Device /dev/loop0p1 does not exist

Expected behaviour

Loop partitions should be created and formatted properly during the build process

Steps to reproduce the behaviour

OS and Software information

ignatenkobrain commented 1 year ago

The same happens for me (via mock/koji/etc.)

DEBUG util.py:535:  Using nspawn with args ['--capability=cap_ipc_lock', '--bind=/tmp/mock-resolv.zoubxl9y:/etc/resolv.conf', '--bind=/dev/btrfs-control', '--bind=/dev/mapper/control', '--bind=/dev/loop-control', '--bind=/dev/loop0', '--bind=/dev/loop1', '--bind=/dev/loop2', '--bind=/dev/loop3', '--bind=/dev/loop4', '--bind=/dev/loop5', '--bind=/dev/loop6', '--bind=/dev/loop7', '--bind=/dev/loop8', '--bind=/dev/loop9', '--bind=/dev/loop10', '--bind=/dev/loop11']
DEBUG util.py:445:  [ INFO    ]: 15:16:00 | Creating raw disk image /builddir/result/image/gdc-c9s-Cloud.x86_64-0.0.2.raw
DEBUG util.py:445:  [ INFO    ]: 15:16:01 | --> using all_freeMB for the root(rw) partition if present
DEBUG util.py:445:  [ INFO    ]: 15:16:01 | --> creating root partition [with 0 clone(s)]
DEBUG util.py:445:  [ INFO    ]: 15:16:01 | --> setting active flag to primary boot partition
DEBUG util.py:445:  [ INFO    ]: 15:16:01 | --> setting start sector to: 2048
DEBUG util.py:443:  [ ERROR   ]: 15:16:01 | KiwiCommandError: partx: stderr: partx: /dev/loop1: error adding partition 1
DEBUG util.py:443:  , stdout: (no output on stdout)
DEBUG util.py:445:  [ INFO    ]: 15:16:01 | Cleaning up LoopDevice instance

ignatenkobrain commented 1 year ago

With `--debug`:

```
DEBUG util.py:445: [ INFO ]: 15:34:06 | --> creating root partition [with 0 clone(s)]
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | p.lxroot: fdisk: n p 1 cur_position +all_freeM w q
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | EXEC: [bash -c cat /var/tmp/kiwi_968emp8u | fdisk /dev/loop1]
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | EXEC: Failed with stderr: Re-reading the partition table failed.: Invalid argument
DEBUG util.py:445: , stdout:
DEBUG util.py:445: Welcome to fdisk (util-linux 2.37.4).
DEBUG util.py:445: Changes will remain in memory only, until you decide to write them.
DEBUG util.py:445: Be careful before using the write command.
DEBUG util.py:445: Device does not contain a recognized partition table.
DEBUG util.py:445: Created a new DOS disklabel with disk identifier 0x0c829e3d.
DEBUG util.py:445: Command (m for help): Partition type
DEBUG util.py:445: p primary (0 primary, 0 extended, 4 free)
DEBUG util.py:445: e extended (container for logical partitions)
DEBUG util.py:445: Select (default p): Partition number (1-4, default 1): First sector (2048-2566143, default 2048): Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-2566143, default 2566143):
DEBUG util.py:445: Created a new partition 1 of type 'Linux' and of size 1.2 GiB.
DEBUG util.py:445: Command (m for help): The partition table has been altered.
DEBUG util.py:445: Calling ioctl() to re-read partition table.
DEBUG util.py:445: The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or partx(8).
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | potential fdisk errors were ignored
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | EXEC: [sfdisk -c /dev/loop1 1 83]
DEBUG util.py:445: [ INFO ]: 15:34:06 | --> setting active flag to primary boot partition
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | EXEC: [parted /dev/loop1 set 1 boot on]
DEBUG util.py:445: [ INFO ]: 15:34:06 | --> setting start sector to: 2048
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | EXEC: [lsblk -r -o NAME,TYPE /dev/loop1]
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | fdisk: d 1 n p 1 2048 w q
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | EXEC: [bash -c cat /var/tmp/kiwi_ftq9desi | fdisk /dev/loop1]
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | EXEC: Failed with stderr: 1: unknown command
DEBUG util.py:445: Re-reading the partition table failed.: Invalid argument
DEBUG util.py:445: , stdout:
DEBUG util.py:445: Welcome to fdisk (util-linux 2.37.4).
DEBUG util.py:445: Changes will remain in memory only, until you decide to write them.
DEBUG util.py:445: Be careful before using the write command.
DEBUG util.py:445: Command (m for help): Selected partition 1
DEBUG util.py:445: Partition 1 has been deleted.
DEBUG util.py:445: Command (m for help):
DEBUG util.py:445: Command (m for help): Partition type
DEBUG util.py:445: p primary (0 primary, 0 extended, 4 free)
DEBUG util.py:445: e extended (container for logical partitions)
DEBUG util.py:445: Select (default p): Partition number (1-4, default 1): First sector (2048-2566143, default 2048): Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-2566143, default 2566143):
DEBUG util.py:445: Created a new partition 1 of type 'Linux' and of size 1.2 GiB.
DEBUG util.py:445: Command (m for help): The partition table has been altered.
DEBUG util.py:445: Calling ioctl() to re-read partition table.
DEBUG util.py:445: The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or partx(8).
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | potential fdisk errors were ignored
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | EXEC: [partx --add /dev/loop1]
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | EXEC: Failed with stderr: partx: /dev/loop1: error adding partition 1
DEBUG util.py:445: , stdout: (no output on stdout)
DEBUG util.py:443: [ ERROR ]: 15:34:06 | KiwiCommandError: partx: stderr: partx: /dev/loop1: error adding partition 1
DEBUG util.py:443: , stdout: (no output on stdout)
DEBUG util.py:445: [ INFO ]: 15:34:06 | Cleaning up LoopDevice instance
DEBUG util.py:445: [ DEBUG ]: 15:34:06 | EXEC: [losetup -d /dev/loop1]
DEBUG util.py:445: [ INFO ]: 15:34:06 | Cleaning up BootImageDracut instance
DEBUG util.py:445: [ INFO ]: 15:34:06 | Cleaning up BootLoaderConfigGrub2 instance
```

ignatenkobrain commented 1 year ago

I've tried to add --verbose to partx:

[ DEBUG   ]: 16:58:36 | EXEC: [partx -v --add /dev/loop0]
[ DEBUG   ]: 16:58:36 | EXEC: Failed with stderr: partx: /dev/loop0: adding partition #1 failed: Device or resource busy
partx: /dev/loop0: error adding partition 1
, stdout: partition: none, disk: /dev/loop0, lower: 0, upper: 0
/dev/loop0: partition table type 'dos' detected
range recount: max partno=1, lower=0, upper=0

ignatenkobrain commented 1 year ago

Trying it with strace:

ioctl(3, BLKPG, {op=BLKPG_ADD_PARTITION, flags=0, datalen=152, data={start=1048576, length=1142947840, pno=1, devname="", volname=""}}) = -1 EBUSY (Device or resource busy)

ignatenkobrain commented 1 year ago

Running sudo partx -d /dev/loop0p1 from the host, and then running the relevant partx --add from the container, does the trick…

ignatenkobrain commented 1 year ago

[ INFO    ]: 17:23:30 | --> setting active flag to primary boot partition
[ DEBUG   ]: 17:23:30 | EXEC: [parted /dev/loop0 set 1 boot on]

This command makes /dev/loop0p1 appear on the host system, but not inside nspawn…
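The observations so far (EBUSY from BLKPG_ADD_PARTITION, the node showing up on the host only) would be consistent with the kernel already having the partition registered while the container's /dev simply lacks the node. A minimal Python sketch to check that from inside the container follows. The function names are hypothetical, and it assumes that sysfs under /sys/block is visible inside nspawn and that each partition appears there as a subdirectory containing a `partition` file.

```python
import os


def kernel_partitions(disk, sysfs_root="/sys/block"):
    """Partition names the kernel has registered for `disk` (e.g. 'loop0')."""
    disk_dir = os.path.join(sysfs_root, disk)
    if not os.path.isdir(disk_dir):
        return []
    return sorted(
        entry for entry in os.listdir(disk_dir)
        # A partition shows up as a subdirectory containing a 'partition' file.
        if os.path.isfile(os.path.join(disk_dir, entry, "partition"))
    )


def missing_device_nodes(disk, sysfs_root="/sys/block", dev_root="/dev"):
    """Partitions known to the kernel that have no node under dev_root."""
    return [
        name for name in kernel_partitions(disk, sysfs_root)
        if not os.path.exists(os.path.join(dev_root, name))
    ]


if __name__ == "__main__":
    disk = "loop0"  # assumption: the loop device used in the logs above
    print("kernel knows:", kernel_partitions(disk))
    print("no node in /dev:", missing_device_nodes(disk))
```

If the second list is non-empty inside the container but empty on the host, that would point at device node creation rather than at partx itself.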

ignatenkobrain commented 1 year ago

Would it make sense to switch from loop devices and all this mumbo-jumbo to nbdkit? I'm not sure exactly how, but if I understand it correctly, it should have similar capabilities and would not cause the issues we currently have in this ticket.

schaefi commented 1 year ago

To me this sounds more like a partx issue: the container is not running udev, so the device nodes are never created. I'm just guessing, though. Switching to the older device-mapper-based kpartx should, in my opinion, solve your problem. You can try this by making sure your container provides a kiwi config file /etc/kiwi.yml with the following setting:

mapper:
  - part_mapper: kpartx

kpartx creates the devices in /dev/mapper and doesn't need extra services or events to do the job

maybe worth a try
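One way to drop that setting into a build container is sketched below in Python. The helper name `write_kiwi_config` is hypothetical; the path and YAML content are taken from the comment above. The sketch assumes that kpartx itself is installed in the container and refuses to overwrite an existing config unless asked to.

```python
from pathlib import Path

# Content taken from the comment above; kiwi reads it from /etc/kiwi.yml.
KIWI_MAPPER_CONFIG = "mapper:\n  - part_mapper: kpartx\n"


def write_kiwi_config(path="/etc/kiwi.yml", overwrite=False):
    """Write the kpartx mapper setting; return True if the file was written."""
    target = Path(path)
    if target.exists() and not overwrite:
        # Do not clobber an existing config; merge the setting by hand instead.
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(KIWI_MAPPER_CONFIG)
    return True
```

Calling `write_kiwi_config()` inside the container before starting the build would be enough for a first test; an existing /etc/kiwi.yml is left untouched.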

ignatenkobrain commented 1 year ago

@schaefi with kpartx, there are issues inside of systemd-nspawn too (but a different set of them)...

schaefi commented 1 year ago

> @schaefi with kpartx, there are issues inside of systemd-nspawn too (but a different set of them)...

Hmm, OK, so you think nbdkit would solve this? I haven't looked into it, and sorry for the late reply. I'm afraid I won't have a free slot to work on this topic.

Conan-Kudo commented 11 months ago

@rwmjones is the main developer of nbdkit, and he's written some blog posts on it that are quite helpful in understanding it. It seems one of the bigger advantages of nbdkit would be to be able to do everything unprivileged. We might want to also explore ublk for similar reasons...

That said, it seems like the issue has to do with what happens with automatically enumerated partitions from a loop device, which is a kernel thing, not a partx thing.
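Whether the kernel scans a given loop device for partitions on its own can be read from sysfs. The Python sketch below reads the loop driver's `partscan` attribute; the function name is hypothetical, and it assumes the attribute is exposed at /sys/block/<disk>/loop/partscan, which is only present while the loop device has a backing file attached.

```python
from pathlib import Path


def loop_partscan_enabled(disk, sysfs_root="/sys/block"):
    """Return True/False from the loop driver's partscan attribute, or None if absent."""
    attr = Path(sysfs_root) / disk / "loop" / "partscan"
    try:
        # The attribute contains "1" when partition scanning is enabled.
        return attr.read_text().strip() == "1"
    except OSError:
        # Device not set up, or sysfs not visible from here.
        return None
```

A result of `True` for the device kiwi is using would fit the theory that the partitions are enumerated by the kernel before partx --add runs.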

rwmjones commented 11 months ago

ublk unfortunately won't let you do stuff unprivileged.

nbdkit will let you create a disk image with a partition, but I'm not sure it really solves what you're trying to do here. Nevertheless these plugins could be interesting: https://libguestfs.org/nbdkit-linuxdisk-plugin.1.html https://libguestfs.org/nbdkit-floppy-plugin.1.html

libguestfs will run up a qemu instance and let you run commands inside, which is all unprivileged from the point of view of the host.

Conan-Kudo commented 10 months ago

It looks like loopfs would help, based on this LWN article, but I can't figure out what happened to it.

Conan-Kudo commented 10 months ago

@brauner, I can't seem to figure out what happened to loopfs, could you tell us what happened to it? It would be tremendously useful for us if we had it...

brauner commented 10 months ago

So loopfs as a concept is mostly dead because the block people didn't like it being a filesystem. And I think that they're right. I have a plan to implement something that isn't tied to a filesystem at all. So basically a purely fd based API possibly.

itoffshore commented 9 months ago

mapper:
  - part_mapper: kpartx

tumble-dev

Conan-Kudo commented 3 months ago

FYI @keszybz, this is the kiwi/nspawn issue we discussed at Flock.