siderolabs / image-factory

A service to generate Talos boot assets
Mozilla Public License 2.0

RAW image build fails #88

Closed garzdin-schwarz closed 8 months ago

garzdin-schwarz commented 8 months ago

The image-factory builds ISOs fine, but requesting a different image type, such as RAW or QCOW2, fails with the following error: failed to setup loopback device: exit status 1: losetup: cannot find an unused loop device\n

When looking at the logs of the image-factory I see:

{"level":"info","ts":1707388772.0324736,"caller":"asset/asset.go:231","msg":"building image asset","component":"asset-builder","profile":{"BaseProfileName":"","Arch":"amd64","Platform":"metal","Board":"","SecureBoot":null,"Version":"v1.6.4","Customization":{"ExtraKernelArgs":null,"MetaContents":[{"Key":10,"Value":"{}"}]},"Input":{"Kernel":{"Path":""},"Initramfs":{"Path":""},"SDStub":{"Path":""},"SDBoot":{"Path":""},"DTB":{"Path":""},"UBoot":{"Path":""},"RPiFirmware":{"Path":""},"BaseInstaller":{"ImageRef":"","ForceInsecure":false,"TarballPath":"","OCIPath":""},"SecureBoot":null,"SystemExtensions":[{"ImageRef":"","ForceInsecure":false,"TarballPath":"/tmp/image-factory1959283312/schematics/de3de71abf5a50551986885ddf49ce25e86459a502da360255704ecffdec2e35.tar","OCIPath":""}]},"Output":{"Kind":"image","ImageOptions":{"DiskSize":1306525696,"DiskFormat":"qcow2","DiskFormatOptions":""},"OutFormat":"raw"}},"version":"1.6.4","concurrency_latency":0.000001324}
profile ready:
arch: amd64
platform: metal
secureboot: null
version: v1.6.4
customization:
  metaContents:
    - key: 10
      value: '{}'
input:
  kernel:
    path: /tmp/image-factory1959283312/v1.6.4/amd64/vmlinuz
  initramfs:
    path: /tmp/image-factory1959283312/v1.6.4/amd64/initramfs.xz
  baseInstaller:
    imageRef: ghcr.io/siderolabs/installer:v1.6.4
  systemExtensions:
    - imageRef: ""
      tarballPath: /tmp/image-factory1959283312/schematics/de3de71abf5a50551986885ddf49ce25e86459a502da360255704ecffdec2e35.tar
output:
  kind: image
  imageOptions:
    diskSize: 1306525696
    diskFormat: qcow2
  outFormat: raw
rebuilding initramfs with system extensions...
    copying /tmp/image-factory1959283312/v1.6.4/amd64/initramfs.xz to /tmp/imager4191863863/initramfs.xz
rebuilding initramfs with system extensions...
    extracting /tmp/image-factory1959283312/schematics/de3de71abf5a50551986885ddf49ce25e86459a502da360255704ecffdec2e35.tar...
rebuilding initramfs with system extensions...
    discovered system extensions:
rebuilding initramfs with system extensions...
    NAME        VERSION                                                            AUTHOR
rebuilding initramfs with system extensions...
    schematic   de3de71abf5a50551986885ddf49ce25e86459a502da360255704ecffdec2e35   Image Factory
rebuilding initramfs with system extensions...
    validating system extensions
rebuilding initramfs with system extensions...
    compressing system extensions
rebuilding initramfs with system extensions...
    creating system extensions initramfs archive and compressing it
initramfs ready
kernel command line: talos.platform=metal console=ttyS0 console=tty0 init_on_alloc=1 slab_nomerge pti=on consoleblank=0 nvme_core.io_timeout=4294967295 printk.devkmsg=on ima_template=ima-ng ima_appraise=fix ima_hash=sha512
creating disk image...
    creating raw disk of size 1.3 GB
creating disk image...
    attaching loopback device
{"level":"info","ts":1707388772.1857173,"caller":"http/http.go:163","msg":"request","frontend":"http","method":"GET","path":"/image/de3de71abf5a50551986885ddf49ce25e86459a502da360255704ecffdec2e35/v1.6.4/metal-amd64.qcow2","error":"error generating asset: failed to setup loopback device: exit status 1: losetup: cannot find an unused loop device\n"}

Executing modprobe loop inside of the running image-factory container errors out with:

image-factory-6ddd6584-57bf8:/# modprobe loop
modprobe: FATAL: Module loop not found in directory /lib/modules/6.1.73-flatcar

smira commented 8 months ago

This is not a bug; it's a problem with the way you run Image Factory, which the issue doesn't describe.

Image Factory currently needs access to loop devices, so if you don't give it that access, it won't work.

Running modprobe from inside a container makes zero sense; run it on the host.

Make sure losetup works on the host first, then make sure the container is privileged and has access to /dev.

In Kubernetes this would be:

        securityContext:
          privileged: true
        # expose the host's /dev inside the container
        volumeMounts:
        - name: host-dev
          mountPath: /dev
      volumes:
      - name: host-dev
        hostPath:
          path: /dev
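
The host-first check suggested above can be sketched in shell. This is a minimal illustration, not part of Image Factory: `check_loop` is a hypothetical helper, and it assumes that a missing `loop-control` node under the given /dev directory means the loop driver is not available there.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not Image Factory tooling):
# report whether the loop control node exists under a given /dev directory.
check_loop() {
    dev_dir="${1:-/dev}"
    if [ ! -e "$dev_dir/loop-control" ]; then
        echo "missing: $dev_dir/loop-control (try 'modprobe loop' on the host)"
        return 1
    fi
    echo "found: $dev_dir/loop-control"
}

# Typical use, first on the host and then inside the container (as root):
#   check_loop /dev && losetup -f   # losetup -f prints the first unused loop device
```

If the check passes on the host but fails inside the container, the container most likely lacks the privileged setting or the /dev mount.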

garzdin-schwarz commented 8 months ago

@smira That's fine, but it's not documented anywhere. Also it doesn't work at https://factory.talos.dev :)

[Screenshot 2024-02-08 at 13:33:05]

smira commented 8 months ago

Image Factory works; there's no .qcow2.tar.gz format, and qcow2 is not directly available for metal (but that limitation is not in Image Factory itself).

garzdin-schwarz commented 8 months ago

So is it the imager then? I'm not sure I understand, because after looking at the code, qcow2 images are available, and I've successfully converted a raw metal image from the image-factory to qcow2 with qemu-img, which the imager also uses under the hood. The converted image boots fine.

As far as I understand, the .tar.gz extension is supported, as it only compresses the output image. But even without it this should work.
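
The download-and-convert workaround described above can be sketched as follows. The helper names are hypothetical, the URL layout is taken from the request path in the log earlier in this issue, and the `metal-amd64.raw.xz` asset name is an assumption that should be checked against the Image Factory documentation.

```shell
#!/bin/sh
# Hypothetical helpers; names and defaults are assumptions, not Image Factory tooling.

# Build the download URL for an Image Factory asset:
#   <base>/image/<schematic>/<version>/<file>
image_url() {
    schematic="$1"; version="$2"; file="$3"
    base="${4:-https://factory.talos.dev}"
    printf '%s/image/%s/%s/%s\n' "$base" "$schematic" "$version" "$file"
}

# Derive the QCOW2 file name from a raw image name: foo.raw -> foo.qcow2
qcow2_name() {
    printf '%s.qcow2\n' "${1%.raw}"
}

# Example flow (requires curl, xz and qemu-img; SCHEMATIC is your schematic ID):
#   curl -fLO "$(image_url "$SCHEMATIC" v1.6.4 metal-amd64.raw.xz)"
#   xz -d metal-amd64.raw.xz
#   qemu-img convert -f raw -O qcow2 metal-amd64.raw "$(qcow2_name metal-amd64.raw)"
```

The conversion runs entirely on the client, so it needs no loop devices and no privileged container.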

smira commented 8 months ago

metal is in fact a standard imager profile defined in Talos, but it doesn't configure QCOW2 settings, so it won't generate a QCOW2 image. You can still download the raw image and convert it, but if QCOW2 needs to be supported, the support should start with Talos.

smira commented 8 months ago

So we should rather pin down what you're looking for. For example, if you want qcow2 generated on the fly, create a feature request for it.

Or, if you don't know how to deploy Image Factory, create a request for that and start by sharing your steps, Helm charts, etc.

That would be more helpful.

garzdin-schwarz commented 8 months ago

> metal is in fact a standard imager profile defined in Talos, but it doesn't configure QCOW2 settings, so it won't generate a QCOW2 image. You can still download the raw image and convert it, but if QCOW2 needs to be supported, the support should start with Talos.

What do you mean by saying that the support should start with Talos? There are already QCOW2-formatted images for Exoscale and Oracle. For the metal platform it would depend on which hypervisor you run it on. In my case I'm running it on QEMU, so QCOW2 images are fully supported.

To me it seems like this is only an artificial limitation of the imager. I tried supplying a custom profile to it:

arch: amd64
platform: metal
secureboot: false
version: v1.6.1
input:
  kernel:
    path: /usr/install/amd64/vmlinuz
  initramfs:
    path: /usr/install/amd64/initramfs.xz
  baseInstaller:
    imageRef: ghcr.io/siderolabs/installer:v1.6.1
output:
  kind: image
  imageOptions:
    diskSize: 1306525696
    diskFormat: qcow2
  outFormat: .tar.gz

But that didn't output a valid QCOW2 image.

smira commented 8 months ago

You're moving in the right direction, but qcow2 requires more options; look at the existing qcow2 profiles. Either way, if you see something missing, please open a feature request and we will look into it.