rancher / elemental-toolkit

:snowflake: The toolkit to build, ship and maintain cloud-init driven Linux derivatives based on container images
https://rancher.github.io/elemental-toolkit/docs/
Apache License 2.0

Booting from network #403

Closed mudler closed 3 years ago

mudler commented 3 years ago

Is your feature request related to a problem? Please describe. Currently the only way to boot cOS on bare metal is from an ISO. This might be limiting in edge scenarios.

Describe the solution you'd like A way for cOS to boot up from network, e.g. PXE boot

Describe alternatives you've considered N/A, this card is also for discussing

Additional context Try documentation on https://netboot.xyz/ or fork it to add cOS support, booting via squashfs.

See also: https://netboot.xyz/selfhosting/, https://github.com/harvester/ipxe-examples

Timebox: 2 days

davidcassany commented 3 years ago

I did a quick check and some reading on the netboot.xyz project. I might still be wrong, since I only did a very quick exploration of the repo templates and a quick boot test of k3os in my libvirt env. I have the feeling that, in any case, we need to provide the initrd and kernel as separate artifacts. Moreover, the initrd needs to include the logic to fetch the remote system and boot it (this should be pretty straightforward, since the upstream dracut livenet module already supports remote live boots).

I had a look at roles/netbootxyz/templates/menu/live-debian.ipxe.j2 because it uses a squashfs image and seems to be pretty simple. Note that the squashfs image is a parameter passed on the kernel command line, and the kernel is fetched directly from a remote URL, as is the initrd. The other templates I looked at follow a similar approach.
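To make the pattern described above concrete, here is a minimal sketch that renders an iPXE script of that shape: kernel and initrd fetched from remote URLs, with the squashfs location passed on the kernel command line. It is not taken from netboot.xyz or cOS. The function name `render_ipxe_script` and the example.com URLs are hypothetical, and `root=live:<URL>` is the dracut livenet convention, which is only an assumption about what a cOS initrd would eventually accept.

```python
def render_ipxe_script(kernel_url, initrd_url, squashfs_url, extra_args=()):
    """Return an iPXE script that boots a remote kernel/initrd pair.

    The squashfs URL is handed to the initrd on the kernel command line.
    Assumption: the initrd contains dracut's livenet module, which
    understands root=live:<URL>; cOS may end up using another parameter.
    """
    kernel_args = [
        "ip=dhcp",                     # dracut: configure the network via DHCP
        "rd.neednet=1",                # dracut: bring the network up in the initrd
        f"root=live:{squashfs_url}",   # dracut livenet: fetch and boot this image
        *extra_args,
    ]
    lines = [
        "#!ipxe",
        "dhcp",
        f"kernel {kernel_url} {' '.join(kernel_args)}",
        f"initrd {initrd_url}",
        "boot",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All URLs below are placeholders, not real cOS artifacts.
    print(render_ipxe_script(
        "http://example.com/cos/kernel",
        "http://example.com/cos/initrd",
        "http://example.com/cos/rootfs.squashfs",
    ))
```

Whatever the final parameter names turn out to be, the sketch shows why the kernel, the initrd and the squashfs have to be published as three separately fetchable artifacts.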

Anyway, what I mean is that I doubt we can use netboot.xyz without first providing the kernel, the initrd and a livenet boot procedure in cOS. None of this is complex, but I'd say it is a prerequisite before we can actually discuss an eventual integration into the netboot.xyz project.

dragonchaser commented 3 years ago

I found this project: https://github.com/FOGProject/fogproject. It might also be worth taking a look at.

kkaempf commented 3 years ago

Hmm, fogproject seems to focus more on the management and UI side of things. We need just an initial bootstrap solution for bare-metal machines (on the edge).

While PXE is a vendor-neutral solution, it's less used in customer scenarios due to security concerns. Redfish is a newer management protocol with widespread adoption.

Some IHVs have their own solutions to this problem:

dragonchaser commented 3 years ago

@kkaempf you are right, we can probably ignore that.

I made some progress booting to a local netboot.xyz instance using the iPXE ISO from https://github.com/ipxe/ipxe on VirtualBox and QEMU. I think we can build upon that. The ISO itself is just 1 MiB and could easily be fed to the system as a virtual device through the management engines you mentioned.

dragonchaser commented 3 years ago

Findings:

Netboot can be achieved through Redfish: a virtual media device can be inserted using a library like https://opendev.org/airship/go-redfish as a generic approach. There are also vendor tools such as ilorest that achieve the same thing. They are probably better aligned with each vendor's actual implementation, but they have to be integrated on a per-vendor basis, which is slightly more effort.
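As a rough illustration of the generic Redfish route, the sketch below builds (but does not send) the HTTP request for the standard `VirtualMedia.InsertMedia` action. It uses only the Python standard library rather than go-redfish. The helper name `build_insert_media_request`, the BMC address, and the manager and media IDs are hypothetical. The resource path also varies between vendors (some expose virtual media under `Systems` instead of `Managers`), so treat the URL layout as an assumption. Authentication is left out.

```python
import json
import urllib.request


def build_insert_media_request(bmc_url, manager_id, media_id, image_url):
    """Build a POST request that asks a BMC to attach a remote ISO.

    Assumption: the BMC exposes virtual media under
    /redfish/v1/Managers/<manager_id>/VirtualMedia/<media_id>.
    """
    action_url = (
        f"{bmc_url.rstrip('/')}/redfish/v1/Managers/{manager_id}"
        f"/VirtualMedia/{media_id}/Actions/VirtualMedia.InsertMedia"
    )
    body = {
        "Image": image_url,       # URL the BMC fetches the ISO from
        "Inserted": True,         # present the media as inserted
        "WriteProtected": True,   # attach it read-only
    }
    return urllib.request.Request(
        action_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; a real BMC also needs credentials or a session token.
    req = build_insert_media_request(
        "https://bmc.example.com", "1", "CD", "http://example.com/ipxe.iso"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen`) and then setting a one-time boot override to the virtual CD would be the remaining steps; both depend on the vendor's Redfish implementation.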

So far I have not found a way to pass a kernel parameter to the machine (e.g. cos.setup=<URL>) to configure/install the system in question.

I am trying to find a way to use netboot.xyz to dynamically configure the boot loader. If that does not work, we might have to craft/configure ISOs as needed (e.g. replace /boot/grub/grub.cfg using luet). iPXE would offer that feature (https://ipxe.org/howto/rh_san); Redfish does not support kernel params.

Issues

bk201 commented 3 years ago

I'm working on PXE boot to install Harvester on RancherOSv2. The following is how I boot the live system:

mudler commented 3 years ago

That's great! Thanks a lot @bk201 !

dragonchaser commented 3 years ago

Updated the docs, PR: https://github.com/rancher-sandbox/cos-toolkit-docs/pull/18

dragonchaser commented 3 years ago

DOC is here: https://rancher-sandbox.github.io/cos-toolkit-docs/docs/getting-started/booting/#booting-from-network

kkaempf commented 3 years ago

I thought I'd leave this here for now: https://blog.alexellis.io/state-of-netbooting-raspberry-pi-in-2021 :wink: