xcp-ng / xcp

Entry point for issues and wiki. Also contains some scripts and sources.
https://xcp-ng.org

iPXE UEFI support #482

Open olivierlambert opened 3 years ago

olivierlambert commented 3 years ago

We should be able to boot XCP-ng installer with iPXE with UEFI enabled. We might have to chainload to grub then installer itself.

pietrushnic commented 1 year ago

Hi @olivierlambert, this would be really great for anyone who wants to do an unattended installation on Dasharo (coreboot+UEFI), for example on Protectli hardware.

Right now there is no way to do an unattended installation of XCP-ng over the network with UEFI. I see two strategies for how this could be implemented:

stormi commented 1 year ago

It's a bit more convoluted, but you can already put grubx64.efi from the installation ISO on a TFTP server, and put an EFI/xenserver/grub.cfg configuration file on the same TFTP server.

Then this grub.cfg can either contain the configuration you want, or load another configuration, that might be dynamically generated based on the booting hardware's MAC address or IP: configfile (http,pxe)/bootconfig.php?arch=uefi.
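A minimal Python sketch of the "dynamically generated" configuration idea, standing in for the bootconfig.php script mentioned above. Everything specific here is an assumption for illustration: the MAC-to-answerfile table, the pxe.example.com URLs, the /EFI/xenserver/ paths on the TFTP server, and the exact Xen and kernel arguments. The value dom0_max_vcpus=1-2 follows the advice given further down in this thread.

```python
from string import Template

# Hypothetical table mapping a machine's MAC address to its answerfile URL.
ANSWERFILES = {
    "00:11:22:33:44:55": "http://pxe.example.com/answerfiles/host-a.xml",
}
# Hypothetical fallback for machines that are not listed above.
DEFAULT_ANSWERFILE = "http://pxe.example.com/answerfiles/default.xml"

# Assumed layout: xen.gz, vmlinuz and install.img copied from the ISO
# into /EFI/xenserver/ on the TFTP server. Adjust to your own layout.
GRUB_TEMPLATE = Template("""\
set default=0
set timeout=5

menuentry "XCP-ng unattended install" {
    multiboot2 /EFI/xenserver/xen.gz dom0_max_vcpus=1-2 dom0_mem=2048M,max:2048M com1=115200,8n1 console=com1,vga
    module2 /EFI/xenserver/vmlinuz console=hvc0 answerfile=$answerfile install
    module2 /EFI/xenserver/install.img
}
""")


def normalize_mac(mac: str) -> str:
    """Lower-case the MAC and use ':' as separator so lookups are consistent."""
    return mac.strip().lower().replace("-", ":")


def render_grub_cfg(mac: str) -> str:
    """Return a grub.cfg whose answerfile URL depends on the booting MAC."""
    answerfile = ANSWERFILES.get(normalize_mac(mac), DEFAULT_ANSWERFILE)
    return GRUB_TEMPLATE.substitute(answerfile=answerfile)
```

A small HTTP handler could call render_grub_cfg() with the MAC it receives as a query parameter and return the result as plain text.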

pietrushnic commented 1 year ago

But then I have to type the configfile ... command manually.

stormi commented 1 year ago

Can you describe your use case with more details?

pietrushnic commented 1 year ago

I would like to provision multiple XCP-ng instances in an unattended manner over the network. As a last resort I could do that with a custom ISO, but I would like to avoid that since it would be less maintainable in the long run.

Our hardware uses an open-source firmware distribution called Dasharo, which consists of coreboot with a UEFI payload. The network stack in our UEFI is iPXE. We control the whole firmware stack, so we can modify it as needed, e.g. enable or disable options in iPXE.

What I'm trying to do is this: my hardware boots, goes through the boot order, and launches iPXE with a known location of an xcp-ng.ipxe script, which then performs an automatic installation according to the provided answerfile.

Is that clear enough?

pietrushnic commented 1 year ago

I guess this would be useful if merged. It would also enable TrenchBoot use cases.

stormi commented 1 year ago

Thanks for the details. It's clear to me.

Yes, multiboot2 support was rejected by iPXE developers and would apparently be hard to maintain properly. I doubt this patch will ever be merged.

So the only options I know for EFI boot from iPXE are, as you said, grub or xen.efi. I haven't found yet how (if possible at all) to pass a custom configuration from iPXE to grub. If there are build options to enable it, we'll gladly try them on the upcoming XCP-ng 8.3 beta. Providing xen.efi is another lead and we will try to give it a chance when building XCP-ng 8.3 beta or RC ISOs.

Meanwhile, modifying grub.cfg on the ISO or on the TFTP server is the best solution. If you need the answerfile to be dependent on the machine you are installing on, then my solution above (redirect to a PHP script or such that generates the grub configuration on the fly) might work. It works for us at least.

pietrushnic commented 1 year ago

Thanks @stormi, I'm definitely the one to call if testing of XCP-ng 8.3 is needed.

In the following months we plan to invest more 3mdeb and Dasharo team time in XCP-ng. This should benefit Protectli and other partners.

olivierlambert commented 1 year ago

Great news @pietrushnic, eager to move forward on this together!

pietrushnic commented 1 year ago

I made some minor progress using the mentioned iPXE code, built in the following way: https://github.com/Dasharo/edk2/pull/13/files#diff-c2c54f14df291110d0e3c29fa0764ac2f42108401c2c7edbd89ff9ad23b43b98

Xen boots, but it crashes complaining about the image: https://paste.dasharo.com/?4024a8861e962a9f#CNGo1D53J9sr1kwonV71nQ7sn815hvdEhXyjbZhccRBT

andyhhp commented 1 year ago

One misc note

(XEN) [000000e264aa46e0] Command line: dom0_max_vcpus=2 dom0_mem=2048M,max:2048M com1=115200,8n1 console=com1,vga

Use dom0_max_vcpus=1-2 instead of just simply 2. This causes less bad behaviour when e.g. the user turns SMT off, or Xen can't parse the ACPI tables and only brings up a single CPU. Overcommitting dom0 2:1 doesn't lead to a functioning system :smile:

For the main issue:

(XEN) [    5.194704] Multiple initrd candidates, picking module #1
...
(XEN) [    5.261764] ERROR: Will only load images built for the generic loader or Linux images (Not '' and '') or with PHYS32_ENTRY set

There's a longstanding iPXE bug with multiboot. When iPXE decompresses a binary, it fails to exclude it from the multiboot list, so we end up with { Xen uncompressed, module list points here ---> Xen compressed, dom0 kernel, dom0 initrd }. Xen, for bad (legacy) reasons, requires that the dom0 kernel is multiboot module 1, so it's interpreting the compressed Xen as the dom0 kernel, and the actual dom0 kernel as the initrd. Hence the warning about multiple initrds.

The giant bodge is to decompress Xen before letting iPXE work on it, but if you could fix iPXE's handling of its multiboot images when decompressing intermediate ones, that would be massively preferable.
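The workaround described above (serving an already decompressed Xen so iPXE never has to decompress it) could be sketched in Python as follows. This assumes the Xen image is a plain gzip file such as xen.gz; the function name and file names are illustrative and not part of any XCP-ng tooling.

```python
import gzip
import shutil
from pathlib import Path


def decompress_xen(src: Path, dst: Path) -> Path:
    """Write the gunzipped contents of src (e.g. xen.gz) to dst.

    The idea is to point the iPXE script at dst instead of src, so that
    iPXE does not keep a compressed copy of Xen in its multiboot list.
    """
    with gzip.open(src, "rb") as compressed, open(dst, "wb") as out:
        # Stream in chunks rather than reading the whole image into memory.
        shutil.copyfileobj(compressed, out)
    return dst
```

For example, decompress_xen(Path("xen.gz"), Path("xen")) run in the directory served over HTTP or TFTP.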

pietrushnic commented 1 year ago

@andyhhp thanks for the comments.

Use dom0_max_vcpus=1-2 instead of just simply 2. This causes less bad behaviour when e.g. the user turns SMT off, or Xen can't parse the ACPI tables and only brings up a single CPU. Overcommitting dom0 2:1 doesn't lead to a functioning system :)

OK, I guess the XCP-ng documentation could be improved in that regard:

https://xcp-ng.org/docs/install.html#netinstall

The giant bodge is to decompress Xen before letting iPXE work on it, but if you could fix iPXE's handling of its multiboot images when decompressing intermediate ones, that would be massively preferable.

I don't feel very competent here, but I can try or ask someone for help. Please note I'm trying @krystian-hebel's multiboot2 patches from here.

pietrushnic commented 1 year ago

@andyhhp I made some progress, please check: https://paste.dasharo.com/?1f7b7ddcb1eb4b53#FPSBGN1GKaRC288f6AF8fz5MJhR7HP5iXsYk7NSL4WK8

Xen and the kernel boot, and the XCP-ng installer finally starts, but then Xen shuts down the platform. Any idea why?

pietrushnic commented 1 year ago

Or maybe @stormi can say something from the XCP-ng side?

krystian-hebel commented 1 year ago

[ 46.136166] reboot: Restarting system means that dom0 asks for reboot, so not really an issue with Xen.

pietrushnic commented 1 year ago

The error is cannot open /dev/sdc. I guess this is an issue with my answerfile. Video on Matrix

pietrushnic commented 1 year ago

After fixing the answerfile I can reach the EULA screen, which is related to: https://github.com/xcp-ng/xcp/issues/493

stormi commented 1 year ago

Wasn't the EULA display fixed in XCP-ng 8.2.1 installation ISOs?

pietrushnic commented 1 year ago

It still pops up. I guess I need something like:

<script stage="filesystem-populated" type="nfs|url">http://<SERVER_IP>/eula.sh</script>

in my answerfile?

stormi commented 1 year ago

I'm surprised. The EULA was reformatted with short lines to fix this issue and I verified the fix did make it into XCP-ng 8.2.1 installation ISOs. What does it look like on screen?

pietrushnic commented 1 year ago

The problem is not how the EULA is presented, but that I can't make its acceptance unattended.

[Image: IMG_20221026_162231_290]

pietrushnic commented 1 year ago

XCP-ng unattended installation

Requirements

HTTP server setup

Run HTTP server

iPXE boot target hardware

pietrushnic commented 1 year ago

The screen for accepting the EULA appears after the installer reboots.

stormi commented 1 year ago

This screen, as well as the next one that asks for a root password to be set, is displayed if no password is defined in the answerfile. See https://github.com/xcp-ng/host-installer/blob/10.10.x-8.3/doc/answerfile.txt#L217
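As a rough pre-flight check for the behaviour described above, a short Python sketch can confirm that an answerfile defines root-password before it is served. Only the root-password requirement comes from this thread; the function name and the idea of a configurable list of required elements are illustrative assumptions, and the check is not a substitute for the installer's own validation.

```python
import xml.etree.ElementTree as ET


def missing_elements(answerfile_xml: str, required=("root-password",)) -> list[str]:
    """Return the required top-level elements that the answerfile lacks.

    Only direct children of the root <installation> element are inspected.
    """
    root = ET.fromstring(answerfile_xml)
    return [tag for tag in required if root.find(tag) is None]


# Example answerfile without a password: the installer would be expected
# to fall back to the interactive EULA and password screens.
EXAMPLE = """<?xml version="1.0"?>
<installation mode="fresh">
  <primary-disk>sda</primary-disk>
</installation>
"""

if __name__ == "__main__":
    print(missing_elements(EXAMPLE))
```

An empty list means every element named in required is present.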

pietrushnic commented 1 year ago

OK, let me try with root-password set in the answerfile.

pietrushnic commented 1 year ago

Looks like the installer doesn't like my self-hosted repo:

192.168.1.131 - - [26/Oct/2022 17:06:35] "GET /answerfile HTTP/1.1" 200 -
192.168.1.131 - - [26/Oct/2022 17:06:46] code 404, message File not found
192.168.1.131 - - [26/Oct/2022 17:06:46] "GET /.treeinfo HTTP/1.1" 404 -
192.168.1.131 - - [26/Oct/2022 17:06:46] "GET /repodata/repomd.xml HTTP/1.1" 200 -
192.168.1.131 - - [26/Oct/2022 17:06:46] code 404, message File not found
192.168.1.131 - - [26/Oct/2022 17:06:46] "GET /update.xml HTTP/1.1" 404 -
192.168.1.131 - - [26/Oct/2022 17:06:46] "GET /repodata/repomd.xml HTTP/1.1" 200 -

I'm mirroring this with wget: https://updates.xcp-ng.org/netinstall/8.2.1/

So I assume everything should be ok.

stormi commented 1 year ago

You probably missed the .treeinfo file.

stormi commented 1 year ago

https://xcp-ng.org/docs/develprocess.html#contents-of-the-installation-iso-image
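The access log above shows the installer requesting /.treeinfo and /repodata/repomd.xml from the self-hosted repository. A small Python sketch like the one below can check a local mirror directory for those two files before pointing the installer at it. The list of expected files is taken only from that log and is an assumption about what matters; a complete repository needs more than these two files.

```python
from pathlib import Path

# Files the installer was seen requesting in the log above. This is not a
# complete description of the repository layout.
EXPECTED_FILES = (".treeinfo", "repodata/repomd.xml")


def missing_repo_files(repo_root: Path, expected=EXPECTED_FILES) -> list[str]:
    """Return the expected relative paths that do not exist under repo_root."""
    return [rel for rel in expected if not (repo_root / rel).is_file()]
```

Hidden files such as .treeinfo are easy to lose when mirroring from a web directory listing, so checking for them explicitly can save a round trip.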

andyhhp commented 1 year ago

@andyhhp I made some progress, please check: https://paste.dasharo.com/?1f7b7ddcb1eb4b53#FPSBGN1GKaRC288f6AF8fz5MJhR7HP5iXsYk7NSL4WK8

Xen and the kernel boot, and the XCP-ng installer finally starts, but then Xen shuts down the platform. Any idea why?

Seems like you've got beyond this, but just to answer the question:

[   46.136166] reboot: Restarting system
(XEN) [   60.223973] Hardware Dom0 shutdown: rebooting machine
(XEN) [   60.236288] Resetting with ACPI MEMORY or I/O RESET_REG.

Dom0 requested a clean reboot. Xen did as instructed.

pietrushnic commented 1 year ago

@andyhhp thanks. Yes, I got through that, and the multiboot2 patches rebased by @krystian-hebel work well. So Dasharo will most likely include those in the iPXE version we ship.

You probably missed the .treeinfo file.

@stormi thanks. Yes, I had copied the files from the web server and not from the ISO, which is why I didn't get the .treeinfo file. The final version that works for me to perform an unattended installation using iPXE looks as follows:

<?xml version="1.0"?>
<installation mode="fresh">
  <primary-disk>sda</primary-disk>
  <keymap>pl</keymap>
  <root-password type="plaintext">xcp-ng</root-password>
  <source type="url">https://updates.xcp-ng.org/netinstall/latest/</source>
  <admin-interface name="eth0" proto="dhcp"/>
  <timezone>Europe/Warsaw</timezone>
</installation>

Until xen.efi appears in the next XCP-ng release, we will use the multiboot2 approach. I guess that if we want to enable TrenchBoot in the future, we would still have to go through mb2 instead of using xen.efi. So I wonder how this should be solved in the long run. Maybe @andyhhp has some idea, since we will most likely never see mb2 merged into iPXE because of "Secure Boot".

If I get to enable non-Dasharo machines, I would probably produce and publish an iPXE ISO which can be flashed to USB, so that unattended installation would be available to the community.

stormi commented 1 year ago

So, we investigated xen.efi, but I think it's a dead end. UEFI boot is far too rigid, and I see no way to pass information from iPXE to xen.efi dynamically. Plus, xen.efi wants all files (kernel, initrd) in the same directory, which doesn't match the current installation ISO layout and would add more challenges.

The only solution I can see remaining for anyone who wants to boot an unmodified installation media with iPXE and a custom installation configuration is by using a modified iPXE with the multiboot2 patches.

For those who don't have a strict "don't modify the installation media" requirement, the best option remains modifying grub's configuration directly on the installation media.