danielfdickinson opened 2 years ago
Hello @danielfdickinson,
Another option, as Vultr supports iPXE booting, is to netboot Alpine with an iPXE script. The `ssh_key` and other options may be provided to the iPXE script to help with initial provisioning/access.
This should be nicer to work with than your proposed workaround, which, while functional as you've mentioned, does entail more overhead in managing the image lifecycle across Alpine releases.
If you would prefer to continue with this feature request, please note we will need to review our current roadmap/timelines before we can consider this. Pull requests are always welcome!
Hello @Oogy,
Thank you for your response. I looked at the iPXE and netboot links you provided, and I have done netbooting on a local network before. I think netboot needs more 'moving parts' than a Packer `boot_command`, and even without `boot_command`, given that I already have a core image (i.e. I just need to upload the snapshot and can provision via Packer using SSH), I think netboot would involve a lot more work.
I will add the task of adding `boot_command` capability to the Vultr plugin to my 'to do' list, although at this point I can make no more promises than you :grin:
Hopefully I am able to get to it sooner than later and will have a PR for you at some point.
Thank you for the suggestion. If I were starting from scratch it would more likely be a worthwhile route for me, so others may benefit from the info.
I decided to take another look at the iPXE option and found a Libvirt iPXE boot guide that showed me there were fewer moving parts required than I thought. My question for the Vultr Packer plugin (this repo) is whether

> script_id (string) - If you've not selected a 'custom' (OS 159) operating system, this can be the id of a startup script to execute on boot. See Startup Script.

means I would not be able to use `script_id` (pointed at an iPXE script) with Alpine Linux, because one needs to specify the 'custom' OS and an `iso_id` for Alpine (since Alpine is not in the main list of Vultr OSes).
If I could specify an iPXE script to boot Alpine Linux, then I wouldn't need `boot_command`, and in fact using iPXE would be preferred because there wouldn't be the requirement for waits and slow simulated keyboard input, so creating the image would be much speedier.
If it's already possible to use `script_id` with Packer with Alpine Linux, then I'll close this issue. If not, I will create a new issue requesting that and close this one (if it is something that is reasonably likely to happen).
Hello @danielfdickinson,
I'm glad to see you've revisited the idea, iPXE is quite nice to work with IMO.
It appears the description for `script_id` may need to be amended; as written, I believe it is only valid for a startup script of type `boot`.
There are two types of scripts supported by Vultr: `boot` and `pxe` scripts. Both are specified via `script_id`.
So in the case of PXE booting, setting `os_id` to custom (id 159) and passing your `pxe` script ID as `script_id` should work. I have done so in the past myself. If you have any issues please let us know.
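As a minimal, non-authoritative sketch of what that could look like in a Packer template (the source block name and placeholder script ID are mine; only `os_id`, `script_id`, and the irrelevance of `iso_id` come from this thread — check the plugin docs for the required region/plan/API-key attributes):

```hcl
source "vultr" "alpine-pxe" {
  # "Custom" OS (id 159): boot via the PXE script, not an installer ISO
  os_id     = 159
  # ID of a Vultr startup script of type "pxe" (your iPXE script)
  script_id = "<your-pxe-script-id>"
  # No iso_id needed: nothing is mounted when PXE booting
}
```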
Additionally, as you are PXE booting you do not need to specify an `iso_id` or even have the ISO on your account, since there is no ISO involved.
@Oogy iPXE is now working for me :tada:. The description for `script_id` does need to be updated, as using a 'custom os' (159) and a PXE script worked (with one caveat).

It seems, though, that the kernel and initrd URLs cannot be HTTPS, even though iPXE reports having HTTPS support (HTTPS works for me with iPXE under libvirt, so it's probably a version or build issue, possibly because my instance uses Let's Encrypt for SSL certificates).
@danielfdickinson glad to hear it. On Monday I'll open up an issue for updating the docs as well as look into the HTTPS problem. Last I'd experimented with this I was netbooting Flatcar Linux using HTTPS URLs so I'm pretty sure that should work.
If you could share any errors or console screenshots that'd be a great help.
Would you like the netboot screenshots here, or is there a better place (like a Vultr ticket)?
I'll also include the applicable iPXE scripts in the info.
@danielfdickinson here is fine 👍
Here is a screenshot when the kernel URL is https:
and here is the iPXE script:
#!ipxe
set base-url https://ipxe-boot.wildtechgarden.ca
kernel ${base-url}/boot-3.16/vmlinuz-virt console=tty0 modules=loop,squashfs quiet nomodeset alpine_repo=https://mirror.csclub.uwaterloo.ca/alpine/v3.16/main modloop=https://ipxe-boot.wildtechgarden.ca/boot-3.16/modloop-virt ssh_key="ssh-ed25519 AAAtheykey... comment@host"
initrd ${base-url}/boot-3.16/initramfs-virt
boot
And as mentioned, it works if I change
set base-url https://ipxe-boot.wildtechgarden.ca
to
set base-url http://ipxe-boot.wildtechgarden.ca
Hello @danielfdickinson,
I've taken some time to look at this and I think the issue may be that the Common Name in your LE cert differs from the domain in the base-url. I have no proof of this, as the iPXE errors are not terribly helpful and we cannot enable debug mode (that requires a separate build of the iPXE binary), but it is the only notable difference I can see between your cert and my test, which used https://boot.netboot.xyz.
The CA certs for your LE cert are cross-signed by the iPXE CA cert, so that should not be the issue. Could you perhaps try using a new LE certificate with a Common Name of wildtechgarden.ca and SANs wildtechgarden.ca and *.wildtechgarden.ca?
Hey @Oogy,
Thank you for looking at this. Changing the CN didn't solve the issue, but it did get me looking at things like server logs and DNS records, and I realized that I had ipxe-boot.wildtechgarden.ca as a CNAME, and the CNAME target was not the commonName on the cert. I've switched ipxe-boot to A and AAAA records (since I did switch the CN to ipxe-boot...), and once the TTLs clear out I'll give it another go. I think you gave me the right idea of where to look (CN vs DNS name). Will let you know.
I have confirmation that it is iPXE rejecting the connection and not the server side: with lighttpd I got `(mod_openssl.c.3213) SSL: -1 5 0: No error information`, and when I increase the minimum cipher level the lighttpd logs change to `(mod_openssl.c.3249) SSL: 1 error:1417A0C1:SSL routines:tls_post_process_client_hello: no shared cipher (ip-address)` and the iPXE error message changes to "Error not permitted".
I wonder if the version of iPXE is too old and it doesn't like the cross-signed certificate (i.e. whether the iPXE build predates LE dropping the (DigiCert?) cross-sign).
I can't test a wildtechgarden.ca CN with only the SANs wildtechgarden.ca and *.wildtechgarden.ca without breaking other sites on this server, but I have tried with CN === reverse DNS name === the name on the A and AAAA records (specifically radicale-lighttpd-01.wildtechgarden.ca), which is also in the SAN list.
It's a long SAN list, though, and maybe that is the problem. I will have to try again with a system dedicated to testing this, so I don't have to worry about the other sites on the server.
I found https://github.com/ipxe/ipxe/pull/116 which adds support for fragmented handshakes (e.g. due to large certificate chains). Based on that I think it is highly probable that the number of SANs is the problem, in that it causes fragmentation. I unfortunately don't have a DNS provider compatible with a DNS challenge to do a wildcard certificate, so unless the workaround described in the PR works I might be out of luck until I dedicate a host to serving the iPXE stuff (or at least keeping the SAN list small).
:fireworks: :partying_face: Got it!
The PR 116 for iPXE mentioned above showed me the way. I needed to use `--preferred-chain "ISRG Root X1"` for my LE certificate, as described in one of the comments: https://github.com/ipxe/ipxe/pull/116#issuecomment-862709507
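For anyone hitting the same thing, the request looks roughly like this (a sketch: the domain is a placeholder; `certonly` and `-d` are standard certbot options, and `--preferred-chain "ISRG Root X1"` is the flag from the linked comment):

```shell
# Request a certificate whose chain tops out at ISRG Root X1 (dropping the
# cross-signed root), keeping the handshake small enough for iPXE's TLS stack.
certbot certonly --preferred-chain "ISRG Root X1" -d ipxe-boot.example.com
```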
I also needed to use slightly less secure lighttpd settings than ideal (but which are the current defaults for compatibility reasons).
ssl.openssl.ssl-conf-cmd = (
"MinProtocol" => "TLSv1.2",
"Options" => "ServerPreference",
"CipherString" => "HIGH"
#"Options" => "-ServerPreference"
#"CipherString" => "EECDH+AESGCM:AES256+EECDH:CHACHA20"
)
Although since I've removed the higher-security cipher settings, I could just omit `ssl.openssl.ssl-conf-cmd` altogether.
Shall I close this?
@danielfdickinson I'd like some more details, please, as lighttpd has announced plans to change TLS defaults to be stricter in a release some time in Jan 2023. What were the client limitations? Most frequently in my experience, `"MinProtocol" => "TLSv1.2"` is compatible with the vast majority of clients. Are you sure that the client could not support `"CipherString" => "EECDH+AESGCM:AES256+EECDH:CHACHA20"` when `"MinProtocol" => "TLSv1.2"`?
I unfortunately don't have a DNS provider compatible with a DNS challenge to do a wildcard certifiate (sic)
lighttpd supports Let's Encrypt bootstrap using TLS-ALPN-01 verification challenge https://wiki.lighttpd.net/HowToSimpleSSL
@gstrauss
> Are you sure that the client could not support `"CipherString" => "EECDH+AESGCM:AES256+EECDH:CHACHA20"` when `"MinProtocol" => "TLSv1.2"`?
Yes. I get `(mod_openssl.c.3249) SSL: 1 error:1417A0C1:SSL routines:tls_post_process_client_hello: no shared cipher (ip-address)` if I use
ssl.openssl.ssl-conf-cmd = (
    "MinProtocol" => "TLSv1.2",
    "Options" => "-ServerPreference",
    "CipherString" => "EECDH+AESGCM:AES256+EECDH:CHACHA20"
)
In addition, the iPXE crypto docs include the following table:
| | iPXE support |
|---|---|
| Protocol versions | TLSv1.0, TLSv1.1, TLSv1.2 |
| Public key algorithm | RSA |
| Block cipher algorithms | AES-128-CBC, AES-256-CBC |
| Hash algorithms | MD5, SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256 |
The exact list of supported cipher suites is RSA_WITH_AES_256_CBC_SHA256, RSA_WITH_AES_128_CBC_SHA256, RSA_WITH_AES_256_CBC_SHA, and RSA_WITH_AES_128_CBC_SHA.
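That list explains the "no shared cipher" error: the strict CipherString expands to ECDHE/ChaCha20 suites only, none of which appear among iPXE's four RSA AES-CBC suites. The (lack of) overlap can be checked offline, for example with Python's `ssl` module (a sketch; the short names are OpenSSL's spellings of those suites):

```python
import ssl

# Expand the strict lighttpd CipherString via OpenSSL.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.set_ciphers("EECDH+AESGCM:AES256+EECDH:CHACHA20")
strict = {c["name"] for c in ctx.get_ciphers()}

# OpenSSL names for iPXE's suites: RSA_WITH_AES_{128,256}_CBC_SHA{,256}.
ipxe = {"AES256-SHA256", "AES128-SHA256", "AES256-SHA", "AES128-SHA"}

# Empty intersection -> lighttpd logs "no shared cipher".
print(strict & ipxe)  # set()
```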
The iPXE GitHub repo hasn't had a release in two years (the last was sometime in 2020), and I don't see any crypto-related changes in the repo over that period.
@gstrauss
> lighttpd supports Let's Encrypt bootstrap using TLS-ALPN-01 verification challenge https://wiki.lighttpd.net/HowToSimpleSSL
Nice, but that doesn't quite solve the wildcard (`*.example.com`) thing, which is what I was commenting on. AFAICT TLS-ALPN-01 requires a matching DNS entry (so a specific name, not a wildcard, unless I'm misreading the docs).
@danielfdickinson thank you for the details. New releases of lighttpd on or after Jan 2023 will have stricter TLS defaults, and "CipherString" will need to be manually configured in lighttpd.conf to include one or more of these ciphers to work with iPXE: `AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA`. If you must enable the older ciphers, I'd recommend adding only `AES256-SHA256` and seeing if that works, e.g. `"CipherString" => "EECDH+AESGCM:AES256+EECDH:CHACHA20:AES256-SHA256"`, and also `"Options" => "+ServerPreference"` to avoid downgrade attacks to the weakest cipher option.
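Putting those recommendations together, the resulting lighttpd settings would look something like this (my own combination of the values above, not a tested configuration):

```
ssl.openssl.ssl-conf-cmd = (
    "MinProtocol"  => "TLSv1.2",
    # server chooses the cipher, so a legacy client can't force the weakest option
    "Options"      => "+ServerPreference",
    # modern suites first, plus the single legacy suite iPXE can use
    "CipherString" => "EECDH+AESGCM:AES256+EECDH:CHACHA20:AES256-SHA256"
)
```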
Also, you are correct that TLS-ALPN-01 verification challenge is not available for validating wildcard certs. (https://letsencrypt.org/docs/challenge-types/)
For your information:
Is your feature request related to a problem? Please describe.
The Alpine Linux ISO does not include cloud-init, which means initial setup has to be done over VNC. With the Packer QEMU builder there is a `boot_command` option that allows interacting with the instance over VNC in order to do 'just enough' to be able to SSH in. I have used this ability to create a QCOW2 image from the Alpine ISO that includes cloud-init, in a public repo.
The documentation for the `boot_command` capability can be found in the Packer documentation (see the 'Boot Configuration' section). The source code for the QEMU builder is at: https://github.com/hashicorp/packer-plugin-qemu/tree/main
Describe the solution you'd like
A similar `boot_command` capability for the Vultr plugin that allows controlling the instance via VNC in order to enable SSH access (after which regular provisioning can be used).

Describe alternatives you've considered
Since the goal is automation, doing this manually as described in the Vultr docs for Alpine Linux doesn't solve the problem, and would require repeating the process for every new release of Alpine Linux.
Another option would be the ability to upload a QCOW2 or RAW boot image rather than only an ISO (as can be done with OpenStack). AIUI the snapshots option is not just a disk image but a whole VM image, which means uploading a QCOW2 or RAW image generated using the public repo I created, above, is not currently an option with Vultr.
EDIT: I was able to upload (but have not yet tested) a RAW image generated using the repo I mentioned from a web hosting instance I have (it would be helpful to be able to upload directly from my local machine, but that's a separate issue), so it looks like there may be a workaround for now.
EDIT #2: While it was possible to upload the image, it failed to boot an instance. So currently there is no automation-friendly workaround.
EDIT #3: Mea culpa, the workaround from the first edit works; I had an error in my Packer scripts that didn't use the snapshot properly. So there is a workaround for now. Example repository at https://gitlab.com/danielfdickinson/alpine-two-stage-packer-for-vultr
From what I have read of the Alpine docs, wiki, and mailing list, cloud-init is considered too heavy and efforts are focused on a tiny cloud-init alternative; since I am new to Alpine (and have not yet posted to the mailing list), it seems that requesting a 'cloud-init' image in the Alpine releases would be a non-starter.

In addition, this would enable more distros to be prepared for use on Vultr from their main distribution ISO.