firecracker-microvm / firecracker-go-sdk

An SDK in Go for the Firecracker microVM API
Apache License 2.0

Unknown kernel command line parameters with CNI and Linux 6.1 Kernel #516

Closed: alexellis closed this issue 4 months ago

alexellis commented 10 months ago

The Firecracker team recently added a Linux Kernel 6.1 configuration for guest VMs.

When switching to this config and using CNI for networking and IP allocation, the IP address is injected into the KernelArgs field of the firecracker.Config struct.

It appears that one or more of the kernel parameters previously used to set the IP address are no longer accepted by the 6.1 kernel.

[    0.000000] Kernel command line: i8042.noaux i8042.nomux init=/sbin/init root=/dev/vda random.trust_cpu=on console=ttyS0 reboot=k pci=off ip=192.168.128.26::192.168.128.1:255.255.255.0:::off::: i8042.dumbkbd panic=1 acpi=off i8042.nopnp root=/dev/vda rw earlycon=uart,mmio,0x40003000
[    0.000000] Unknown kernel command line parameters "pci=off ip=192.168.128.26::192.168.128.1:255.255.255.0:::off::: acpi=off", will be passed to user space.

As a result, networking does not come up as it did with a 5.11 kernel, given the same use of this Go SDK and the same CNI configuration.
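To spot this condition automatically, a boot log can be scanned for the kernel's "Unknown kernel command line parameters" message. The following is a minimal sketch, not part of the SDK; the function name `unknownKernelParams` is hypothetical, and it assumes the message format shown in the log above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// unknownParamsRe matches the kernel message shown in the log above and
// captures the quoted, space-separated parameter list.
var unknownParamsRe = regexp.MustCompile(`Unknown kernel command line parameters "([^"]*)"`)

// unknownKernelParams (hypothetical helper) returns the parameters that the
// guest kernel reported as unknown, or nil if the message is not present.
func unknownKernelParams(bootLog string) []string {
	m := unknownParamsRe.FindStringSubmatch(bootLog)
	if m == nil {
		return nil
	}
	return strings.Fields(m[1])
}

func main() {
	bootLog := `[    0.000000] Unknown kernel command line parameters "pci=off ip=192.168.128.26::192.168.128.1:255.255.255.0:::off::: acpi=off", will be passed to user space.`
	for _, p := range unknownKernelParams(bootLog) {
		// An ip= entry here means the kernel did not consume the static IP setting.
		if strings.HasPrefix(p, "ip=") {
			fmt.Println("guest kernel did not consume:", p)
		}
	}
}
```

If `ip=` shows up in the returned list, the guest kernel handed the parameter to user space instead of configuring the interface itself.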

@richardcase also ran into an identical issue whilst trying to write a new tool with this SDK and CNI.

CNI is important for simple IP management, proper namespacing of networks, etc.

richardcase commented 10 months ago

As @alexellis mentions, I ran into this using:

For the time being I have used the 5.10 kernel config instead.

alexellis commented 10 months ago

It seems that pci=, ip= and acpi= are all valid according to the kernel documentation for 6.1: https://www.kernel.org/doc/html/v6.1/admin-guide/kernel-parameters.html

Perhaps there's an issue with the format of the string injected by CNI? I.e. ip=192.168.128.26::192.168.128.1:255.255.255.0:::off:::

Format: https://www.kernel.org/doc/html/v6.1/admin-guide/nfs/nfsroot.html

ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip>:<ntp0-ip>

The 5.11 docs show the same string:

ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>:<dns1-ip>:<ntp0-ip>
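As a sanity check on the format, the injected string can be split on colons and matched against the documented field order. This is a rough sketch under that assumption; `parseIPParam` and `ipFieldNames` are hypothetical names, and the short forms such as `ip=dhcp` are not handled. The string from the log splits into exactly the ten documented fields, which suggests the format itself is not the problem.

```go
package main

import (
	"fmt"
	"strings"
)

// ipFieldNames lists the fields in the order given by the nfsroot documentation.
var ipFieldNames = []string{
	"client-ip", "server-ip", "gw-ip", "netmask", "hostname",
	"device", "autoconf", "dns0-ip", "dns1-ip", "ntp0-ip",
}

// parseIPParam (hypothetical helper) splits an ip= parameter into named fields.
// Fields left empty map to ""; omitted trailing fields are absent from the map.
func parseIPParam(param string) (map[string]string, error) {
	if !strings.HasPrefix(param, "ip=") {
		return nil, fmt.Errorf("not an ip= parameter: %q", param)
	}
	parts := strings.Split(strings.TrimPrefix(param, "ip="), ":")
	if len(parts) > len(ipFieldNames) {
		return nil, fmt.Errorf("too many fields: got %d, want at most %d", len(parts), len(ipFieldNames))
	}
	fields := make(map[string]string, len(parts))
	for i, p := range parts {
		fields[ipFieldNames[i]] = p
	}
	return fields, nil
}

func main() {
	fields, err := parseIPParam("ip=192.168.128.26::192.168.128.1:255.255.255.0:::off:::")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print the fields in documented order, including the empty ones.
	for _, name := range ipFieldNames {
		fmt.Printf("%-10s = %q\n", name, fields[name])
	}
}
```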

So perhaps it's something else that the maintainers can advise on?

alexellis commented 10 months ago

It's been a couple of weeks now, so I wanted to tag a maintainer/contributor for suggestions/input.

@fangn2 what would be your thoughts?

@kzys are you at Fly now? I don't think Fly uses this Go SDK, but have you tried a 6.1 kernel, and if so, have you seen any similar issues?

lbogdan commented 10 months ago

This seems to be because CONFIG_IP_PNP is disabled in the 6.1 config: https://github.com/firecracker-microvm/firecracker/blob/60c3b14a998eb1db56403df34853fcdece6efc24/resources/guest_configs/microvm-kernel-ci-x86_64-6.1.config#L893 (see also this Slack thread).

The weird thing is that it is also disabled in the 5.10 one: https://github.com/firecracker-microvm/firecracker/blob/60c3b14a998eb1db56403df34853fcdece6efc24/resources/guest_configs/microvm-kernel-ci-x86_64-5.10.config#L893, so I'm not sure why 5.10 works but 6.1 doesn't (both with Firecracker default configs).
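Checking a guest config for this option can be scripted. The sketch below is only an illustration; `kconfigValue` is a hypothetical helper and the config text in `main` is a made-up excerpt, not a copy of the linked files. It relies on the standard kernel .config conventions, where an enabled option appears as `CONFIG_X=y` and a disabled one as `# CONFIG_X is not set`.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// kconfigValue (hypothetical helper) scans kernel .config text for an option.
// It returns the assigned value (for example "y" or "m"), "n" when the option
// is explicitly marked "is not set", and "" when the option does not appear.
func kconfigValue(config, option string) string {
	sc := bufio.NewScanner(strings.NewReader(config))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "# "+option+" is not set" {
			return "n"
		}
		// Matching on "OPTION=" avoids confusing CONFIG_IP_PNP with CONFIG_IP_PNP_DHCP.
		if strings.HasPrefix(line, option+"=") {
			return strings.TrimPrefix(line, option+"=")
		}
	}
	return ""
}

func main() {
	// Made-up excerpt for illustration only.
	config := "CONFIG_INET=y\n# CONFIG_IP_PNP is not set\nCONFIG_NET_IPIP=y\n"
	if kconfigValue(config, "CONFIG_IP_PNP") != "y" {
		fmt.Println("CONFIG_IP_PNP is not enabled; the kernel will not handle ip= itself")
	}
}
```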

alexellis commented 10 months ago

From looking at git blame and at our configs saved from last year, I can see that CONFIG_IP_PNP was actually on until 2 months ago, when @pb8o submitted a PR called "trim configurations":

https://github.com/firecracker-microvm/firecracker/commit/1c07d2dac3953915d41b0d6a7a8555d92cbc831d

Disabled some options that we don't seem to need for our integration
tests. There's still options we can disable, but this already brings
down the kernel size from

Perhaps the integration tests in this repo were skipped, or do not exercise CNI and verify that it works?

It also turns off features that are important for getting containers to work within Firecracker, such as CONFIG_NF_NAT.

A larger issue is the heavy lift facing all Firecracker users who want to run containers/K8s within their microVMs: we have to add so many options to every Kernel version to make it usable.

pb8o commented 10 months ago

Hi! Thanks for reporting this issue. In the Firecracker team, we weren't aware that these kernels are used anywhere besides Firecracker's CI, so I went ahead and removed everything that looked superfluous.

We will look into this issue, but keep in mind that these guest kernel configs are not recommended; they are provided only as examples.

utibeabasi6 commented 8 months ago

Hey @pb8o, do you have a recommended guest config? Ideally one that would work for a prod environment.

pb8o commented 7 months ago

@utibeabasi6 We don't currently have a recommended guest config. You can use the ones in the repo as a starting point and add anything you need. I will add that option back, as it seems useful for our CI too.

alexellis commented 7 months ago

Thank you for the response @pb8o - if you've added this and can link the commit or PR, I'll get this closed?

pb8o commented 7 months ago

I haven't added it back yet. Is CONFIG_IP_PNP=y the only option needed, or do we also need CONFIG_IP_PNP_DHCP=y?

pb8o commented 4 months ago

Hi, the changes got merged in firecracker-microvm/firecracker#4503, and we recently created all the artifacts, so the new kernels should be effective now. I think this can be resolved. Thanks!

alexellis commented 4 months ago

Thanks for the message, @pb8o. I'll close this now.

alexellis commented 4 months ago

On a related note, do you have a rough timeline for when the latest LTS Kernel may be available with a tested guest config?

pb8o commented 4 months ago

We roughly wait for when a new kernel is released in Amazon Linux, and then we add support some time after. We don't have an established timeline at the moment.