aws / aws-nitro-enclaves-cli

Tooling for Nitro Enclave Management
Apache License 2.0

update vsock driver #333

Open open-contracts opened 2 years ago

open-contracts commented 2 years ago

For my use case, it is important that the vsock channel preserves packet boundaries. Vsock was updated to support this (https://lwn.net/Articles/846628/), and the change has been part of the Linux kernel since June (https://github.com/torvalds/linux/commit/ced7b713711fdd8f99d8d04dc53451441d194c60).
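To make "preserves packet boundaries" concrete, here is a minimal sketch of the difference between a stream socket and a SEQPACKET socket. It uses local AF_UNIX socket pairs as a stand-in, because AF_VSOCK needs a working vsock transport; the helper name `recv_messages` is hypothetical and not part of any Nitro tooling. It assumes a Linux host.

```python
import socket


def recv_messages(sock_type, payloads):
    """Send each payload over a local socket pair and return what recv() yields.

    AF_UNIX is used only as a stand-in for AF_VSOCK (assumption for illustration).
    """
    left, right = socket.socketpair(socket.AF_UNIX, sock_type)
    with left, right:
        for payload in payloads:
            left.send(payload)
        # Non-blocking reads: stop as soon as the receive queue is empty.
        right.setblocking(False)
        received = []
        while True:
            try:
                chunk = right.recv(4096)
            except BlockingIOError:
                break
            received.append(chunk)
        return received


if __name__ == "__main__":
    payloads = [b"ab", b"cd"]
    # SEQPACKET: each recv() returns exactly one message that was sent.
    print(recv_messages(socket.SOCK_SEQPACKET, payloads))
    # STREAM: only the byte order is guaranteed; messages may be merged.
    print(recv_messages(socket.SOCK_STREAM, payloads))
```

With a stream socket the receiver has to reconstruct message boundaries itself, which is the behaviour the vsock update referenced above is meant to avoid.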

How can I best "update" the vsock (driver?) that my EC2 instance and enclave use to the most recent version? I'm currently running the "AWS Nitro Enclaves Developer AMI v1.01", which runs the kernel "Linux 4.14.256-197.484.amzn2.x86_64 x86_64", but the vsock update is only included in Linux kernel v5.14 and later, so it should be available in Fedora 34.

Would you recommend switching to a recent enough Fedora 34 AMI, and following https://github.com/aws/aws-nitro-enclaves-cli/blob/main/docs/fedora_34_how_to_install_nitro_cli_from_github_sources.md to get the Nitro CLI running on there? Or is there an easier solution?

Thanks!

Edit: I tried to follow the tutorial with the Fedora-Cloud-Base-35-1.2.x86_64-hvm-us-east-2-gp2-0 AMI, which comes with the 5.15.7-200.fc35.x86_64 kernel. Everything works until I try to start an .eif: it gets stuck at "Start allocating memory...". If I cancel and try again, I get an ioctl error, and that persists until I run "sudo reboot", after which it gets stuck at the same point again.

Edit2: Same thing with the Fedora-Cloud-Base-34-20211214.0.x86_64-hvm-us-east-2-gp2-0 AMI.

petreeftime commented 2 years ago

Hi, @open-contracts. We currently don't support kernel 5.14 or later in either the parent VM or the enclave VM, so we haven't tested this feature yet. In addition, our vsock device does not expose this feature, so even with an up-to-date kernel it very likely won't work, unfortunately. I will give it a test, just in case, and talk to the team about adding support.

Do you have logs for the ioctl error? Maybe I can also look into those?

andraprs commented 2 years ago

I sent a patch to lkml for the blocking issue seen during enclave memory setup (that's for v5.15+ Linux kernels): https://lore.kernel.org/lkml/20211218103525.26739-1-andraprs@amazon.com/.

open-contracts commented 2 years ago

Thank you so much @andraprs! Really appreciate the quick support. Is there any way I could try out an EC2 with your patch? Could you maybe share e.g. an AMI?

And @petreeftime, thank you for the explanation! Would love to understand it in more detail. I was assuming that (very roughly) the nitro-cli builds the .eif by "extending" the source docker image with the EC2's kernel files (which includes the vsock driver of the EC2). So I thought, if the vsock driver of the EC2 supports some feature, it would be available in the enclave as well. Is that wrong? Or is that true in general, but the problem here is that there are further restrictions placed on the enclave <-> EC2 communication, just because of how Nitro enclaves are built?

Also, @petreeftime, would the ioctl logs still be useful, or did @andraprs's patch already fix the problem?

andraprs commented 2 years ago

> Thank you so much @andraprs! Really appreciate the quick support. Is there any way I could try out an EC2 with your patch? Could you maybe share e.g. an AMI?

FYI, I sent out a second version of the patch, based on the received feedback for v1: https://lore.kernel.org/lkml/20211220195856.6549-1-andraprs@amazon.com/.

One option, until the patch is merged into the upstream Linux kernel, is to clone this GitHub repository, aws-nitro-enclaves-cli. The out-of-tree Nitro Enclaves kernel driver codebase is at https://github.com/aws/aws-nitro-enclaves-cli/tree/main/drivers/virt/nitro_enclaves.

You can apply the patch mentioned above and then build the driver. Next, remove the already loaded NE kernel driver with "$ sudo rmmod nitro_enclaves", and load the newly built version that includes the patch with "$ sudo insmod nitro_enclaves.ko". This is needed for v5.15+ Linux kernels.

Another option would be to find an older AMI, with the v5.14 Linux kernel; that will work without this patch.

petreeftime commented 2 years ago

The functionality of vsock is split between the vsock driver, which runs in the instance and in the enclave (separate drivers), and the vsock device, which lives in the virtual machine monitors (VMMs) that we develop. For the vsock driver to support SEQPACKET, it seems that some changes are also required in the VMM, both to advertise this capability and to ensure that it actually does the right thing and never splits packets. We have to take a look at what needs to be done in the VMM to support this correctly.
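Because support depends on both the driver and the device, one way to check a given setup is simply to attempt a SEQPACKET connection over vsock and look at the error. The following is a rough sketch, not an official test: the function name `probe_vsock_seqpacket` is hypothetical, and the CID and port in the usage line are placeholder values. It needs a listener on the other side, and a successful connect only shows that both ends accepted the socket type; it does not prove that packets are never split in the VMM.

```python
import socket


def probe_vsock_seqpacket(cid, port, timeout=2.0):
    """Return (ok, reason) for a SOCK_SEQPACKET connection attempt over AF_VSOCK."""
    family = getattr(socket, "AF_VSOCK", None)
    if family is None:
        return False, "this Python build does not expose AF_VSOCK"
    try:
        sock = socket.socket(family, socket.SOCK_SEQPACKET)
    except OSError as exc:
        # Kernels without vsock (or without SEQPACKET support) typically fail here.
        return False, f"socket() failed: {exc}"
    with sock:
        sock.settimeout(timeout)
        try:
            sock.connect((cid, port))
        except OSError as exc:
            # Covers timeouts, a missing listener, and a transport or device
            # that does not offer SEQPACKET.
            return False, f"connect() failed: {exc}"
    return True, "connected"


if __name__ == "__main__":
    # CID 16 and port 5005 are placeholders; use your enclave's CID and port.
    print(probe_vsock_seqpacket(16, 5005))
```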

andraprs commented 2 years ago

The fix is available on Fedora 35 starting with kernel-5.15.13-200.fc35.
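The version thresholds mentioned in this thread can be summarized in a small helper. This is only a sketch of the rules as stated above (SEQPACKET is in the vsock driver from v5.14; the enclave memory setup issue affects v5.15+ kernels that lack the fix); the function names and dictionary keys are hypothetical, and a kernel-side "True" says nothing about whether the Nitro vsock device supports SEQPACKET.

```python
import re


def kernel_version(release):
    """Extract (major, minor) from a release string such as '5.15.7-200.fc35.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def summarize(release):
    """Apply the thresholds quoted in the thread to a kernel release string."""
    version = kernel_version(release)
    return {
        # The vsock driver gained SEQPACKET support in v5.14.
        "driver_has_seqpacket": version >= (5, 14),
        # v5.15+ kernels need the NE driver fix; whether a specific distro
        # build already carries it (e.g. kernel-5.15.13-200.fc35 on Fedora 35)
        # has to be checked separately.
        "check_for_ne_memory_fix": version >= (5, 15),
    }


if __name__ == "__main__":
    for release in ("4.14.256-197.484.amzn2.x86_64", "5.15.7-200.fc35.x86_64"):
        print(release, summarize(release))
```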