docker-archive / for-aws


Make Moby Linux AMIs ENA enabled (install AWS Elastic Network Adapter) #128

Closed adrissss closed 6 years ago

adrissss commented 6 years ago

Expected behavior

I would like to use the new M5 instances as nodes of a swarm. These instances offer significantly more CPU and network capacity than the M4 instances at a similar price. I would like Moby Linux images to work with these instance types.

Actual behavior

If I use Moby Linux 17.09.0-ce-aws1 stable AMIs with an M5 instance type, the instances won't boot, and AWS returns this error: "Launching a new EC2 instance. Status Reason: Enhanced networking with the Elastic Network Adapter (ENA) is required for the 'm5.large' instance type. Ensure that you are using an AMI that is enabled for ENA. Launching EC2 instance failed."

Information

A modification to the Moby Linux AMI would be required. Steps to enable ENA are described in the AWS docs: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking-ena.html

Steps to reproduce the behavior

  1. Launch a new EC2 instance
  2. Select the latest Moby Linux AMI
  3. Select an m5 instance type
  4. The EC2 instance fails to launch

adrissss commented 6 years ago

I guess the Moby Linux AMIs are built with LinuxKit, and I thought it shouldn't be too difficult to build the AWS ENA kernel module ( https://github.com/amzn/amzn-drivers/tree/master/kernel/linux/ena ) and integrate it into the image. I've been trying to find a GitHub repo where the template for the Moby Linux AMI might be defined, but to no avail.

Is there a way for the community to contribute to those Docker for AWS AMIs?

Thanks

adrissss commented 6 years ago

Hi, any updates on this? Any plans? Is it possible to have a LinuxKit template for building customised AMIs for Docker4AWS?

FrenchBen commented 6 years ago

An issue was opened at the LinuxKit level: https://github.com/linuxkit/linuxkit/issues/2820

Once it's supported there, or a solution is provided, we'll make the necessary adjustments.

netflash commented 6 years ago

@FrenchBen It looks like the upstream issue is already fixed. Can you give us any forecast for this?

jrupp commented 6 years ago

Any movement on this now that LinuxKit says it supports ENA?

FrenchBen commented 6 years ago

The latest changes were released in 17.12.0 stable. If that doesn't work, please let us know.

adrissss commented 6 years ago

@FrenchBen Thanks for the update.

Unfortunately, even if the kernel modules are there, it seems the AMI has not been built in a way that AWS considers "ENA enabled". Here's what I get when trying to launch an m5.large using 041673875206/Moby Linux 17.12.0-ce-aws1 stable (ami-7abe2e03 in the eu-west-1 region):

"Launching a new EC2 instance. Status Reason: Enhanced networking with the Elastic Network Adapter (ENA) is required for the 'm5.large' instance type. Ensure that you are using an AMI that is enabled for ENA. Launching EC2 instance failed."

As explained in http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking-ena.html, the enaSupport AMI attribute has to be enabled, which is not the case here:

aws ec2 describe-images --image-id ami-7abe2e03 --query 'Images[].EnaSupport'
[]

If the kernel modules are enabled, it looks like we're very close and it's just a matter of adding that attribute to the AMI when it's built.
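
The same check can be scripted across several images. This is only a minimal sketch, assuming the unfiltered JSON printed by `aws ec2 describe-images --image-ids <ami>` has been captured as a string. The function name `ena_enabled` and the image IDs in the sample are hypothetical. An image entry with no `EnaSupport` key is treated as not ENA enabled, which matches the empty result shown above.

```python
import json


def ena_enabled(describe_images_json: str) -> dict:
    """Map each image ID to True/False depending on its EnaSupport attribute."""
    data = json.loads(describe_images_json)
    result = {}
    for image in data.get("Images", []):
        # The EnaSupport key is simply absent when the attribute is not set.
        result[image["ImageId"]] = bool(image.get("EnaSupport", False))
    return result


# Illustrative input shaped like describe-images output (hypothetical IDs,
# not real API output).
sample = json.dumps({
    "Images": [
        {"ImageId": "ami-00000000000000001"},
        {"ImageId": "ami-00000000000000002", "EnaSupport": True},
    ]
})

print(ena_enabled(sample))
```

Parsing captured JSON rather than calling AWS directly keeps the sketch runnable without credentials.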

adrissss commented 6 years ago

Any news? Moby Linux 17.12-ce-aws1 is still incompatible with m5 instances ... Thanks

FrenchBen commented 6 years ago

@adrissss It looks like ENA support is detected at AMI registration. I missed that part of the AMI build process and will make the change for the next release. Thanks for the help.
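
For readers building their own images, the registration step is where the attribute gets set. The sketch below assembles an `aws ec2 register-image` command line that includes the `--ena-support` flag. It is not the Docker for AWS build process, which is not shown in this thread. The helper name `register_image_args`, the image name, the snapshot ID, and the `/dev/xvda` root device are all assumptions for illustration.

```python
import shlex


def register_image_args(name: str, snapshot_id: str, root_device: str = "/dev/xvda") -> list:
    """Build an argv list for registering an EBS-backed AMI with ENA enabled."""
    return [
        "aws", "ec2", "register-image",
        "--name", name,
        "--architecture", "x86_64",
        "--virtualization-type", "hvm",
        "--root-device-name", root_device,
        # AWS CLI shorthand syntax for one EBS-backed block device mapping.
        "--block-device-mappings",
        f"DeviceName={root_device},Ebs={{SnapshotId={snapshot_id}}}",
        # Marks the AMI as ENA enabled at registration time.
        "--ena-support",
    ]


# Hypothetical image name and snapshot ID, for illustration only.
args = register_image_args("my-linuxkit-image", "snap-00000000000000000")
print(shlex.join(args))
```

The command is only printed here. Running it would require AWS credentials and a real snapshot, and the kernel in that snapshot still needs a working ENA driver.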

adrissss commented 6 years ago

@FrenchBen Great to hear that, thanks!

netflash commented 6 years ago

@FrenchBen And when is the next release due?

FrenchBen commented 6 years ago

@netflash We'll follow the Docker release schedule. I was doing some QA on the Edge release of 18.01 and will rebuild it so that this change is included in the release.

jrupp commented 6 years ago

@netflash The next stable release will be in March.

Can we pretty please have a 17.12-ce-aws2 build? So that ENA can be enabled in that one, and the 17.12-ce-aws1 build can be left alone for those already on it? Please?!

FrenchBen commented 6 years ago

@jrupp Let's confirm that it works as expected on 18.01 and go from there?

jrupp commented 6 years ago

@FrenchBen Sounds good. 👍

lachlancooper commented 6 years ago

I noticed today that the 17.12.1-ce-aws1 images are ENA-enabled, so I tried deploying a cluster using c5.large instances. While Moby was able to start, the instances never passed reachability checks. In the system log I can see warnings about "Network unreachable" as well as failures in "Configuring host block device":

https://gist.github.com/lachlancooper/c6a64f9bb2c661dec93e59314e9ed676

It's possible that the ENA or NVMe drivers are not being loaded correctly, but without explicit log messages I can't tell.

I know this isn't supported yet, just thought I'd give a heads-up in case the log is helpful for debugging.
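
One way to narrow down the driver question, on a host where a shell is available, is to look at `/proc/modules`. This is a rough sketch, assuming the drivers are built as loadable modules named `ena` and `nvme`. Drivers compiled directly into the kernel do not appear in `/proc/modules`, so a "missing" result is not conclusive. The function names and the sample text are hypothetical.

```python
def loaded_modules(proc_modules_text: str) -> set:
    """Return the module names listed in /proc/modules-formatted text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def missing_drivers(proc_modules_text: str, wanted=("ena", "nvme")) -> list:
    """List the wanted modules that are not currently loaded."""
    loaded = loaded_modules(proc_modules_text)
    return [name for name in wanted if name not in loaded]


# Illustrative text in /proc/modules format (not taken from a real instance).
sample = """\
nvme 32768 1 - Live 0xffffffffc0000000
nvme_core 61440 2 nvme, Live 0xffffffffc0010000
"""

print(missing_drivers(sample))
```

On a real instance the input would come from reading `/proc/modules` instead of the sample string.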

FrenchBen commented 6 years ago

Thanks for helping debug this, @lachlancooper. I'll pass it on to the kernel team to see if they can track it down.

kinghuang commented 6 years ago

Does this also affect r4 instance types? I haven't been able to get them to work, and they show "✗ No network connection" in the console logs. I think they also use ENA?

https://gist.github.com/kinghuang/cb7f3811e6e1c0d53e6edc73ed8cb2e4

FrenchBen commented 6 years ago

@kinghuang There's a module config that needs to be changed. We'll be building a new AMI so that it works.

kinghuang commented 6 years ago

@FrenchBen Great! Will the module config changes make it for the 18.03 images?

FrenchBen commented 6 years ago

@kinghuang Since 18.03 is still an RC, the final build will include it.

kinghuang commented 6 years ago

Just successfully launched an M5 instance with the 18.03.0 AMI (ami-ccf668b4). Looks promising!

FrenchBen commented 6 years ago

Closing for now. Feel free to comment if it needs to be reopened.