nebari-dev / nebari

🪴 Nebari - your open source data science platform
https://nebari.dev
BSD 3-Clause "New" or "Revised" License

Add option to use custom AMI for AWS EKS nodes #2604

Closed joneszc closed 2 months ago

joneszc commented 4 months ago

Feature description

Add the ability to specify a custom Amazon Machine Image (AMI) for EKS cluster nodes in lieu of the default image. This feature would require an aws_launch_template Terraform resource, possibly dependent on or incorporated with #2603, to run the /etc/eks/bootstrap.sh command as necessary when the ami_type is "CUSTOM". When a custom AMI ID is specified, an additional switch would be necessary to ensure that ami_type "CUSTOM" replaces "AL2_x86_64_GPU" or "AL2_x86_64"; the onus is on the user to ensure that the custom AMI is or is not GPU-enabled, as the node group requires.
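The switching logic described above can be sketched in a few lines of Python. This is only an illustration of the intended behavior, not the Terraform that Nebari uses; the function name `resolve_ami_type` and its parameters are hypothetical.

```python
from typing import Optional


def resolve_ami_type(custom_ami_id: Optional[str], gpu: bool) -> str:
    """Pick the EKS node group ami_type (hypothetical helper).

    A custom AMI ID always wins and maps to "CUSTOM"; otherwise the
    default AL2 image is chosen based on whether the node group is GPU-enabled.
    """
    if custom_ami_id:
        # With a custom AMI, the user is responsible for GPU support in the image.
        return "CUSTOM"
    return "AL2_x86_64_GPU" if gpu else "AL2_x86_64"
```

Note that the `gpu` flag is ignored once a custom AMI ID is present, which mirrors the point that GPU support then depends entirely on the image the user supplies.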

Value and/or benefit

Nebari users would have the option (e.g. amazon_web_services.node_groups.custom_ami) to use a customized or optimized EC2 AMI to meet customer requirements for networking, security, and performance compliance.

Anything else?

No response

viniciusdc commented 4 months ago

Hi @joneszc, thanks for opening the issue. That was an amazing catch; I completely agree that there should be an optional setting in the node_group settings in our nebari-config.

I also don't see a compatibility problem, as the majority of users would keep the default AMI options, and this setting, while present, would only be modified by users "looking for it". Appropriate docs will be required, though, to help users avoid silly mistakes like per-zone AMIs, etc.

joneszc commented 4 months ago

Hello @viniciusdc

The option to specify an AMI ID could definitely confuse users who don't want to get into the weeds of EKS.
There is also an added risk of nodes failing to join the cluster if users are required to manually input or tamper with the /etc/eks/bootstrap.sh command. I feel it would be appropriate to hard-code the bootstrap.sh command into the Terraform, with proper logic/directives for triggering it, and spare the user the responsibility of formulating the command and potentially troubleshooting faulty nodes. Down the road, it might be beneficial, as a separate use case, to enable specific, targeted overrides for arguments like --kubelet-extra-args. I've combined testing of the custom AMI option and #2603 at this fork
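As a rough illustration of generating the command rather than asking users to write it, the Python sketch below renders the user data script with an optional --kubelet-extra-args override. It is not the code in the fork; the helper name `render_bootstrap_user_data` and the `set -ex` preamble are assumptions made for the example.

```python
import shlex
from typing import Optional


def render_bootstrap_user_data(
    cluster_name: str, kubelet_extra_args: Optional[str] = None
) -> str:
    """Render a user data script that calls /etc/eks/bootstrap.sh (hypothetical helper)."""
    cmd = ["/etc/eks/bootstrap.sh", cluster_name]
    if kubelet_extra_args:
        # Passed as a single shell-quoted value so embedded spaces survive.
        cmd += ["--kubelet-extra-args", kubelet_extra_args]
    line = " ".join(shlex.quote(part) for part in cmd)
    return "#!/bin/bash\nset -ex\n" + line + "\n"
```

Building the command in one place keeps the quoting consistent, so a user only ever supplies the cluster name and, optionally, the extra kubelet arguments.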

tylergraff commented 3 months ago

@viniciusdc will review this as part of the PR linked above.

tylergraff commented 2 months ago

This feature was introduced in https://github.com/nebari-dev/nebari/pull/2668