aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

Precache images using snapshots #4725

Open runningman84 opened 10 months ago

runningman84 commented 10 months ago

Description

What problem are you trying to solve? For some ML use cases we are dealing with quite large Docker images. It would be great to put the images on an extra volume and have Karpenter attach a volume created from a snapshot that already contains all of these images. This would greatly reduce data transfer costs.

How important is this feature to you?

tzneal commented 10 months ago

It would take some work, but you should be able to do this now.

1) Create a provisioner that launches a node with an extra volume using block device mappings
2) Use custom user data to mount that volume to where container images are stored
3) Pull any desired images to the node
4) Create a snapshot of the volume
5) Update the block device mapping to specify the snapshotID
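As a rough sketch of steps 1, 2 and 5, the Python below builds the two pieces of configuration involved: a block device mapping for the extra volume (optionally restored from a snapshot) and a user data script that mounts it. The field names (`deviceName`, `ebs.volumeSize`, `ebs.volumeType`, `ebs.snapshotID`) follow Karpenter's `blockDeviceMappings` schema as I understand it; the device name `/dev/xvdb`, the mount point `/var/lib/containerd`, the volume size and the snapshot ID are all assumptions for illustration, not values from this thread. The device may show up under a different name (for example an NVMe device) depending on the instance type and AMI, and the mount has to happen before the container runtime starts.

```python
import json


def block_device_mappings(snapshot_id=None, device_name="/dev/xvdb", size_gi=200):
    """Return a blockDeviceMappings list with one extra EBS volume for images.

    Without snapshot_id the volume starts empty (step 1); with it, the volume
    is restored from the pre-populated snapshot (step 5).
    """
    ebs = {
        "volumeSize": f"{size_gi}Gi",  # assumed size, adjust to your images
        "volumeType": "gp3",
        "deleteOnTermination": True,
    }
    if snapshot_id is not None:
        ebs["snapshotID"] = snapshot_id
    return [{"deviceName": device_name, "ebs": ebs}]


def mount_user_data(device="/dev/xvdb", mount_point="/var/lib/containerd"):
    """Return a shell script that mounts the extra volume (step 2).

    The device path and mount point are assumptions; check what your AMI uses.
    """
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",
        "# Format only if the volume has no filesystem yet (empty volume, step 1).",
        f"blkid {device} || mkfs -t xfs {device}",
        f"mkdir -p {mount_point}",
        f"mount {device} {mount_point}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Step 1: empty extra volume.
    print(json.dumps(block_device_mappings(), indent=2))
    # Step 5: same volume, restored from a (placeholder) snapshot ID.
    print(json.dumps(block_device_mappings("snap-0123456789abcdef0"), indent=2))
    # Step 2: user data that mounts the volume.
    print(mount_user_data())
```

The printed JSON would go under `blockDeviceMappings` in the node template and the script into its user data. Setting `blockDeviceMappings` may replace the default mappings, so you would likely need to list the root volume there as well.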

FernandoMiguel commented 10 months ago

I have a similar discussion topic for Bottlerocket: https://github.com/bottlerocket-os/bottlerocket/discussions/3477

my concern with this approach is that there will be container data, not only the container image in the snapshot

tzneal commented 10 months ago

> my concern with this approach is that there will be container data, not only the container image in the snapshot

You could do some manual cleanup between steps 3 and 4, but you would need to detach the instance from Karpenter and drain it first. There's a draft upstream KEP for splitting the image filesystem into read-only and read-write parts, which would make this easier: https://github.com/kubernetes/enhancements/pull/4198

FernandoMiguel commented 10 months ago

Thanks for that link, @tzneal. I'll subscribe to it.