aws-samples / comfyui-on-eks

ComfyUI on AWS
MIT No Attribution

How’s the cold start time of EC2 with instance store #9

Open Weixuanf opened 1 month ago

Weixuanf commented 1 month ago

I want to scale down to 0 instances when there are no requests.

How long does it take to cold start EC2 from 0 instances? I assume an EC2 instance that uses instance store is slower to boot than an EBS-backed one?

Also, downloading models from S3 to the instance store takes extra time. What download speed should I expect from S3 to the instance store?

Thanks for this amazing template!

Shellmode commented 1 month ago

A cold start might take anywhere from several minutes to about ten minutes. The steps are:

  1. Karpenter finds provisionable pod(s) and starts to spin up an EC2 instance (negligible time, less than 1s).
  2. The EC2 instance initializes, runs the user-data script defined in the Karpenter custom resource EC2NodeClass (which syncs all models from S3 to the instance store), then starts kubelet and the other components needed to get the node ready. (This may take a few minutes, depending on how much you need to sync from S3.)
  3. The image is pulled from ECR. (This may also take around 5 minutes, depending on how large the image is.)
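
As a rough way to reason about the steps above, the cold start can be treated as node boot time plus S3 sync time plus image pull time, assuming the steps run one after another. The sketch below is only back-of-the-envelope arithmetic. The names `ColdStartInputs` and `estimate_cold_start_s` are hypothetical, and every number in the example is a placeholder, not a measurement from this solution.

```python
from dataclasses import dataclass


@dataclass
class ColdStartInputs:
    """Hypothetical inputs; replace with values measured in your own cluster."""
    node_boot_s: float           # EC2 launch + kubelet ready, excluding model sync
    models_gib: float            # total model size synced from S3 in user-data
    s3_sync_gib_per_s: float     # observed S3 -> instance store throughput
    image_gib: float             # container image size pulled from ECR
    image_pull_gib_per_s: float  # observed pull + unpack throughput


def estimate_cold_start_s(x: ColdStartInputs) -> dict:
    """Return a per-step breakdown in seconds, assuming strictly sequential steps."""
    sync_s = x.models_gib / x.s3_sync_gib_per_s
    pull_s = x.image_gib / x.image_pull_gib_per_s
    return {
        "node_boot_s": x.node_boot_s,
        "model_sync_s": sync_s,
        "image_pull_s": pull_s,
        "total_s": x.node_boot_s + sync_s + pull_s,
    }


# Placeholder numbers for illustration only (not measured).
example = ColdStartInputs(
    node_boot_s=60.0,
    models_gib=50.0,
    s3_sync_gib_per_s=0.5,
    image_gib=10.0,
    image_pull_gib_per_s=0.125,
)
breakdown = estimate_cold_start_s(example)
for step, seconds in breakdown.items():
    print(f"{step}: {seconds:.0f}")
```

Plugging in your own measured sizes and throughputs shows which step dominates, and therefore whether trimming the model set or the image is the better lever.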

Actually all GPU instance types (like g4dn/g5/g6) have instance store, boot time is the same whether you use instance store or not.

The solution uses instance store to improve model loading and switching performance inside ComfyUI.

It's a tradeoff: we spend more time setting up the environment to get better runtime performance.
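
If you want to check the loading-speed side of this tradeoff on your own instance, one option is to time a sequential read of the same model file from each volume. This is a minimal sketch using only the Python standard library. The function name and the example paths in the comments are hypothetical. Note that a second read of the same file is likely to be served from the OS page cache, so only the first read after boot (or after dropping caches) reflects the disk.

```python
import time


def read_throughput_mib_s(path: str, chunk_bytes: int = 8 * 1024 * 1024) -> float:
    """Sequentially read `path` once and return the observed throughput in MiB/s."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return (total / (1024 * 1024)) / max(elapsed, 1e-9)


# Hypothetical usage; both paths are placeholders for wherever your volumes are mounted:
#   read_throughput_mib_s("/mnt/instance-store/models/checkpoint.safetensors")
#   read_throughput_mib_s("/mnt/ebs/models/checkpoint.safetensors")
```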

Weixuanf commented 1 month ago

Thanks very much for your reply. I want to run serverless ComfyUI servers that scale down to 0 when there are no requests to save time, so cold start time is very important for me. I hope to get a cold start time under 5s (excluding the ComfyUI boot time itself). I'm thinking of using EC2 + EBS and stopping/starting the EC2 instance to achieve better cold start times than with an Auto Scaling group. If you have other suggestions, please let me know!

> Actually all GPU instance types (like g4dn/g5/g6) have instance store, boot time is the same whether you use instance store or not.

oh so even if I use EKS, there will still be instance store in it?

Shellmode commented 1 month ago

Yes, g4dn, g5, and g6 all have instance store (see the Amazon EC2 instance store documentation). You can use it or just ignore it; it's free.

EC2 with EBS will boot faster because there's no image pulling or model syncing, but you need to handle scaling EC2 in and out yourself. Also, loading models from EBS into GPU memory might take longer than loading from instance store.
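
If you go the stop/start route, it is worth measuring the start latency before committing to it. Below is a minimal sketch, assuming `ec2` is a boto3 EC2 client (for example `boto3.client("ec2")`). The helper names are hypothetical. The client is passed in so the helpers can be exercised with a stub and without AWS credentials.

```python
import time


def start_and_time(ec2, instance_id: str) -> float:
    """Start a stopped instance and return seconds until EC2 reports it as running.

    `ec2` is assumed to be a boto3 EC2 client. "running" only means the instance
    state changed; the OS and ComfyUI may still be booting at that point.
    """
    started = time.monotonic()
    ec2.start_instances(InstanceIds=[instance_id])
    # The waiter polls DescribeInstances at a fixed interval, so the result is coarse.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    return time.monotonic() - started


def stop_and_wait(ec2, instance_id: str) -> None:
    """Stop an instance and block until EC2 reports it as stopped."""
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
```

Keep in mind that data on instance store volumes does not survive a stop, so with this approach the models have to live on EBS (or be re-synced after every start).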

PeterTF656 commented 1 day ago

Have you thought about mounting EFS to your instances?