awslabs / multi-model-server

Multi Model Server is a tool for serving neural net models for inference
Apache License 2.0
994 stars, 231 forks

How to achieve autoscaling when running MMS on Fargate? #970

Open sunilkumarmohanty opened 3 years ago

sunilkumarmohanty commented 3 years ago

Hi,

I would like to autoscale my model workers based on the number of requests they receive, but I cannot find any documentation on this. Could somebody please help me configure autoscaling?

I am running MMS on Fargate, with task-level autoscaling enabled based on CPU. However, I do not know how to manage the scaling of model workers inside a task.

Br, Sunil

sandruskyi commented 2 years ago

I'm facing the same issue!