Open kevinnowland opened 1 year ago
Maybe one solution is to allow model autoscaling to be switched off, plus the ability for models to be pinned to all server replicas, so that if the server autoscales, all models are loaded onto the new replicas. Scale-down scenarios would be handled the same way. Essentially, this delegates autoscaling to the server's HPA/KEDA and is more akin to Seldon Core V1, except that multi-model setups can also be scaled this way? Newly added models would need to be loaded on all replicas. @sakoush
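To make the proposal concrete, here is a rough sketch of the intended placement semantics. It is a toy model only, not Seldon Core code: the `Server` class and its `scale_to`, `add_model`, and `placement` methods are hypothetical names invented for illustration, and the replica count is assumed to be set by an external autoscaler such as HPA/KEDA.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for an inference server with N replicas."""

    replicas: int = 1
    pinned_models: set[str] = field(default_factory=set)

    def scale_to(self, replicas: int) -> None:
        # Assumption: the replica count is driven by an external
        # autoscaler (e.g. HPA/KEDA), not by per-model autoscaling.
        if replicas < 1:
            raise ValueError("a server needs at least one replica")
        self.replicas = replicas

    def add_model(self, name: str) -> None:
        # A newly added model is pinned to the server as a whole,
        # so it ends up on every replica.
        self.pinned_models.add(name)

    def placement(self) -> dict[int, set[str]]:
        # Every pinned model is loaded on every replica, so scale-up
        # and scale-down need no per-model scheduling decision.
        return {i: set(self.pinned_models) for i in range(self.replicas)}


server = Server()
server.add_model("model-a")
server.add_model("model-b")
server.scale_to(3)          # server scales up: new replicas get all models
after_scale_up = server.placement()
server.add_model("model-c")  # new model: loaded on all replicas
after_new_model = server.placement()
server.scale_to(1)          # server scales down: remaining replica keeps all models
after_scale_down = server.placement()
```

The point of the sketch is that model placement becomes a pure function of the server replica count and the set of pinned models, which is what would let the server autoscaler own the scaling decision.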
This is now addressed in https://github.com/SeldonIO/seldon-core/pull/5935
slack conversation
What is the behavior of Seldon Core v2 in the following scenario?
Related feature requests:
Thanks for your help!