SeldonIO / seldon-core

An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
https://www.seldon.io/tech/products/core/

seldon orchestrator is not able to evenly distribute load between replicas #4748

Open saeid93 opened 1 year ago

saeid93 commented 1 year ago

Following up on the community Slack discussion: load is not evenly distributed among replicas in a gRPC Istio installation. All of the load goes through a single container. The initial guess was a problem with the Istio or Ambassador installation, which turned out not to be the case. Further investigation points to the service orchestrator: removing it with Seldon's no-engine option results in a completely even distribution across the replicas. The bug therefore appears to be in the service orchestrator.
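To put a number on "evenly distributed", a small helper like the one below can be run over a list of which replica served each request. This is only a sketch: the function names and the idea of collecting one pod name per request (for example from access logs) are assumptions for illustration, not something taken from Seldon or from the gist.

```python
from collections import Counter


def replica_shares(served_by):
    """Return the fraction of requests handled by each replica.

    `served_by` is a hypothetical list with one replica/pod name per request.
    """
    counts = Counter(served_by)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {replica: n / total for replica, n in counts.items()}


def imbalance_ratio(served_by, replicas):
    """Largest observed share divided by the ideal share (1 / number of replicas).

    1.0 means a perfectly even split; len(replicas) means one replica took everything.
    """
    shares = replica_shares(served_by)
    ideal = 1 / len(replicas)
    return max(shares.get(r, 0.0) for r in replicas) / ideal
```

With two replicas, a ratio of 2.0 corresponds to the behaviour described above (one container taking all the load), and a ratio close to 1.0 corresponds to the even split seen with the no-engine option.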

ukclivecox commented 1 year ago

Can you provide the example you are testing with?

saeid93 commented 1 year ago

@cliveseldon Sure, find the gist here https://gist.github.com/saeid93/dc52e38387eaea4d58dfeb9a2cb689b3

saeid93 commented 1 year ago

@cliveseldon Just adding an update on this. I wrote a simple test orchestrator, basically an MLServer container that routes traffic to the chained models to mimic the service orchestrator logic of V1. This minimal orchestrator was able to route traffic to multiple replicas of the pods (models) at each step when using Istio sidecars. I therefore think there may be an issue in how the original orchestrator makes its gRPC connections. I'd be happy to do further root cause analysis. Please let me know if you have any specific scenarios that could help narrow down the issue.
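As a rough illustration of why connection handling could matter, here is a toy simulation (plain Python, no gRPC or Istio involved). It assumes, purely as a hypothesis that is not confirmed in this thread, that a client reusing one long-lived connection gets its backend chosen once per connection, while a client that is balanced per request gets a fresh choice each time. The class and function names are made up for the example.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that hands out replicas in rotation."""

    def __init__(self, replicas):
        self._next = cycle(replicas)

    def pick(self):
        return next(self._next)


def per_connection(balancer, n_requests):
    """Backend is picked once when the connection opens; every request reuses it."""
    backend = balancer.pick()
    return Counter(backend for _ in range(n_requests))


def per_request(balancer, n_requests):
    """Backend is picked again for every single request."""
    return Counter(balancer.pick() for _ in range(n_requests))


replicas = ["model-0", "model-1", "model-2"]  # hypothetical pod names
pinned = per_connection(RoundRobinBalancer(replicas), 9)
spread = per_request(RoundRobinBalancer(replicas), 9)
```

In this toy model `pinned` puts all nine requests on one replica, while `spread` gives three requests to each. Whether the V1 orchestrator actually behaves like the pinned case is exactly what would need to be checked in the root cause analysis.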