Open saeid93 opened 1 year ago
Can you provide the example you are testing with?
@cliveseldon Sure, you can find the gist here: https://gist.github.com/saeid93/dc52e38387eaea4d58dfeb9a2cb689b3
@cliveseldon Just adding an update on this: I made a simple test orchestrator, basically an MLServer container that routes traffic to the chained models to mimic the svc orchestrator logic of V1. This minimal orchestrator was able to route traffic to multiple replicas of the pods (models) at each step when using Istio sidecars. Therefore, I think there might be an issue in how the original orchestrator makes its gRPC connections. I'll be happy to do further root cause analysis of this. Please let me know if you have any specific scenarios that could help narrow down the issue.
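For readers without access to the gist, the chaining logic described above can be sketched roughly as follows. This is a simplified, in-process illustration and not the actual test orchestrator: `ChainOrchestrator`, the pod names, and the lambda "models" are all hypothetical, and the round-robin picker only stands in for the replica selection that the Istio sidecars performed in the real test.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Dict, List

# A "model" here is just a callable; in the real setup it would be a remote pod.
Model = Callable[[float], float]


class ChainOrchestrator:
    """Hypothetical orchestrator: sends each request through every step in order."""

    def __init__(self, steps: List[Dict[str, Model]]) -> None:
        self._steps = steps
        # One round-robin picker per step (assumption: stands in for the sidecar).
        self._pickers = [cycle(sorted(step)) for step in steps]
        # Records how many calls each replica served.
        self.calls: Counter = Counter()

    def predict(self, payload: float) -> float:
        for step, picker in zip(self._steps, self._pickers):
            name = next(picker)
            self.calls[name] += 1
            # The output of one step is the input of the next.
            payload = step[name](payload)
        return payload


# Two chained steps, two replicas each (names are made up for illustration).
steps = [
    {"step1-pod-a": lambda x: x + 1, "step1-pod-b": lambda x: x + 1},
    {"step2-pod-a": lambda x: x * 2, "step2-pod-b": lambda x: x * 2},
]
orchestrator = ChainOrchestrator(steps)
results = [orchestrator.predict(float(i)) for i in range(4)]
```

After the four requests, `orchestrator.calls` shows how many requests each replica served, which is the quantity compared across pods in this issue.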
Following up on the community Slack discussion: load is not evenly distributed among services in a gRPC Istio installation. All of the load goes through a single container. The initial guess was that this was a problem with the Istio or Ambassador installation, which turned out not to be the case. Further investigation shows that the problem lies in the service orchestrator: removing the service orchestrator with the Seldon no-engine option results in a completely even distribution among multiple replicas. Therefore the bug is in the service orchestrator.
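To make "evenly distributed" concrete when comparing runs with and without the service orchestrator, one possible check is sketched below. It assumes you can collect, for each request, the name of the pod that served it (for example from pod logs); the function names, the pod names, the sample data, and the 10% tolerance are all assumptions for illustration, not measurements from this issue.

```python
from collections import Counter
from typing import Dict, Iterable, List


def replica_shares(served_by: Iterable[str], replicas: List[str]) -> Dict[str, float]:
    """Return the fraction of requests each replica served."""
    counts = Counter(served_by)
    total = sum(counts.values())
    return {r: (counts[r] / total if total else 0.0) for r in replicas}


def is_even(shares: Dict[str, float], tolerance: float = 0.1) -> bool:
    """True if every replica is within `tolerance` of an equal share."""
    expected = 1.0 / len(shares)
    return all(abs(share - expected) <= tolerance for share in shares.values())


replicas = ["pod-a", "pod-b", "pod-c"]  # hypothetical pod names

# Made-up sample: every request lands on one pod (the symptom described above).
pinned = ["pod-a"] * 300
# Made-up sample: requests spread equally across pods.
balanced = replicas * 100

pinned_shares = replica_shares(pinned, replicas)
balanced_shares = replica_shares(balanced, replicas)

print(is_even(pinned_shares))
print(is_even(balanced_shares))
```

Running the same check on real per-pod counts from both configurations would give a like-for-like comparison of the two setups.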