stephen37 opened this issue 1 year ago
Hi @stephen37 @cliveseldon, I'm facing a similar issue: old pods are not getting deleted when new pods come up. Has this been resolved in later versions of Seldon?
Hey,
The solution we found is to always define the number of replicas in the SeldonDeployment; that way the pods are always updated.
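For reference, here is a minimal sketch of what that looks like (the name, image, and replica count are placeholders, not from this thread; this assumes the v1 `machinelearning.seldon.io/v1` API):

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: my-model                 # placeholder name
spec:
  predictors:
    - name: default
      replicas: 2                # set explicitly so rollouts always replace old pods
      graph:
        name: classifier
        type: MODEL
      componentSpecs:
        - spec:
            containers:
              - name: classifier
                image: my-registry/my-model:1.2.3   # placeholder image
```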
We have a very similar issue with the same logs: the controller keeps saying the deployments are the same and then tries to reconcile, but nothing actually happens and the model seems fine. Argo CD, however, reports the deployment as stuck. I don't fully understand what is going on; the deployment looks like all our other deployments, and we have set the number of replicas to a fixed value, but it still happens. Any thoughts on why the operator might think there are duplicate deployments and services?
When I describe the SeldonDeployment I see this:

```
Type    Reason   Age                        From                       Message
----    ------   ----                       ----                       -------
Normal  Updated  3m47s (x1451532 over 20h)  seldon-controller-manager  Updated SeldonDeployment "xxx"
```
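If you want to confirm the controller is looping, a quick check (the commands are a sketch; adjust names and namespace to your setup) is to watch the event count on the resource:

```shell
kubectl describe seldondeployment xxx -n <yourmodelnamespace>

# the x-count on the Updated event keeps growing while the controller re-reconciles
kubectl get events -n <yourmodelnamespace> \
  --field-selector involvedObject.name=xxx --watch
```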
edit: I did find the issue and reported it here: https://github.com/SeldonIO/seldon-core/issues/5435
Describe the bug
When deploying a new SeldonDeployment and expecting the pods to be restarted with the new version, not all pods are created with the new version, and the kube-controller-manager complains with an error.
I haven't checked with Seldon Core v2; this is on Seldon Core v1, and it has been happening since Seldon Core 1.16.0.
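One way to verify the mismatch (an illustration, not taken from the report; the selector assumes the `seldon-deployment-id` label that Seldon Core v1 puts on its pods) is to list the image each pod is actually running:

```shell
kubectl get pods -n <yourmodelnamespace> -l seldon-deployment-id=<seldondepname> \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'
```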
To reproduce
kube-controller-manager complains with the error mentioned above.

Expected behaviour
All pods should be updated with the latest version of the Docker image, which is not the case.
They should all be 21m old, but for some of them the deployment sync has errored.

Environment
- Cloud Provider: EKS
- Kubernetes Cluster Version: 1.24 / 1.25
- Deployed Seldon System: 1.16.0
- Images of your model: [Output of: `kubectl get seldondeployment -n <yourmodelnamespace> <seldondepname> -o yaml | grep image:` where `<yourmodelnamespace>` is the namespace your model is deployed in]
- Logs of your model: [You can get the logs of your model by running `kubectl logs -n <yourmodelnamespace> <seldonpodname> <container>`]
- Logs of seldon-controller-manager:
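For the controller logs, something like the following works, assuming the default install in the `seldon-system` namespace (adjust if your operator lives elsewhere):

```shell
kubectl logs -n seldon-system deployment/seldon-controller-manager --tail=200
```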