Open tontan2545 opened 1 month ago
Hi @tontan2545. Sorry to hear that you're having this problem in production. Other than an OOM, the only other cause that comes to mind is the server handling an explicit `POST /shutdown` request or `SIGTERM`. Could this be the process being killed by an autoscaler?
Hi @mattt, thanks for the reply. That's quite a sound guess, but I configured my NGINX so that only requests to the `/predictions` path are allowed through to the pod in k8s. As for the `SIGTERM` part, it would be great if you could guide me on how to tell whether the k8s pod is receiving it, since there's no log of that in kube-system or in the pod itself.
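For the `SIGTERM` question: one way to make the signal visible (a minimal sketch, assuming you can run a small wrapper or sidecar alongside the server process — this is not something Cog provides out of the box) is to install a handler that logs loudly before the process exits, so the pod's final log lines show whether Kubernetes sent the signal:

```python
import logging
import signal

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("sigterm-probe")

# Mutable flag so the rest of the process can see the signal arrived.
received = {"sigterm": False}

def handle_sigterm(signum, frame):
    # Runs when Kubernetes sends SIGTERM at the start of pod termination,
    # before the grace period (default 30s) expires and SIGKILL follows.
    received["sigterm"] = True
    log.warning("Received SIGTERM (signal %d): pod is being terminated", signum)

signal.signal(signal.SIGTERM, handle_sigterm)
```

You can also check the container's last termination state from outside the pod, e.g. `kubectl describe pod <name>` or `kubectl get pod <name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'`: exit code 143 generally means the process exited on `SIGTERM`, 137 means it was SIGKILLed, and an OOM kill shows up there as `Reason: OOMKilled`.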
Hi, I've been running a particular model in Kubernetes using Cog. Whenever we have high workloads (4-5 predictions in the queue), the Cog model seems to stop without reporting a reason. We initially thought this was a memory issue; however, upon further investigation we found that we still have plenty of memory left, so that doesn't seem to be the cause. It would be great if you could offer any hypotheses on this issue. Looking forward to following up on them.
Here's an example of the log. Keep in mind that we have multiple replicas running and we are displaying logs from every pod.
Note: there are no `cog.server.runner` exception logs at all, just a plain shutdown by cog http.