Currently, I am using the nvcr.io/nim/meta/llama-3.1-8b-instruct:1.1.2 image to create an InferenceService by following the provided guide, and the service runs properly. Pod creation and API calls work fine, but I am running into an issue when trying to delete the Pod.
It appears that the SIGTERM signal is not reaching the NIM server when I request deletion of the InferenceService or the Pod. The internal logs show no shutdown or KILL signals either, and the Pod is only force-killed once it hits terminationGracePeriodSeconds: 300.
Do I need to pass any additional options when starting the NIM server, or is this a known issue?
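I haven't confirmed what the NIM image's entrypoint does internally, but a common cause of this exact symptom is an entrypoint that runs the server as a child process while something else sits at PID 1: Kubernetes delivers SIGTERM only to PID 1, and if that process never forwards the signal, the pod just waits out the full grace period. A minimal sketch of a forwarding wrapper (the function name and CLI handling here are my own illustration, not part of the NIM image):

```python
import signal
import subprocess
import sys

def run_with_signal_forwarding(cmd):
    # Start the server as a child process. If this wrapper is PID 1 in the
    # container, Kubernetes sends SIGTERM here on pod deletion, and the
    # child never sees it unless we forward it explicitly.
    child = subprocess.Popen(cmd)

    def forward(signum, _frame):
        # Relay the termination signal so the server can shut down cleanly.
        child.send_signal(signum)

    signal.signal(signal.SIGTERM, forward)
    signal.signal(signal.SIGINT, forward)

    # Propagate the server's exit code once it stops.
    return child.wait()

if __name__ == "__main__":
    sys.exit(run_with_signal_forwarding(sys.argv[1:]))
```

Alternatively, if the image's entrypoint is a shell script, launching the server with `exec` makes it replace the shell as PID 1 so no forwarding is needed. You can check what is actually running as PID 1 with `kubectl exec <pod> -- ps -o pid,cmd` before the deletion.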
I'm seeing this same behavior when deploying with KServe. Scale-up works fine and is fast, but when the pod is no longer needed, scale-down takes the full five minutes before the pod is removed.