Closed hustshawn closed 4 months ago
The path of the NIM cache is specified by the `NIM_CACHE_PATH` environment variable. The default is `/opt/nim/.cache`, as specified in the official docs: https://docs.nvidia.com/nim/large-language-models/latest/configuration.html. In the helm chart, this environment variable is set from the `nimCache` value, which defaults to `/model-store` in the chart's values.yaml.
So you can change this value to whatever you want within the helm chart, and the NIM will start up and look in that directory for the cache files. As long as you have created the cache in the format the NIM expects (the same format it downloads the files in), it should find and load them. And as long as your filesystem / PV / storage is mounted into the container at that directory, it should work.
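For reference, a minimal values override along those lines might look like the sketch below. The `nimCache` key is taken from the reply above; the persistence keys and the claim name are assumptions for illustration, so verify the exact schema against the values.yaml of the chart version you deploy:

```yaml
# values-override.yaml -- sketch only; check key names against the chart's values.yaml
nimCache: /model-store          # becomes NIM_CACHE_PATH inside the container

# Hypothetical keys: mount the shared (e.g. EFS-backed) PVC at that path
persistence:
  enabled: true
  existingClaim: nim-efs-pvc    # assumed name of your pre-created PVC
```

With this in place, every pod that mounts the same claim at `/model-store` should see the cache written by the first pod.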
Hi team,
I am working on a NIM-on-Amazon-EKS deployment pattern. Ref: https://github.com/awslabs/data-on-eks/issues/560
I tried to deploy the NIM container with the helm chart, using a shared-storage (EFS) volume mounted at `/model-store` to share between pods. I know the first pod needs to download the model files from NGC, but even later, when I launch new pods with the same volume, the NIM pods take a very long time (5+ minutes) to become ready to serve requests.
What I have done?

- Checked the `/model-store` path. The model files are under `/model-store/ngc/hub/models--nim--meta--llama3-8b-instruct/blobs`; from the timestamps, the files inside were written the very first time a NIM pod started and populated it.
- `/opt/nim/.cache/` is empty.

What I expect? I would like to know the real path that the NIM container uses for model caching, so that as soon as a pod starts, the container inside can start very quickly.
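As a quick sanity check on the shared volume (a sketch, not part of the chart or the NIM image), something like the following could run in an init container or debug shell to confirm the cache already holds model blobs in the layout observed above. The function name and the `models--*` glob are my own; adjust them to your model's directory name:

```python
import os
from pathlib import Path


def cache_is_populated(cache_root: str, model_dir_glob: str = "models--*") -> bool:
    """Return True if the cache already holds model blobs.

    Assumes the layout observed in this issue:
    <cache_root>/ngc/hub/models--<org>--<name>/blobs/<files>
    This is a hypothetical helper, not an official NIM check.
    """
    hub = Path(cache_root) / "ngc" / "hub"
    if not hub.is_dir():
        return False
    for model_dir in hub.glob(model_dir_glob):
        blobs = model_dir / "blobs"
        # A populated cache has at least one file in some blobs directory
        if blobs.is_dir() and any(blobs.iterdir()):
            return True
    return False


if __name__ == "__main__":
    # NIM_CACHE_PATH defaults to /opt/nim/.cache per the docs; the chart
    # overrides it to /model-store via the nimCache value.
    root = os.environ.get("NIM_CACHE_PATH", "/model-store")
    print("cache populated:", cache_is_populated(root))
```

If this reports an empty cache on a fresh pod, the volume is likely not mounted at the path `NIM_CACHE_PATH` actually points to.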
More ideally, could someone from NVIDIA explain the process of how the NIM container starts?
Below are some captured logs for reference:
Pod Events
Full NIM Pod Logs