NVIDIA / nim-deploy

A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deployment.
https://build.nvidia.com/
Apache License 2.0

Unable to deploy NIM via KServe #16

Open Syulin7 opened 1 week ago

Syulin7 commented 1 week ago

When NIM is deployed via KServe, KServe mounts the PVC as read-only. NIM then cannot write to its cache directory on that volume, so the model download fails.

    - mountPath: /mnt/models
      name: kserve-pvc-source
      readOnly: true

error:

ADDITIONAL INFORMATION: Meta Llama 3 Community License, Built with Meta Llama 3.
A copy of the Llama 3 license can be found under /opt/nim/MODEL_LICENSE.
Traceback (most recent call last):
  File "/usr/local/bin/nim-llm-check-cache-env", line 8, in <module>
    sys.exit(check_cache_dir())
  File "/usr/local/lib/python3.10/dist-packages/vllm_nvext/utils/caches.py", line 29, in check_cache_dir
    raise RuntimeError(f"Unable to write to NIM_CACHE_PATH ({cache_path})")
RuntimeError: Unable to write to NIM_CACHE_PATH (/mnt/models/cache)
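
The check that fails here can be approximated as follows. This is a minimal sketch, not the actual `vllm_nvext` code: the function name `check_cache_dir` and the error message come from the traceback above, while the `cache_path` parameter and the probe-file approach are assumptions.

```python
import os
import tempfile


def check_cache_dir(cache_path: str) -> None:
    """Raise RuntimeError if cache_path cannot be created or written to.

    Hypothetical reimplementation; the real NIM check may differ.
    """
    try:
        # On a read-only mount this already fails when the directory
        # does not exist yet.
        os.makedirs(cache_path, exist_ok=True)
        # Creating a probe file exercises the same kind of write that
        # the model download needs; the file is removed on close.
        with tempfile.TemporaryFile(dir=cache_path):
            pass
    except OSError as exc:
        raise RuntimeError(
            f"Unable to write to NIM_CACHE_PATH ({cache_path})"
        ) from exc
```

With `/mnt/models` mounted `readOnly: true`, a call such as `check_cache_dir("/mnt/models/cache")` would be expected to raise the `RuntimeError` shown in the traceback.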

Syulin7 commented 1 week ago

https://github.com/kserve/kserve/issues/3687

According to that issue, the current KServe implementation for PVC model stores expects that a PV containing model files is always mounted into a ServingRuntime as read-only.
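
To confirm that a given pod is affected, the container spec can be inspected for a read-only mount that covers the cache path. The helper below is a hypothetical sketch: `readonly_mount_for` is not part of KServe or NIM, and it assumes the container spec has already been loaded into a Python dict (for example from `kubectl get pod -o json`).

```python
import posixpath
from typing import Optional


def readonly_mount_for(container: dict, path: str) -> Optional[str]:
    """Return the mountPath of the read-only mount that contains path.

    Returns None if no mount contains path, or if the most specific
    mount containing it is writable. Hypothetical helper for illustration.
    """
    target = posixpath.normpath(path)
    best = None
    best_len = -1
    for mount in container.get("volumeMounts", []):
        mount_path = posixpath.normpath(mount["mountPath"])
        prefix = mount_path.rstrip("/") + "/"
        if target == mount_path or target.startswith(prefix):
            # The longest matching mountPath is the one that applies.
            if len(mount_path) > best_len:
                best, best_len = mount, len(mount_path)
    if best is not None and best.get("readOnly", False):
        return best["mountPath"]
    return None


# Volume mount taken from the pod spec quoted in this issue.
container = {
    "volumeMounts": [
        {"mountPath": "/mnt/models", "name": "kserve-pvc-source", "readOnly": True}
    ]
}
print(readonly_mount_for(container, "/mnt/models/cache"))  # /mnt/models
```

A non-`None` result for the value of `NIM_CACHE_PATH` means the cache directory sits on a read-only volume, which matches the failure reported above.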