Closed: raspawar closed this issue 1 day ago
@mattf @dglogo @JashG
It should only error if the base URL is integrate.api.nvidia.com and the API key is missing.
> It should only error if the base URL is integrate.api.nvidia.com and the API key is missing.
By default, base_url is set to integrate.api.nvidia.com, and when the NVIDIA distribution is up it expects an API key. For using NeMo microservices the API key is not necessary, and base_url is updated only after the new post-training model is ready for inference (in that case base_url is set to something like http://nim.test, with requests expected at http://nim.test/v1/completions).
> By default, base_url is set to integrate.api.nvidia.com, and when the NVIDIA distribution is up it expects an API key. For using NeMo microservices the API key is not necessary, and base_url is updated only after the new post-training model is ready for inference (in that case base_url is set to something like http://nim.test, with requests expected at http://nim.test/v1/completions).
is the issue that a single inference provider is being used for both (a) hosted inference w/ integrate.api.nvidia.com as well as (b) local inference w/ a fine-tuned NVIDIA NIM?
if that's the case, what about setting up multiple inference providers?
if that's not the case, will you provide an example distro and use case?
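To illustrate the multiple-providers suggestion: a run.yaml could register one provider for the hosted catalogue and one for a local NIM. This is only a sketch; the provider IDs and field names here are approximate and not verified against the current llama-stack schema.

```yaml
# Sketch: two NVIDIA inference providers side by side (field names approximate).
providers:
  inference:
    - provider_id: nvidia-hosted          # hosted catalogue, key required
      provider_type: remote::nvidia
      config:
        url: https://integrate.api.nvidia.com
        api_key: ${env.NVIDIA_API_KEY}
    - provider_id: nvidia-nim             # self-hosted NIM, no key needed
      provider_type: remote::nvidia
      config:
        url: http://nim.test
```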
This issue has been automatically marked as stale because it has not had activity within 60 days. It will be automatically closed if no further activity occurs within 30 days.
This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant!
System Info
llama-stack NVIDIA distribution
🐛 Describe the bug
The NVIDIA LLM connector expects `NVIDIA_API_KEY`, which is not required for the other NVIDIA adapter and can be ignored. The error is raised from: https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/nvidia/nvidia.py#L78
NVIDIA LLM connector requirements vary by use case:

- `NVIDIA_API_KEY` (hosted API catalogue)
- `NVIDIA_CUSTOMIZER_URL`, no API key needed (NeMo microservices)

Expected behavior

- If `NVIDIA_BASE_URL` exists without `NVIDIA_API_KEY`, assume a non-catalogue model and proceed
- If `NVIDIA_API_KEY` is missing, warn the user and let the API handle auth errors

Which approach would you prefer to implement?