tuanlv14 opened 2 weeks ago
My python code:

```python
import os
from litellm import completion

## set ENV variables
os.environ["OPENAI_API_KEY"] = "My_KEY"  # key is not used for proxy

messages = [{"content": "Hello, how are you?", "role": "user"}]

response = completion(
    model="openai/mistralai/mistral-large",
    messages=messages,
    api_base="https://integrate.api.nvidia.com/v1",
    # custom_llm_provider="openai"  # litellm will use the openai.ChatCompletion to make the request
)
print(response)
```
Response:

```python
ModelResponse(id='chatcmpl-8bea69f9-ccfa-4d59-9912-47d7c579c9a3', choices=[Choices(finish_reason='stop', index=0, message=Message(content=" Hello! I'm just a computer program, so I don't have", role='assistant'), logprobs={'content': None, 'text_offset': [], 'token_logprobs': [0.0, 0.0], 'tokens': [], 'top_logprobs': []})], created=1717000089, model='mistralai/mistral-large', object='chat.completion', system_fingerprint=None, usage=Usage(completion_tokens=16, prompt_tokens=9, total_tokens=25))
```
But when I tried to configure the proxy as below, it did not work. Please help me check and fix the error with the LiteLLM proxy.
What's the error you see with the proxy? @tuanlv14
I was able to make it work using the proxy approach and the `config.yaml` below (note that the entry goes under the `model_list:` key):

```yaml
model_list:
  - model_name: llama-nvidia
    litellm_params:
      model: openai/meta/llama3-70b-instruct
      api_base: https://integrate.api.nvidia.com/v1
      api_key: nvapi-Dxxx
```

But there is another NVIDIA endpoint with a different URL format, and I am not sure how to configure it, see: https://ai.api.nvidia.com/v1/vlm/microsoft/phi-3-vision-128k-instruct
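As a side note, the key does not have to be hard-coded in the config: LiteLLM's proxy config supports reading values from environment variables via the `os.environ/` prefix. A minimal sketch, assuming the key is exported as `NVIDIA_API_KEY` (the variable name is my choice, not from the thread):

```yaml
model_list:
  - model_name: llama-nvidia
    litellm_params:
      model: openai/meta/llama3-70b-instruct
      api_base: https://integrate.api.nvidia.com/v1
      api_key: os.environ/NVIDIA_API_KEY  # resolved from the proxy's environment at startup
```

This keeps the `nvapi-...` secret out of version control while leaving the rest of the config unchanged.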
The Feature
Please add a method & proxy support for the NVIDIA API, which has example code:
Motivation, pitch
The NVIDIA API is still in free trial and offers good speed.