
[Bug]: Ollama provider missing api_key #5832

Closed: abhishek-singhal closed this issue 1 month ago

abhishek-singhal commented 1 month ago

What happened?

The ollama provider doesn't support the api_key param. My self-hosted Ollama instance sits behind a gateway that requires the header Authorization: Bearer {api_key}.

Since the ollama_chat provider already supports api_key, is it possible to support it in the ollama provider as well?
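
For illustration, a call along these lines currently fails with a 401 because the api_key argument is ignored by the ollama provider (the model name, gateway URL, and token below are placeholders):

import litellm

# Placeholder values; with the "ollama" provider the api_key below is never
# attached as an Authorization header, so the gateway rejects the request.
response = litellm.completion(
    model="ollama/llama3",
    messages=[{"role": "user", "content": "Hello"}],
    api_base="https://my-ollama-gateway.example.com",
    api_key="my-gateway-token",
)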

Relevant code from litellm/main.py:

elif custom_llm_provider == "ollama":
            api_base = (
                litellm.api_base
                or api_base
                or get_secret("OLLAMA_API_BASE")
                or "http://localhost:11434"
            )
            custom_prompt_dict = custom_prompt_dict or litellm.custom_prompt_dict
            if model in custom_prompt_dict:
                # check if the model has a registered custom prompt
                model_prompt_details = custom_prompt_dict[model]
                prompt = custom_prompt(
                    role_dict=model_prompt_details["roles"],
                    initial_prompt_value=model_prompt_details["initial_prompt_value"],
                    final_prompt_value=model_prompt_details["final_prompt_value"],
                    messages=messages,
                )
            else:
                prompt = prompt_factory(
                    model=model,
                    messages=messages,
                    custom_llm_provider=custom_llm_provider,
                )
                if isinstance(prompt, dict):
                    # for multimode models - ollama/llava prompt_factory returns a dict {
                    #     "prompt": prompt,
                    #     "images": images
                    # }
                    prompt, images = prompt["prompt"], prompt["images"]
                    optional_params["images"] = images

            ## LOGGING
            generator = ollama.get_ollama_response(
                api_base=api_base,
                model=model,
                prompt=prompt,
                optional_params=optional_params,
                logging_obj=logging,
                acompletion=acompletion,
                model_response=model_response,
                encoding=encoding,
            )
            if acompletion is True or optional_params.get("stream", False) == True:
                return generator

            response = generator
elif custom_llm_provider == "ollama_chat":
            api_base = (
                litellm.api_base
                or api_base
                or get_secret("OLLAMA_API_BASE")
                or "http://localhost:11434"
            )

            api_key = (
                api_key
                or litellm.ollama_key
                or os.environ.get("OLLAMA_API_KEY")
                or litellm.api_key
            )
            ## LOGGING
            generator = ollama_chat.get_ollama_response(
                api_base=api_base,
                api_key=api_key,
                model=model,
                messages=messages,
                optional_params=optional_params,
                logging_obj=logging,
                acompletion=acompletion,
                model_response=model_response,
                encoding=encoding,
            )
            if acompletion is True or optional_params.get("stream", False) is True:
                return generator

            response = generator
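
The ollama_chat branch resolves api_key and passes it through, while the ollama branch does not. A rough sketch of what the ollama branch could look like if it mirrored that handling (this assumes ollama.get_ollama_response is also extended to accept and forward api_key, which it currently is not):

        elif custom_llm_provider == "ollama":
            api_base = (
                litellm.api_base
                or api_base
                or get_secret("OLLAMA_API_BASE")
                or "http://localhost:11434"
            )
            # Sketch only: mirror the resolution order used by ollama_chat
            api_key = (
                api_key
                or litellm.ollama_key
                or os.environ.get("OLLAMA_API_KEY")
                or litellm.api_key
            )
            # ... prompt handling unchanged ...
            generator = ollama.get_ollama_response(
                api_base=api_base,
                api_key=api_key,  # would require ollama.get_ollama_response to accept this
                model=model,
                prompt=prompt,
                optional_params=optional_params,
                logging_obj=logging,
                acompletion=acompletion,
                model_response=model_response,
                encoding=encoding,
            )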

Relevant log output

11:19:55 - LiteLLM Proxy:ERROR: proxy_server.py:2598 - litellm.proxy.proxy_server.async_data_generator(): Exception occured - b'<html>\r\n<head><title>401 Authorization Required</title></head>\r\n<body>\r\n<center><h1>401 Authorization Required</h1></center>\r\n<hr><center>nginx/1.18.0 (Ubuntu)</center>\r\n</body>\r\n</html>\r\n'
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/litellm/proxy/proxy_server.py", line 2577, in async_data_generator
    async for chunk in response:
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/ollama.py", line 426, in ollama_async_streaming
    raise e  # don't use verbose_logger.exception, if exception is raised
    ^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/llms/ollama.py", line 381, in ollama_async_streaming
    raise OllamaError(
litellm.llms.ollama.OllamaError: b'<html>\r\n<head><title>401 Authorization Required</title></head>\r\n<body>\r\n<center><h1>401 Authorization Required</h1></center>\r\n<hr><center>nginx/1.18.0 (Ubuntu)</center>\r\n</body>\r\n</html>\r\n'
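
The 401 above comes from the nginx gateway rather than from Ollama itself: litellm never attaches the Authorization header, so the request is rejected before it reaches Ollama. A minimal check directly against the gateway (placeholder URL and token, using Ollama's native /api/generate endpoint) only succeeds when the header is sent:

import httpx

# Placeholder gateway URL and token.
resp = httpx.post(
    "https://my-ollama-gateway.example.com/api/generate",
    json={"model": "llama3", "prompt": "Hello", "stream": False},
    headers={"Authorization": "Bearer my-gateway-token"},
)
print(resp.status_code)  # 401 without the header, 200 with it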


abhishek-singhal commented 1 month ago

I was able to use ollama_chat instead when adding the model through the REST API. Also, I saw that using ollama_chat is actually recommended in your documentation: https://docs.litellm.ai/docs/providers/ollama
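
For anyone hitting the same issue, the SDK equivalent of this workaround looks roughly like this (model name, gateway URL, and token are placeholders); ollama_chat accepts api_key, as shown in the branch quoted above:

import litellm

# ollama_chat resolves and forwards api_key, so the gateway accepts the request.
response = litellm.completion(
    model="ollama_chat/llama3",
    messages=[{"role": "user", "content": "Hello"}],
    api_base="https://my-ollama-gateway.example.com",
    api_key="my-gateway-token",
)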