BerriAI / litellm

Python SDK, Proxy Server to call 100+ LLM APIs using the OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Bug]: GooseAI #584

Closed · decentropy closed this issue 10 months ago

decentropy commented 11 months ago

What happened?

Not sure I'd call this a bug... but I could not figure out how to connect to GooseAI.

GooseAI is OpenAI-compatible, and I can use it successfully with the openai Python module; however, this attempt gives me an error:

import os
import litellm

litellm.set_verbose=True

## set ENV variables
os.environ["OPENAI_API_KEY"] = "sk-XXXXXXXXXXXXXXXXXXXXXx" #removed

messages = [{ "content": "Hello, how are you?","role": "user"}]

response = litellm.completion(
    model="gpt-j-6b", 
    messages=[{ "content": "Hello, how are you?","role": "user"}],
    api_base="https://api.goose.ai/v1",
    custom_llm_provider="openai",
    temperature=0.2,
    max_tokens=80,
)
print(response)

Relevant log output

InvalidRequestError: invalid_request_error
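
For comparison, here is roughly what the working call looks like with the openai Python module directly (pre-1.0 SDK). The Completion endpoint and the engine-style model name follow GooseAI's own examples, so treat those specifics as assumptions:

import openai

openai.api_key = "sk-XXXXXXXXXXXXXXXXXXXXXx"  # GooseAI key, removed
openai.api_base = "https://api.goose.ai/v1"   # point the SDK at GooseAI

# GooseAI's examples use the classic Completion endpoint with an engine name
# rather than chat completions, so the input is a plain-text prompt.
completion = openai.Completion.create(
    engine="gpt-j-6b",
    prompt="Hello, how are you?",
    temperature=0.2,
    max_tokens=80,
)
print(completion)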

Twitter / LinkedIn details

No response

krrishdholakia commented 11 months ago

@decentropy can you paste the full stack trace? I can't tell if this is litellm or openai throwing the error. If you received an InvalidRequestError then that's probably openai, but I just want to be certain.

decentropy commented 11 months ago

Sure, here you go.

LiteLLM: checking params for gpt-j-6b
LiteLLM: params passed in {'functions': [], 'function_call': '', 'temperature': 0.2, 'top_p': None, 'n': None, 'stream': None, 'stop': None, 'max_tokens': 80, 'presence_penalty': None, 'frequency_penalty': None, 'logit_bias': {}, 'user': '', 'request_timeout': None, 'deployment_id': None, 'model': 'gpt-j-6b', 'custom_llm_provider': 'openai'}
LiteLLM: non-default params passed in {'temperature': 0.2, 'max_tokens': 80}
LiteLLM: self.optional_params: {'temperature': 0.2, 'max_tokens': 80}
LiteLLM: Logging Details Pre-API Call for call id e9da6deb-a58a-42fb-980b-5fbf099b3a27
LiteLLM: model call details: {'model': 'gpt-j-6b', 'messages': [{'content': 'Hello, how are you?', 'role': 'user'}], 'optional_params': {'temperature': 0.2, 'max_tokens': 80}, 'litellm_params': {'return_async': False, 'api_key': None, 'force_timeout': 600, 'logger_fn': None, 'verbose': False, 'custom_llm_provider': 'openai', 'api_base': 'https://api.goose.ai/v1', 'litellm_call_id': 'e9da6deb-a58a-42fb-980b-5fbf099b3a27', 'model_alias_map': {}, 'completion_call_id': None, 'metadata': None, 'stream_response': {}}, 'input': [{'content': 'Hello, how are you?', 'role': 'user'}], 'api_key': 'sk-XXXXXXXXXXXXX', 'additional_args': {'headers': None, 'api_base': 'https://api.goose.ai/v1'}}
LiteLLM: model call details: {'model': 'gpt-j-6b', 'messages': [{'content': 'Hello, how are you?', 'role': 'user'}], 'optional_params': {'temperature': 0.2, 'max_tokens': 80}, 'litellm_params': {'return_async': False, 'api_key': None, 'force_timeout': 600, 'logger_fn': None, 'verbose': False, 'custom_llm_provider': 'openai', 'api_base': 'https://api.goose.ai/v1', 'litellm_call_id': 'e9da6deb-a58a-42fb-980b-5fbf099b3a27', 'model_alias_map': {}, 'completion_call_id': None, 'metadata': None, 'stream_response': {}}, 'input': [{'content': 'Hello, how are you?', 'role': 'user'}], 'api_key': 'sk-XXXXXXXXXXXXX', 'additional_args': {'headers': None}, 'original_response': 'invalid_request_error'}
LiteLLM: Logging Details Post-API Call: logger_fn - None | callable(logger_fn) - False


InvalidRequestError                       Traceback (most recent call last)
Cell In[5], line 9
      6 ## set ENV variables
      7 os.environ["OPENAI_API_KEY"] = "sk-XXXXXXXXXXXXX"
----> 9 response = litellm.completion(
     10     model="gpt-j-6b",
     11     messages=[{ "content": "Hello, how are you?","role": "user"}],
     12     api_base="https://api.goose.ai/v1",
     13     custom_llm_provider="openai",
     14     temperature=0.2,
     15     max_tokens=80,
     16 )
     17 print(response)

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/litellm/utils.py:748, in client.<locals>.wrapper(*args, **kwargs)
    744 if (
    745     liteDebuggerClient and liteDebuggerClient.dashboard_url != None
    746 ):  # make it easy to get to the debugger logs if you've initialized it
    747     e.message += f"\n Check the log in your dashboard - {liteDebuggerClient.dashboard_url}"
--> 748 raise e

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/litellm/utils.py:707, in client.<locals>.wrapper(*args, **kwargs)
    704     return cached_result
    706 # MODEL CALL
--> 707 result = original_function(*args, **kwargs)
    708 end_time = datetime.datetime.now()
    709 if "stream" in kwargs and kwargs["stream"] == True:
    710     # TODO: Add to cache for streaming

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/litellm/timeout.py:53, in timeout.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
     51     local_timeout_duration = kwargs["request_timeout"]
     52 try:
---> 53     result = future.result(timeout=local_timeout_duration)
     54 except futures.TimeoutError:
     55     thread.stop_loop()

File ~/bin/miniconda3/envs/ai/lib/python3.11/concurrent/futures/_base.py:456, in Future.result(self, timeout)
    454     raise CancelledError()
    455 elif self._state == FINISHED:
--> 456     return self.__get_result()
    457 else:
    458     raise TimeoutError()

File ~/bin/miniconda3/envs/ai/lib/python3.11/concurrent/futures/_base.py:401, in Future.__get_result(self)
    399 if self._exception:
    400     try:
--> 401         raise self._exception
    402     finally:
    403         # Break a reference cycle with the exception in self._exception
    404         self = None

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/litellm/timeout.py:42, in timeout.<locals>.decorator.<locals>.wrapper.<locals>.async_func()
     41 async def async_func():
---> 42     return func(*args, **kwargs)

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/litellm/main.py:1180, in completion(model, messages, functions, function_call, temperature, top_p, n, stream, stop, max_tokens, presence_penalty, frequency_penalty, logit_bias, user, deployment_id, request_timeout, api_base, api_version, api_key, **kwargs)
   1177     return response
   1178 except Exception as e:
   1179     ## Map to OpenAI Exception
-> 1180     raise exception_type(
   1181         model=model, custom_llm_provider=custom_llm_provider, original_exception=e, completion_kwargs=args,
   1182     )

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/litellm/utils.py:2933, in exception_type(model, original_exception, custom_llm_provider, completion_kwargs)
   2931 # don't let an error with mapping interrupt the user from receiving an error from the llm api calls
   2932 if exception_mapping_worked:
-> 2933     raise e
   2934 else:
   2935     raise original_exception

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/litellm/utils.py:2352, in exception_type(model, original_exception, custom_llm_provider, completion_kwargs)
   2346 if "This model's maximum context length is" in original_exception._message:
   2347     raise ContextWindowExceededError(
   2348         message=str(original_exception),
   2349         model=model,
   2350         llm_provider=original_exception.llm_provider
   2351     )
-> 2352 raise original_exception
   2353 elif model:
   2354     error_str = str(original_exception)

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/litellm/main.py:441, in completion(model, messages, functions, function_call, temperature, top_p, n, stream, stop, max_tokens, presence_penalty, frequency_penalty, logit_bias, user, deployment_id, request_timeout, api_base, api_version, api_key, **kwargs)
    433 except Exception as e:
    434     ## LOGGING - log the original exception returned
    435     logging.post_call(
    436         input=messages,
    437         api_key=api_key,
    438         original_response=str(e),
    439         additional_args={"headers": litellm.headers},
    440     )
--> 441     raise e
    443 if "stream" in optional_params and optional_params["stream"] == True:
    444     response = CustomStreamWrapper(response, model, custom_llm_provider="openai", logging_obj=logging)

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/litellm/main.py:423, in completion(model, messages, functions, function_call, temperature, top_p, n, stream, stop, max_tokens, presence_penalty, frequency_penalty, logit_bias, user, deployment_id, request_timeout, api_base, api_version, api_key, **kwargs)
    410     response = openai_proxy_chat_completions.completion(
    411         model=model,
    412         messages=messages,
    (...)
    420         logger_fn=logger_fn
    421     )
    422 else:
--> 423     response = openai.ChatCompletion.create(
    424         model=model,
    425         messages=messages,
    426         headers=litellm.headers,  # None by default
    427         api_base=api_base,  # thread safe setting base, key, api_version
    428         api_key=api_key,
    429         api_type="openai",
    430         api_version=api_version,  # default None
    431         **optional_params,
    432     )
    433 except Exception as e:
    434     ## LOGGING - log the original exception returned
    435     logging.post_call(
    436         input=messages,
    437         api_key=api_key,
    438         original_response=str(e),
    439         additional_args={"headers": litellm.headers},
    440     )

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/openai/api_resources/chat_completion.py:25, in ChatCompletion.create(cls, *args, **kwargs)
     23 while True:
     24     try:
---> 25         return super().create(*args, **kwargs)
     26     except TryAgain as e:
     27         if timeout is not None and time.time() > start + timeout:

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py:155, in EngineAPIResource.create(cls, api_key, api_base, api_type, request_id, api_version, organization, **params)
    129 @classmethod
    130 def create(
    131     cls,
    (...)
    138     **params,
    139 ):
    140     (
    141         deployment_id,
    142         engine,
    (...)
    152         api_key, api_base, api_type, api_version, organization, **params
    153     )
--> 155 response, _, api_key = requestor.request(
    156     "post",
    157     url,
    158     params=params,
    159     headers=headers,
    160     stream=stream,
    161     request_id=request_id,
    162     request_timeout=request_timeout,
    163 )
    165 if stream:
    166     # must be an iterator
    167     assert not isinstance(response, OpenAIResponse)

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/openai/api_requestor.py:299, in APIRequestor.request(self, method, url, params, headers, files, stream, request_id, request_timeout)
    278 def request(
    279     self,
    280     method,
    (...)
    287     request_timeout: Optional[Union[float, Tuple[float, float]]] = None,
    288 ) -> Tuple[Union[OpenAIResponse, Iterator[OpenAIResponse]], bool, str]:
    289     result = self.request_raw(
    290         method.lower(),
    291         url,
    (...)
    297         request_timeout=request_timeout,
    298     )
--> 299     resp, got_stream = self._interpret_response(result, stream)
    300     return resp, got_stream, self.api_key

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/openai/api_requestor.py:710, in APIRequestor._interpret_response(self, result, stream)
    702     return (
    703         self._interpret_response_line(
    704             line, result.status_code, result.headers, stream=True
    705         )
    706         for line in parse_stream(result.iter_lines())
    707     ), True
    708 else:
    709     return (
--> 710         self._interpret_response_line(
    711             result.content.decode("utf-8"),
    712             result.status_code,
    713             result.headers,
    714             stream=False,
    715         ),
    716         False,
    717     )

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/openai/api_requestor.py:775, in APIRequestor._interpret_response_line(self, rbody, rcode, rheaders, stream)
    773 stream_error = stream and "error" in resp.data
    774 if stream_error or not 200 <= rcode < 300:
--> 775     raise self.handle_error_response(
    776         rbody, rcode, resp.data, rheaders, stream_error=stream_error
    777     )
    778 return resp

InvalidRequestError: invalid_request_error

krrishdholakia commented 11 months ago

@decentropy that looks like an error being raised by openai

Looking at this:

File ~/bin/miniconda3/envs/ai/lib/python3.11/site-packages/openai/api_requestor.py:775, in APIRequestor._interpret_response_line(self, rbody, rcode, rheaders, stream)
773 stream_error = stream and "error" in resp.data
774 if stream_error or not 200 <= rcode < 300:
--> 775 raise self.handle_error_response(
776 rbody, rcode, resp.data, rheaders, stream_error=stream_error
777 )
778 return resp

Does this work when you use the openai-python sdk?
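
For a minimal repro outside litellm, the same request with the pre-1.0 openai SDK would look roughly like this (it mirrors the openai.ChatCompletion.create call in the trace above, with your key redacted). If this raises the same invalid_request_error, the problem is on the openai/GooseAI side rather than in litellm:

import openai

# Same parameters litellm forwards in the trace above.
response = openai.ChatCompletion.create(
    model="gpt-j-6b",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    api_base="https://api.goose.ai/v1",
    api_key="sk-XXXXXXXXXXXXXXXXXXXXXx",  # redacted
    temperature=0.2,
    max_tokens=80,
)
print(response)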

krrishdholakia commented 11 months ago

closing as this doesn't seem to be a litellm issue. @decentropy please re-open if this works with the openai-python sdk

decentropy commented 11 months ago

GooseAI did work with the Python openai library, but I'm not going to bother reopening... I've moved to together.ai because GooseAI was not as good.

ishaan-jaff commented 11 months ago

@decentropy do you use LiteLLM for calling Together AI?
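
If you want to route Together AI through LiteLLM, a minimal sketch along the lines of the Together AI provider docs looks like this (the together_ai/ model prefix, the exact model string, and the TOGETHERAI_API_KEY variable name are taken from those docs, so double-check them there):

import os
import litellm

os.environ["TOGETHERAI_API_KEY"] = "..."  # your Together AI key

response = litellm.completion(
    model="together_ai/togethercomputer/llama-2-70b-chat",  # provider-prefixed model name
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    temperature=0.2,
    max_tokens=80,
)
print(response)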

ishaan-jaff commented 11 months ago

Also check out DeepInfra: https://docs.litellm.ai/docs/providers/deepinfra
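
A rough sketch of what that looks like, per the linked DeepInfra provider page (the specific model string is just an example, and the DEEPINFRA_API_KEY variable name should be confirmed against the docs):

import os
import litellm

os.environ["DEEPINFRA_API_KEY"] = "..."  # your DeepInfra key

response = litellm.completion(
    model="deepinfra/meta-llama/Llama-2-70b-chat-hf",  # example model name
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    max_tokens=80,
)
print(response)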

krrishdholakia commented 11 months ago

@decentropy got it - I'll look into what might've caused this.

ishaan-jaff commented 10 months ago

Closing due to inactivity.