@lazyhope unable to repro this - the Usage class has the functionality to handle .get. Can you give me a script for repro?

Here's how I tested it:
from litellm import completion_cost
from litellm.types.utils import Choices, Message, ModelResponse, Usage

response_object = ModelResponse(
    id="26c0ef045020429d9c5c9b078c01e564",
    choices=[
        Choices(
            finish_reason="stop",
            index=0,
            message=Message(
                content="Hello! I'm Litellm Bot, your helpful assistant. While I can't provide real-time weather updates, I can help you find a reliable weather service or guide you on how to check the weather on your device. Would you like assistance with that?",
                role="assistant",
                tool_calls=None,
                function_call=None,
            ),
        )
    ],
    created=1722124652,
    model="vertex_ai/mistral-large",
    object="chat.completion",
    system_fingerprint=None,
    usage=Usage(prompt_tokens=32, completion_tokens=55, total_tokens=87),
)

model = "mistral-large@2407"
messages = [{"role": "user", "content": "Hey, hows it going???"}]
custom_llm_provider = "vertex_ai"

predictive_cost = completion_cost(
    completion_response=response_object,
    model=model,
    messages=messages,
    custom_llm_provider=custom_llm_provider,
)

assert predictive_cost > 0
It seems I was using an older version of the package, sorry for the false alarm!
@krrishdholakia I am now actually able to reproduce the error using instructor==1.3.5 and the latest litellm with the following code:
import os, asyncio

from instructor import from_litellm, Mode
from litellm import acompletion
from pydantic import BaseModel

class User(BaseModel):
    name: str

client = from_litellm(acompletion, mode=Mode.MD_JSON)

asyncio.run(
    client.chat.completions.create(
        messages=[{"role": "user", "content": "Joe"}],
        response_model=User,
        api_key=os.getenv("GOOGLE_API_KEY"),
        model="gemini/gemini-1.5-pro-exp-0801",
    )
)
It seems to happen during the second retry attempt; I'll try to see whether it's an issue with litellm or instructor.
After some debugging, it seems that once the threads defined here start:
https://github.com/BerriAI/litellm/blob/d0a68ab123a8fb5b3cc1c137f41ee1ae408571cb/litellm/utils.py#L1494-L1499
at some point the value of result.get("usage") inside
https://github.com/BerriAI/litellm/blob/d0a68ab123a8fb5b3cc1c137f41ee1ae408571cb/litellm/litellm_core_utils/litellm_logging.py#L624-L628
changes from Usage to CompletionUsage (potentially mutated by one of the running threads), which causes the error.
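A minimal sketch of that failure mode (hypothetical stand-in classes, not litellm's actual types): a background thread swaps the shared usage object for one without .get while the main thread still expects dict-style access.

import threading

class DictLikeUsage(dict):  # stand-in for litellm's Usage, which supports .get
    pass

class PlainUsage:  # stand-in for openai's CompletionUsage, attribute access only
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

class FakeResponse:
    def __init__(self):
        self.usage = DictLikeUsage(prompt_tokens=32)

resp = FakeResponse()

def success_handler(r):  # simulates a logging thread mutating the shared response
    r.usage = PlainUsage(prompt_tokens=32)

t = threading.Thread(target=success_handler, args=(resp,))
t.start()
t.join()
resp.usage.get("prompt_tokens")  # AttributeError: 'PlainUsage' object has no attribute 'get'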
@krrishdholakia I think the rest may be beyond my knowledge, so could you please take a look at it?
Thanks for the great work @lazyhope - I'll take a look at this now
able to repro
This seems to only happen when using instructor - I wonder if it's modifying some param.
@krrishdholakia Sorry for bothering again, but I found that in the latest version, when serving with FastAPI:
import os

from instructor import from_litellm, Mode
from litellm import acompletion, Usage
from pydantic import BaseModel
from fastapi import FastAPI

class User(BaseModel):
    name: str

app = FastAPI()
aclient = from_litellm(acompletion, mode=Mode.MD_JSON)

@app.get("/")
async def get_res():
    user = await aclient.chat.completions.create(
        messages=[{"role": "user", "content": "Joe"}],
        response_model=User,
        api_key=os.getenv("GOOGLE_API_KEY"),
        model="gemini/gemini-1.5-flash",
    )
    print(user._raw_response.usage)
The printed usage becomes CompletionUsage(completion_tokens=11, prompt_tokens=137, total_tokens=148) again, but if I run the inner function code directly in IPython, the usage type is Usage.
This nondeterministic behaviour seems quite confusing to me, and I suspect it has something to do with the thread management.
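For what it's worth, a defensive sketch for reading the token counts that works for either type, since both Usage and CompletionUsage expose the counts as attributes:

usage = user._raw_response.usage
prompt_tokens = getattr(usage, "prompt_tokens", None)  # works for Usage and CompletionUsage
completion_tokens = getattr(usage, "completion_tokens", None)
print(prompt_tokens, completion_tokens)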
I'm getting the same error when using instructor and trying to get the completion_cost
litellm==1.43.18
instructor==1.3.7
import instructor
from typing import List

from litellm import Router, completion_cost
from pydantic import BaseModel, Field

router = <your_router_here>
instructions = <instruction_content_here>

class RefinedTopics(BaseModel):
    topics: List[str] = Field(description="a list of refined topics")

llm = instructor.from_litellm(router.completion)

response = llm.chat.completions.create(
    model=MODEL_NAME,
    response_model=RefinedTopics,
    max_retries=5,
    messages=[
        {
            "role": "user",
            "content": instructions,
        }
    ],
)

cost = completion_cost(completion_response=response)
ERROR - Something went wrong 'RefinedTopics' object has no attribute 'get'
Edit: (This works)
response, completion = llm.chat.completions.create_with_completion(
    model=MODEL_NAME,
    response_model=RefinedTopics,
    max_retries=5,
    messages=[
        {
            "role": "user",
            "content": instructions,
        }
    ],
)

cost = completion_cost(completion_response=completion)
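create_with_completion works here because instructor's create returns the parsed response_model (a RefinedTopics instance), which completion_cost can't read, while the second return value is the underlying raw completion. An alternative sketch, using the _raw_response attribute shown earlier in this thread:

# Sketch: pass the raw completion that instructor keeps on _raw_response,
# rather than the parsed pydantic response_model.
response = llm.chat.completions.create(
    model=MODEL_NAME,
    response_model=RefinedTopics,
    max_retries=5,
    messages=[{"role": "user", "content": instructions}],
)
cost = completion_cost(completion_response=response._raw_response)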
What happened?
Commit 1553f7fa4844ea4d4117c7a75d165ca2e747b81a introduced some incompatibilities and causes

AttributeError: 'CompletionUsage' object has no attribute 'get'

thrown by https://github.com/BerriAI/litellm/blob/dc8f9e72414ed54f34197fe379810cef71e0847a/litellm/cost_calculator.py#L494