Open exa256 opened 4 months ago
When Anthropic Claude is used through LiteLLM, usage and cost are reported:
import instructor
from litellm import completion, completion_cost, cost_per_token
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


model = "claude-3-opus-20240229"
client = instructor.from_litellm(completion)

# create_with_completion returns the parsed model and the raw completion.
# The raw completion is named raw_result so it does not shadow litellm.completion.
resp, raw_result = client.chat.completions.create_with_completion(
    model=model,
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Extract Jason is 25 years old.",
        }
    ],
    response_model=User,
)

assert isinstance(resp, User)
assert resp.name == "Jason"
assert resp.age == 25

# Token counts as reported on the raw completion
usage = raw_result.usage
input_tokens = usage.prompt_tokens
output_tokens = usage.completion_tokens
total_tokens = usage.total_tokens

# Cost in USD, split by input/output and as a single total
input_cost_usd, output_cost_usd = cost_per_token(
    model=model, prompt_tokens=input_tokens, completion_tokens=output_tokens
)
completion_cost_usd = completion_cost(completion_response=raw_result)
structured_output._raw_response.usage works but doesn't take retries into account.
@jxnl maybe we attach cumulative usage data here? It's currently getting lost while processing response. https://github.com/jxnl/instructor/blob/081418d59a397b38a1b66fe58a64ef94f9124a6b/instructor/process_response.py#L97-L100
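To make the suggestion above concrete, here is a minimal sketch of accumulating usage across attempts so that tokens spent on failed retries are not lost. It is not Instructor's implementation: `CumulativeUsage` and `FakeUsage` are hypothetical names introduced for illustration, and the sketch only assumes that each attempt's usage object exposes `prompt_tokens` and `completion_tokens`.

```python
from dataclasses import dataclass


@dataclass
class CumulativeUsage:
    """Running token totals across every attempt, including failed retries."""

    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    def add(self, usage) -> None:
        # `usage` is any object exposing prompt_tokens / completion_tokens
        # (assumption); missing or None values count as zero.
        self.prompt_tokens += getattr(usage, "prompt_tokens", 0) or 0
        self.completion_tokens += getattr(usage, "completion_tokens", 0) or 0


@dataclass
class FakeUsage:
    # Hypothetical stand-in for a provider usage object.
    prompt_tokens: int
    completion_tokens: int


total = CumulativeUsage()
# Pretend the first attempt failed validation and the retry succeeded.
for attempt_usage in [FakeUsage(120, 30), FakeUsage(180, 28)]:
    total.add(attempt_usage)

print(total.prompt_tokens, total.completion_tokens, total.total_tokens)
```

In a retry loop, `add` would be called once per raw response before the response is validated, and the accumulated object attached to the final result.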
Accessing usage or any other completion parameter doesn't work for Iterables. It fails with:
'list' object has no attribute '_raw_response'
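As a caller-side stopgap, the attribute lookup can be guarded so that a list result yields `None` instead of raising. This is only a sketch: `usage_or_none`, `FakeRaw`, and `FakeModel` are hypothetical names, and the sketch makes no claim about whether Instructor attaches `_raw_response` to the individual items of an Iterable result.

```python
from typing import Any, Optional


class FakeRaw:
    # Hypothetical stand-in for a raw completion carrying a usage payload.
    usage = {"prompt_tokens": 10, "completion_tokens": 5}


class FakeModel:
    # Hypothetical stand-in for a parsed result with the raw response attached.
    def __init__(self) -> None:
        self._raw_response = FakeRaw()


def usage_or_none(result: Any) -> Optional[Any]:
    """Return result._raw_response.usage, or None if either attribute is missing."""
    raw = getattr(result, "_raw_response", None)
    return getattr(raw, "usage", None)


single = FakeModel()
many = [object(), object()]  # a plain list has no _raw_response attribute

print(usage_or_none(single))
print(usage_or_none(many))
```

This avoids the exception but does not recover the usage data, so the underlying request still stands.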
Is your feature request related to a problem? Please describe.
Currently, reporting the usage dictionary from the OpenAI API is supported, as described in this document: https://python.useinstructor.com/concepts/usage/?h=token+usage
However, the Claude API patch does not have this functionality, even though usage is included in a successful 200 response from Anthropic's server. See the 200 response at https://docs.anthropic.com/en/api/messages
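For reference, Anthropic's Messages API reports usage as `input_tokens` and `output_tokens`, while the OpenAI-style usage dictionary uses `prompt_tokens`, `completion_tokens`, and `total_tokens`. The sketch below shows one way the two shapes could be reconciled. `normalize_anthropic_usage` is a hypothetical helper, not part of Instructor or the Anthropic SDK, and the token counts are illustrative.

```python
from typing import Dict


def normalize_anthropic_usage(usage: Dict[str, int]) -> Dict[str, int]:
    """Map Anthropic-style usage keys onto OpenAI-style keys."""
    prompt = usage.get("input_tokens", 0)
    completion = usage.get("output_tokens", 0)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        # Anthropic does not report a total, so it is derived here.
        "total_tokens": prompt + completion,
    }


# Illustrative usage payload in the shape returned by the Messages API.
example = {"input_tokens": 2095, "output_tokens": 503}
normalized = normalize_anthropic_usage(example)
print(normalized)
```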
Describe the solution you'd like
Instructor should patch Claude's API and surface the usage dictionary as part of the output in the second tuple like so: