Closed paul-gauthier closed 3 months ago
anthropic doesn't expose a tokenizer for claude-3, see: https://github.com/anthropics/anthropic-sdk-python/issues/375#issuecomment-1999982035
we're currently defaulting claude-3 to tiktoken - see here: https://github.com/BerriAI/litellm/blob/6b63b663b9de89139dd28203650f8443c39b6d9e/litellm/utils.py#L1479
I'd recommend having a buffer as it's possible there's a gap in accuracy
@paul-gauthier open to feedback on how we can do this better
Thanks for the reply. This is what I suspected, but wanted to confirm. Nothing much else to be done until Anthropic provides a tokenizer.
how're you planning on dealing with this? Wondering if there's anything we can do to help @paul-gauthier
One thing that might be nice is to provide a call that lets the caller know whether the token counts are accurate or approximate.
Maybe something like this?
accurate = litellm.encode_is_accurate(model)
# True if tokenizer is known to be correct
# False if using a "best effort" approximate tokenizer
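A minimal sketch of how such a check could work, using the function name from the proposal above — the model table here is purely illustrative, not litellm's real mapping:

```python
# Models whose exact tokenizer is available locally.
# Illustrative only -- not litellm's actual table.
KNOWN_EXACT_TOKENIZERS = {
    "gpt-4": "tiktoken/cl100k_base",
    "gpt-3.5-turbo": "tiktoken/cl100k_base",
}

def encode_is_accurate(model: str) -> bool:
    """True if the tokenizer for `model` is known to be exact,
    False when falling back to a best-effort approximation."""
    return model in KNOWN_EXACT_TOKENIZERS

assert encode_is_accurate("gpt-4") is True
# claude-3 currently falls back to tiktoken, so counts are approximate.
assert encode_is_accurate("claude-3-sonnet") is False
```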
What happened?
I am getting user reports that Sonnet will sometimes stop generating tokens with an error indicating a token limit. Aider reports ~3k tokens have been output by the model, but Sonnet's output token limit is 4k. This has been hard to understand.
Aider uses
litellm.encode()
to count how many output tokens have been returned. Any chance it is using a wrong/approximate tokenizer and is therefore undercounting the tokens? See https://github.com/paul-gauthier/aider/issues/705 for an example user report of this issue.
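To see how an approximate tokenizer could produce exactly this failure mode, here is a toy illustration (the undercount rate is made up; real tokenizer discrepancies vary):

```python
OUTPUT_TOKEN_LIMIT = 4096  # Sonnet's ~4k output cap described in the report

# Suppose the approximate tokenizer undercounts by ~25%.
approximate_count = 3000
true_count = int(approximate_count / 0.75)  # 4000 actual tokens

# The client believes there is plenty of headroom...
assert OUTPUT_TOKEN_LIMIT - approximate_count > 1000
# ...while the model is actually about to hit its output limit.
assert OUTPUT_TOKEN_LIMIT - true_count < 100
```

This matches the symptom above: aider reports ~3k tokens while the model stops at its 4k limit.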