Freffles opened this issue 2 weeks ago
I have encountered the same problem. Can max_tokens be modified? For example, can it be changed to 4096?
I have found where this is configured. The max_tokens value can be adjusted by modifying the following file: ./app/lib/.server/llm/constants.ts
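As a quick local workaround, the change in that file might look like the sketch below. Note this is illustrative: the actual constant name in constants.ts may differ from what is shown here.

```typescript
// ./app/lib/.server/llm/constants.ts (sketch; the real constant name may differ)
// Lower the global output-token limit to Haiku 3's ceiling of 4096.
export const MAX_TOKENS = 4096; // was 8000
```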
You could change the value there, but then it would apply to all Anthropic models. That might work for you if you only use Haiku 3, but for the broader user base a more flexible solution is needed.
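A more flexible approach could be a per-model lookup with a fallback default, roughly like the sketch below. All names here (MODEL_MAX_TOKENS, DEFAULT_MAX_TOKENS, getMaxTokens) are hypothetical, not taken from the Bolt codebase; the per-model limits follow Anthropic's model comparison table linked in the issue.

```typescript
// Hypothetical sketch: per-model output-token limits instead of one global
// MAX_TOKENS constant. None of these identifiers exist in Bolt today.
const DEFAULT_MAX_TOKENS = 8000;

// Claude 3 models cap output at 4096 tokens per Anthropic's docs.
const MODEL_MAX_TOKENS: Record<string, number> = {
  'claude-3-haiku-20240307': 4096,
  'claude-3-sonnet-20240229': 4096,
  'claude-3-opus-20240229': 4096,
};

// Fall back to the current default for models without an explicit entry.
export function getMaxTokens(model: string): number {
  return MODEL_MAX_TOKENS[model] ?? DEFAULT_MAX_TOKENS;
}
```

The call site that builds the Anthropic request would then pass getMaxTokens(modelId) instead of the shared constant, so Haiku 3 gets 4096 while other models keep the existing limit.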
Describe the bug
When attempting to use Anthropic / Haiku (claude-3-haiku-20240307) I get an error:
2024-11-05 15:51:59 bolt-ai-dev-1 | responseBody: '{"type":"error","error":{"type":"invalid_request_error","message":"max_tokens: 8000 > 4096, which is the maximum allowed number of output tokens for claude-3-haiku-20240307"}}'
If I use Haiku via openrouter, it works fine.
It seems that max_tokens for all Anthropic models is assumed to be 8000, but for Haiku 3 it is, in fact, 4096:
https://docs.anthropic.com/en/docs/about-claude/models#model-comparison-table
Link to the Bolt URL that caused the error
N/A
Steps to reproduce
1. Select Anthropic / Haiku 3 as the model
2. Enter a prompt
3. Observe a popup in Bolt
4. Observe the log in Docker Desktop
Expected behavior
Expected the max_tokens value to be correctly set for Haiku 3.
Screen Recording / Screenshot
2024-11-05 15:51:59 bolt-ai-dev-1 | responseBody: '{"type":"error","error":{"type":"invalid_request_error","message":"max_tokens: 8000 > 4096, which is the maximum allowed number of output tokens for claude-3-haiku-20240307"}}'
Platform
Windows 10
Additional context
No response