Freffles opened this issue 2 weeks ago
I have encountered the same problem. Can max_tokens be modified? For example, can it be changed to 4096?
I have found where this is configured. The max_tokens value can be adjusted by modifying the following file: ./app/lib/.server/llm/constants.ts
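As a quick local workaround, the change in that file might look like the sketch below. Note this is illustrative: the actual constant name in constants.ts may differ from what is shown here.

```typescript
// ./app/lib/.server/llm/constants.ts (sketch; the real constant name may differ)
// Lower the global output-token limit to Haiku 3's ceiling of 4096.
export const MAX_TOKENS = 4096; // was 8000
```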
You could change the value there, but then it would apply to all Anthropic models. That might work for you if you only use Haiku 3, but for the broader user base a more flexible solution is needed.
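A more flexible approach could be a per-model lookup with a fallback default, roughly like the sketch below. All names here (MODEL_MAX_TOKENS, DEFAULT_MAX_TOKENS, getMaxTokens) are hypothetical, not taken from the Bolt codebase; the per-model limits follow Anthropic's model comparison table linked in the issue.

```typescript
// Hypothetical sketch: per-model output-token limits instead of one global
// MAX_TOKENS constant. None of these identifiers exist in Bolt today.
const DEFAULT_MAX_TOKENS = 8000;

// Claude 3 models cap output at 4096 tokens per Anthropic's docs.
const MODEL_MAX_TOKENS: Record<string, number> = {
  'claude-3-haiku-20240307': 4096,
  'claude-3-sonnet-20240229': 4096,
  'claude-3-opus-20240229': 4096,
};

// Fall back to the current default for models without an explicit entry.
export function getMaxTokens(model: string): number {
  return MODEL_MAX_TOKENS[model] ?? DEFAULT_MAX_TOKENS;
}
```

The call site that builds the Anthropic request would then pass getMaxTokens(modelId) instead of the shared constant, so Haiku 3 gets 4096 while other models keep the existing limit.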
Describe the bug
When attempting to use Anthropic / Haiku (claude-3-haiku-20240307) I get an error:
2024-11-05 15:51:59 bolt-ai-dev-1 | responseBody: '{"type":"error","error":{"type":"invalid_request_error","message":"max_tokens: 8000 > 4096, which is the maximum allowed number of output tokens for claude-3-haiku-20240307"}}'
If I use Haiku via openrouter, it works fine.
It seems that max_tokens for all Anthropic models is assumed to be 8000, but for Haiku 3 it is, in fact, 4096:
https://docs.anthropic.com/en/docs/about-claude/models#model-comparison-table
Link to the Bolt URL that caused the error
N/A
Steps to reproduce
1. Select Anthropic / Haiku 3 as the model
2. Enter a prompt
3. Observe a popup in Bolt
4. Observe the log in Docker Desktop
Expected behavior
Expected the max_tokens value to be correctly set for Haiku 3.
Screen Recording / Screenshot
2024-11-05 15:51:59 bolt-ai-dev-1 | responseBody: '{"type":"error","error":{"type":"invalid_request_error","message":"max_tokens: 8000 > 4096, which is the maximum allowed number of output tokens for claude-3-haiku-20240307"}}'
Platform
Windows 10
Additional context
No response