Open johny-b opened 1 month ago
I'm pretty sure I'm sending `max_tokens`, and I can see `max_tokens` when looking at my prediction in the browser.

When I use exactly the same code for e.g. `meta/llama-2-70b` this does not happen, i.e. I really get the requested number of tokens.