`meta/meta-llama-3-70b` ignores `max_tokens`

replicate / replicate-python

Python client for Replicate

https://replicate.com

Apache License 2.0

Open johny-b opened 1 month ago

johny-b commented 1 month ago

I'm pretty sure I'm sending `max_tokens` and:

When I use exactly the same code with e.g. `meta/llama-2-70b`, this does not happen, i.e. I really get the requested number of tokens.
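For anyone trying to reproduce this, here is a minimal sketch of the comparison described above. It is not the original poster's code. The helper names (`build_input`, `compare_models`, `replicate_run`) are hypothetical, and the input keys `prompt` and `max_tokens` are assumed from the issue text rather than checked against either model's schema. The only client call used is `replicate.run(ref, input=...)`.

```python
from typing import Any, Callable, Dict, Iterable

# The two models named in this issue.
MODELS = ("meta/meta-llama-3-70b", "meta/llama-2-70b")


def build_input(prompt: str, max_tokens: int) -> Dict[str, Any]:
    """Build the input dict; the key names are assumed from the issue text."""
    return {"prompt": prompt, "max_tokens": max_tokens}


def compare_models(
    run: Callable[[str, Dict[str, Any]], Iterable[Any]],
    prompt: str,
    max_tokens: int,
) -> Dict[str, str]:
    """Send the same input to each model and collect the joined output text."""
    payload = build_input(prompt, max_tokens)
    results: Dict[str, str] = {}
    for model in MODELS:
        output = run(model, payload)
        # Language models are assumed to yield their output as string chunks.
        results[model] = "".join(str(chunk) for chunk in output)
    return results


def replicate_run(model: str, payload: Dict[str, Any]) -> Iterable[Any]:
    """Thin wrapper around the real client; needs REPLICATE_API_TOKEN to be set."""
    import replicate  # imported lazily so the helpers above work without it

    return replicate.run(model, input=payload)
```

With an API token set, `compare_models(replicate_run, "Write a long story about a dragon.", 10)` returns one output string per model. The sketch does not count tokens, so output length is only a rough proxy for whether `max_tokens` was honored.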