Open · staeiou opened this issue 1 week ago
cc @gcalmettes since you made this change.
@staeiou Please make sure your vLLM version and the docs version match. The latest code should support `max_completion_tokens`, but if you're installing vLLM from PyPI, you should refer to the stable version of the docs.
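In other words, only the parameter name in the request body differs between the two: the stable PyPI release expects `max_tokens`, while the latest code should also accept `max_completion_tokens`. A minimal sketch of the same chat request body under each name (model name and prompt are purely illustrative):

```json
{"model": "example-model", "messages": [{"role": "user", "content": "Hello!"}], "max_tokens": 100}
{"model": "example-model", "messages": [{"role": "user", "content": "Hello!"}], "max_completion_tokens": 100}
```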
📚 The doc issue
In `examples/offline_inference_openai.md`, the linked `examples/openai_example_batch.jsonl` uses `max_completion_tokens` instead of `max_tokens`, which causes an error when the example is run.
Suggest a potential alternative/fix
PR incoming
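Concretely, the fix would just swap the field name back to `max_tokens`, so each line of the JSONL would look roughly like this (model name and prompt are illustrative placeholders, not the exact contents of the example file):

```json
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "example-model", "messages": [{"role": "user", "content": "Hello world!"}], "max_tokens": 1000}}
```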