dyastremsky closed this PR 3 weeks ago
It looks like OpenAI's chat completions API does not properly support client-side batching. See here: https://community.openai.com/t/batching-with-chatcompletion-endpoint/137723/2

The legacy completions API does support it, but it's not worth updating the code just to accommodate that one API at this time. I'll close out this PR.
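For context, a minimal sketch of the payload difference between the two endpoints (model names and field values here are illustrative): the legacy `/v1/completions` endpoint accepts a list of prompts in its `prompt` field, so one request can carry a client-side batch, while `/v1/chat/completions` takes a single `messages` conversation with no list-of-prompts equivalent, forcing one request per prompt.

```python
import json

def build_completion_payload(prompts, model="some-completions-model"):
    # Legacy /v1/completions: "prompt" may be a string OR a list of
    # strings, so a client-side batch fits into a single request.
    return {"model": model, "prompt": prompts, "max_tokens": 16}

def build_chat_payload(prompt, model="some-chat-model"):
    # /v1/chat/completions: "messages" describes one conversation;
    # there is no batched-prompts field, so each prompt needs its
    # own request.
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}]}

batch = ["Hello", "How are you?"]

# One request covers the whole batch on the legacy endpoint:
print(json.dumps(build_completion_payload(batch), indent=2))

# The chat endpoint needs one request per prompt:
chat_payloads = [build_chat_payload(p) for p in batch]
print(len(chat_payloads))  # → 2
```

This asymmetry is why the PR's batching approach works for the legacy completions endpoint but not for chat completions.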
Enable client-side batching with the `--batch-size` arg. GenAI-Perf will batch the requests for the OpenAI service kind. Batching is already supported for rankings and embeddings; this PR expands that support to the completions and chat endpoints.
TODO: