nicolasembleton closed this 10 months ago
Also, I think the default variable constants.py/MAX_CONCURRENCY should be named DEFAULT_MAX_CONCURRENCY, and the global variable DEFAULT_BATCH_SIZE should be named BATCH_SIZE, for consistency and readability.
Thank you for the catch, just pushed a fix.
Regarding renaming DEFAULT_BATCH_SIZE to BATCH_SIZE: the "default" there carries a different meaning. Unlike MAX_CONCURRENCY, the batch size can be specified per request, so the default only applies when a request doesn't specify one. For that reason I'm not sure the proposed naming convention would be the best fit.
See here: https://github.com/runpod-workers/worker-vllm/blob/f324bef8a09ff24629fb107ae712989dab58fd25/src/utils.py#L11-L12
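To illustrate the distinction, here is a minimal sketch, not the code at the link above. The function names, the "batch_size" input key, reading MAX_CONCURRENCY from the environment, and the numeric values are all assumptions made up for this example.

```python
import os

# Hypothetical values for illustration; not taken from constants.py.
DEFAULT_MAX_CONCURRENCY = 100
DEFAULT_BATCH_SIZE = 30


def resolve_max_concurrency(env=None):
    """Worker-wide setting: resolved once, falling back to the default."""
    env = os.environ if env is None else env
    return int(env.get("MAX_CONCURRENCY", DEFAULT_MAX_CONCURRENCY))


def resolve_batch_size(job_input):
    """Per-request setting: each request may carry its own batch size.

    DEFAULT_BATCH_SIZE is used only when the request omits it
    (assumed key name: "batch_size").
    """
    batch_size = job_input.get("batch_size")
    return DEFAULT_BATCH_SIZE if batch_size is None else int(batch_size)
```

With this shape, resolve_batch_size({"batch_size": 8}) uses the request's value, while resolve_batch_size({}) falls back to DEFAULT_BATCH_SIZE, which is why the DEFAULT_ prefix describes a fallback rather than a fixed setting.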
Happy to push a PR if it helps.