Open ankush13r opened 1 month ago
Can you provide the full command? The requests should be passed as a string if `bsz = 1` and `tokenized_requests=False`. And vLLM's completion endpoint supports both `list[str, ...]` and tokenized `list[list[int], ...]` inputs (or at least it used to).
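To make the accepted input shapes concrete, here is an illustrative sketch of the request bodies vLLM's OpenAI-compatible `/v1/completions` endpoint accepts (the server URL and model name are assumptions for the example, not taken from this issue):

```python
import json

BASE_URL = "http://localhost:8000/v1/completions"  # assumed local deployment

# `prompt` as a single string (the bsz = 1, tokenized_requests=False case)
payload_str = {"model": "my-model", "prompt": "Hello, world", "max_tokens": 8}

# `prompt` as a batch of strings: list[str, ...]
payload_batch = {"model": "my-model", "prompt": ["Hello", "Hi there"], "max_tokens": 8}

# `prompt` as pre-tokenized input: list[list[int], ...] (token ids are made up)
payload_tokens = {"model": "my-model", "prompt": [[101, 2023, 102]], "max_tokens": 8}

# Any of these serializes to a valid JSON request body, e.g.:
body = json.dumps(payload_str)
```

Each payload can then be POSTed to the endpoint with any HTTP client.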
We've encountered an issue when using a local endpoint deployed with Text Generation Inference (TGI) or the vLLM API. When evaluating with the local-completion format, we receive an error indicating that the input cannot be parsed into the required string format. The root cause is that the TGI and vLLM APIs expect the input prompt to be a string, but the messages are currently being passed as a list. You can find the specific line causing the issue here.
To resolve this, we have found that applying the tokenizer's chat template to the messages and converting the result to a string before passing it as the prompt works effectively. We would like to know whether there is already a solution for this issue, or whether we should proceed by implementing this fix and submitting a pull request.
If you have any suggestions for alternative solutions or feedback on our proposed approach, please let us know.
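A minimal sketch of the workaround we have in mind, using the `apply_chat_template` method from `transformers` with `tokenize=False` so it returns a string rather than token ids (the model name below is just an example for illustration, not lighteval's actual code):

```python
from transformers import AutoTokenizer

def messages_to_prompt(tokenizer, messages):
    # Render chat messages into a single prompt string via the model's own
    # chat template; the string form is what a completion endpoint expects.
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Example usage (assumed model; any tokenizer that ships a chat template works):
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
messages = [{"role": "user", "content": "What is the capital of France?"}]
prompt = messages_to_prompt(tokenizer, messages)
# `prompt` is now a plain string and can be sent as the `prompt` field of a
# TGI or vLLM completion request instead of the raw message list.
```

With this in place, the string `prompt` parses fine on both backends in our testing.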