add parallel sampling using vllm

anarchy-ai / LLM-VM

irresponsible innovation. Try now at https://chat.dev/

https://anarchy.ai/

MIT License


add parallel sampling using vllm #409

Open daspartho opened 11 months ago

daspartho commented 11 months ago

close #370

Adds support for parallel sampling with the vLLM library. The vLLM path is used when num_return_sequences in the generation kwargs is greater than 1 and the model is supported by vLLM (currently all HF models in LLM-VM).
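For illustration, the dispatch described above might look roughly like the sketch below. This is not the code from this PR: the helper names (`wants_parallel_sampling`, `to_sampling_kwargs`, `generate_parallel`) and the choice of which kwargs to forward are assumptions made for the example. The only vLLM calls used are `LLM`, `SamplingParams`, and `LLM.generate`, and vLLM is imported lazily so the module still loads when it is not installed.

```python
from typing import Any, Dict, List

# Hypothetical mapping from HF-style generation kwargs to the
# corresponding vLLM SamplingParams argument names.
HF_TO_VLLM = {
    "num_return_sequences": "n",
    "max_new_tokens": "max_tokens",
    "temperature": "temperature",
    "top_p": "top_p",
    "top_k": "top_k",
}


def wants_parallel_sampling(generation_kwargs: Dict[str, Any]) -> bool:
    """True when the caller asked for more than one sequence per prompt."""
    return generation_kwargs.get("num_return_sequences", 1) > 1


def to_sampling_kwargs(generation_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the kwargs we know how to translate, renamed for vLLM."""
    return {
        HF_TO_VLLM[key]: value
        for key, value in generation_kwargs.items()
        if key in HF_TO_VLLM
    }


def generate_parallel(
    model_name: str, prompt: str, generation_kwargs: Dict[str, Any]
) -> List[str]:
    """Sample several completions for one prompt in a single vLLM call."""
    # Imported here so vLLM stays an optional dependency (assumption).
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_name)
    params = SamplingParams(**to_sampling_kwargs(generation_kwargs))
    request_outputs = llm.generate([prompt], params)
    # One RequestOutput per prompt; each holds `n` completions.
    return [completion.text for completion in request_outputs[0].outputs]
```

A caller would check `wants_parallel_sampling(kwargs)` first and fall back to the regular Hugging Face `generate` path otherwise.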

TODO: handle dependencies

daspartho commented 11 months ago

Made the suggested changes. vllm_support is now set to true by default and must be set to false explicitly for unsupported models.
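One way to picture the opt-out flag is a class attribute that defaults to true and is overridden by models vLLM cannot serve. The class names and the `backend_for` method below are hypothetical and only illustrate the default/override behaviour described in this comment; the actual LLM-VM classes may be structured differently.

```python
class BaseHFModel:
    # Default described in the comment: vLLM is assumed to be supported.
    vllm_support = True

    def backend_for(self, **generation_kwargs) -> str:
        """Pick a backend name for this call (hypothetical helper)."""
        parallel = generation_kwargs.get("num_return_sequences", 1) > 1
        return "vllm" if (self.vllm_support and parallel) else "hf"


class UnsupportedModel(BaseHFModel):
    # Models that vLLM cannot serve opt out explicitly.
    vllm_support = False
```

With this layout, `BaseHFModel().backend_for(num_return_sequences=4)` selects the vLLM path, while the same call on `UnsupportedModel` stays on the Hugging Face path.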