anarchy-ai / LLM-VM

irresponsible innovation. Try now at https://chat.dev/
https://anarchy.ai/
MIT License

add parallel sampling using vllm #409

Open daspartho opened 7 months ago

daspartho commented 7 months ago

Closes #370.

Adds support for parallel sampling using the vLLM library. The vLLM path is used when num_return_sequences in the generation kwargs is greater than 1 and the model is supported by vLLM (currently all HF models in LLM-VM).

TODO: handle dependencies
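
For readers unfamiliar with vLLM, here is a minimal sketch of what requesting several samples for one prompt can look like. It is not the code from this PR. The helper names (to_vllm_sampling_kwargs, sample_parallel) and the choice of which Hugging Face-style kwargs to translate are assumptions made for illustration; only LLM, SamplingParams, and LLM.generate come from vLLM itself. The import is done lazily so the module still loads when vLLM is not installed, which is one possible way to approach the dependency TODO.

```python
from typing import Any, Dict, List

# Hugging Face-style generation kwarg -> vLLM SamplingParams field.
# This mapping is an illustrative assumption, not taken from the PR.
_HF_TO_VLLM = {
    "num_return_sequences": "n",
    "max_new_tokens": "max_tokens",
    "temperature": "temperature",
    "top_p": "top_p",
    "top_k": "top_k",
}


def to_vllm_sampling_kwargs(generation_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Rename the kwargs listed in _HF_TO_VLLM and drop everything else."""
    return {
        _HF_TO_VLLM[key]: value
        for key, value in generation_kwargs.items()
        if key in _HF_TO_VLLM
    }


def sample_parallel(
    model_name: str, prompt: str, generation_kwargs: Dict[str, Any]
) -> List[str]:
    """Return one completion string per requested sample (hypothetical helper)."""
    # Lazy import: vLLM is only required when this path is actually taken.
    from vllm import LLM, SamplingParams

    params = SamplingParams(**to_vllm_sampling_kwargs(generation_kwargs))
    llm = LLM(model=model_name)
    # generate() returns one RequestOutput per prompt; its .outputs list
    # holds the n sampled completions for that prompt.
    request_output = llm.generate([prompt], params)[0]
    return [completion.text for completion in request_output.outputs]
```

A call such as sample_parallel("facebook/opt-125m", "Hello", {"num_return_sequences": 4, "max_new_tokens": 16}) would then return a list of four strings, assuming vLLM is installed and can load that model.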

daspartho commented 7 months ago

Made the suggested changes. vllm_support is set to true by default and must be explicitly set to false for unsupported models.
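
The condition described in this thread can be sketched as a small predicate. This is an assumption about how the check might be written, not the PR's actual code: the function name should_use_vllm is hypothetical, and only vllm_support and num_return_sequences come from the comments above.

```python
from typing import Any, Dict


def should_use_vllm(
    generation_kwargs: Dict[str, Any], vllm_support: bool = True
) -> bool:
    """Decide whether a request takes the vLLM parallel-sampling path.

    vllm_support defaults to True, so callers must pass False explicitly
    for models that vLLM cannot serve.
    """
    # A missing num_return_sequences is treated as a single sample.
    num_sequences = generation_kwargs.get("num_return_sequences", 1)
    return bool(vllm_support) and num_sequences > 1
```

With an opt-out default, a new HF model gets parallel sampling without extra configuration, at the cost that an unsupported model fails unless the flag is turned off for it.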