replicate/cog-llama-template

LLaMA Cog template

Add vLLM as an inference engine #42

Closed · moinnadeem closed this 11 months ago

moinnadeem commented 11 months ago

This PR adds support for the vLLM inference engine (https://github.com/replicate/vllm-with-loras). Feedback welcome!
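
The PR description doesn't show how the engine is wired in, but for context, here is a minimal sketch of what a Cog predictor backed by vLLM might look like. It uses the standard `cog` (`BasePredictor`, `Input`) and vanilla `vllm` (`LLM`, `SamplingParams`) Python APIs rather than the linked `vllm-with-loras` fork; the model name and sampling defaults are illustrative, not taken from this PR.

```python
# Hypothetical sketch: serving a LLaMA model through vLLM inside a Cog predictor.
# Model path and defaults below are placeholders, not values from this PR.
from cog import BasePredictor, Input
from vllm import LLM, SamplingParams


class Predictor(BasePredictor):
    def setup(self) -> None:
        # Load the weights once at container start; vLLM handles KV-cache
        # management and continuous batching internally.
        self.llm = LLM(model="meta-llama/Llama-2-7b-hf")

    def predict(
        self,
        prompt: str = Input(description="Text prompt for the model"),
        max_tokens: int = Input(description="Maximum tokens to generate", default=128),
        temperature: float = Input(description="Sampling temperature", default=0.7),
    ) -> str:
        params = SamplingParams(max_tokens=max_tokens, temperature=temperature)
        # generate() takes a list of prompts and returns one RequestOutput per
        # prompt; each holds one or more completions with the generated text.
        outputs = self.llm.generate([prompt], params)
        return outputs[0].outputs[0].text
```

The main design win here is that vLLM replaces a hand-rolled generation loop: the engine batches concurrent requests and pages attention KV blocks on its own, so the predictor only maps Cog inputs to `SamplingParams` and returns the decoded text.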