haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0
17.98k stars 1.95k forks

Production deployment #1107

Open ArunAniyan opened 4 months ago

ArunAniyan commented 4 months ago

Question

Hi, what is the best infrastructure and methodology for deploying LLaVA in a production-grade application? Is a local application server like Ollama advisable? Are there other options? Apart from Ollama, llama.cpp comes to mind. I have not tried triton-llm.
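
For context, here is a minimal sketch of what calling LLaVA through a local Ollama server might look like. It assumes Ollama is running on its default address (`http://localhost:11434`) and that a model tagged `llava` has already been pulled; the helper names `build_payload` and `describe_image` are hypothetical and only for illustration. It is not a production setup (no batching, retries, or auth), just the basic request shape.

```python
import base64
import json
import urllib.request

# Assumption: Ollama's default local address; change this for your deployment.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, image_bytes: bytes, model: str = "llava") -> dict:
    """Build the JSON body for one non-streaming image + text request.

    Hypothetical helper. Images are sent as base64-encoded strings.
    """
    return {
        "model": model,  # assumption: the model tag pulled locally
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON reply instead of chunks
    }


def describe_image(prompt: str, image_bytes: bytes,
                   url: str = OLLAMA_URL, timeout: float = 60.0) -> str:
    """POST the payload to the server and return the generated text."""
    body = json.dumps(build_payload(prompt, image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With the server up, something like `describe_image("What is in this picture?", open("photo.jpg", "rb").read())` should return the model's answer as a string (`photo.jpg` is a placeholder path). Whether this is adequate for production depends on throughput and concurrency needs, which a single local server may not cover.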

nivibilla commented 4 months ago

SGLang supports LLaVA: https://github.com/sgl-project/sglang