LLaVA-VL / LLaVA-NeXT

Apache License 2.0

[SGLang] Serving llava-onevision-qwen2-72b-ov #164

Closed: uahic closed this issue 2 months ago

uahic commented 2 months ago

Hi there,

first a quick disclaimer: I have never used SGLang before, but I want to serve the model on our cluster for research purposes now.

Executing

python -m sglang.launch_server --model-path lmms-lab/llava-onevision-qwen2-72b-ov  --port 30000

results in:

OSError: lmms-lab/llava-onevision-qwen2-72b-ov does not appear to have a file named preprocessor_config.json. 
Checkout 'https://huggingface.co/lmms-lab/llava-onevision-qwen2-72b-ov/tree/main' for available files.

so maybe you don't offer this option at the moment.

Also tried

python -m sglang.launch_server_llavavid --model-path lmms-lab/llava-onevision-qwen2-72b-ov \
    --tokenizer-path lmms-lab/llavanext-qwen-tokenizer --port 30000

which results in ValueError: Model architectures ['LlavaVidForCausalLM'] are not supported for now.

Can I even use the generic launch scripts from SGLang, or do I have to run controller.py and sglang_worker.py from the llava_next repo? If so, what would the 'sgl-endpoint' argument be there? I'm a little lost for now :D
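For anyone with the same question: once launch_server is up, the 'sgl-endpoint' is just the server's base URL (here http://localhost:30000, from the command above). A minimal stdlib-only sketch of calling the server's /generate HTTP route; the field names ("text", "sampling_params") follow SGLang's native API, but the URL, port, and sampling values are assumptions based on this thread, so adjust them to your deployment:

```python
import json
from urllib import request

# Base URL of the server started by `python -m sglang.launch_server ... --port 30000`.
# This is the value you would pass wherever an 'sgl-endpoint' is expected (assumption).
ENDPOINT = "http://localhost:30000"

def build_payload(prompt: str, max_new_tokens: int = 64) -> dict:
    """Build a request body for SGLang's /generate route (field names assumed)."""
    return {
        "text": prompt,
        "sampling_params": {"max_new_tokens": max_new_tokens, "temperature": 0.0},
    }

def generate(prompt: str) -> str:
    """POST the prompt to a running SGLang server and return the generated text."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    req = request.Request(
        f"{ENDPOINT}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # requires the server to be up
        return json.loads(resp.read())["text"]

# Usage (only with the server actually running):
#   print(generate("Describe the model in one sentence."))
```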

uahic commented 2 months ago

Nvm. You updated the README with some pointers to the usage ... dang.

wade0604 commented 2 weeks ago

+1