sgl-project / sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
Apache License 2.0

Unable to load 72b llava qwen on 8*A100 40GB #507

Open jeffhernandez1995 opened 1 month ago

jeffhernandez1995 commented 1 month ago

Using the command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m sglang.launch_server --model-path lmms-lab/llava-next-72b --tokenizer-path lmms-lab/llavanext-qwen-tokenizer --port=8000 --host="0.0.0.0" --tp-size=4

Results in error:

torch.distributed.DistStoreError: Timed out after 601 seconds waiting for clients. 1/4 clients joined.
Initialization failed. detoken_init_state: init ok
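
The command above exposes only four of the eight GPUs named in the title. The back-of-the-envelope sketch below estimates how much memory the weights alone would take per GPU at each tensor-parallel size. It is not a diagnosis of the timeout. The parameter count (72e9, read off the "72b" in the model name) and the 2 bytes per parameter (fp16/bf16) are assumptions, the helper name is hypothetical, and the vision tower, KV cache, and activations are ignored.

```python
def weights_per_gpu_gb(num_params: float, bytes_per_param: int, tp_size: int) -> float:
    """Approximate weight memory per GPU in decimal GB, assuming even sharding."""
    return num_params * bytes_per_param / tp_size / 1e9


NUM_PARAMS = 72e9     # assumption: taken from "72b" in the model name
BYTES_PER_PARAM = 2   # assumption: fp16/bf16 weights
GPU_MEMORY_GB = 40    # A100 40GB, as stated in the issue title

for tp_size in (4, 8):
    per_gpu = weights_per_gpu_gb(NUM_PARAMS, BYTES_PER_PARAM, tp_size)
    headroom = GPU_MEMORY_GB - per_gpu
    print(f"tp_size={tp_size}: ~{per_gpu:.0f} GB of weights per GPU, ~{headroom:.0f} GB left")
```

Under these assumptions, --tp-size=4 leaves roughly 4 GB per GPU for everything other than weights, while spreading across all eight GPUs would leave roughly 22 GB. Whether that relates to the timeout reported here is untested.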

ad1192214879 commented 3 weeks ago

Has this been solved? (translated from the original Chinese: 解决了吗)

jeffhernandez1995 commented 3 weeks ago

Sorry, I don't speak Chinese. Are you telling me to fix it, or asking whether I managed to solve the issue?

StrangeTcy commented 6 days ago

Sorry, I don't speak Chinese. Are you telling me to fix it, or asking whether I managed to solve the issue?

https://translate.google.com/?sl=auto&tl=en&text=%E8%A7%A3%E5%86%B3%E4%BA%86%E5%90%97&op=translate