NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0
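As a point of reference for the Python API described above, here is a minimal sketch using the high-level `LLM` entry point with tensor parallelism. This assumes a recent TensorRT-LLM release that ships the `LLM`/`SamplingParams` API; the model name is a placeholder:

```python
# Minimal sketch of the high-level TensorRT-LLM Python API
# (assumes a recent release with the `LLM` entry point;
# the model path below is a placeholder).
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-hf",  # HF model name or local path
          tensor_parallel_size=2)            # shard the model across 2 GPUs

prompts = ["Question: which city is this? Answer:"]
outputs = llm.generate(prompts, SamplingParams(max_tokens=64))
for out in outputs:
    print(out.outputs[0].text)
```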

How to use multi-gpu in running llava? #1003

xiaocaoxu opened this issue 10 months ago (status: Open)

xiaocaoxu commented 10 months ago

System Info

GPU: 3090, CUDA: 12.2

Who can help?

@ncomly-nvidia @symphonylyh

Reproduction

Build the LLM engine:

python ../llama/build.py --model_dir /models/llava-llama-2-finetune_full-mmcm-2023-11-01-03-15-20/ --dtype float32 --remove_input_padding --use_gpt_attention_plugin float32 --enable_context_fmha --use_gemm_plugin float32 --output_dir /models/llava_trt/1.0/fp32/2-gpu/ --max_batch_size 1 --world_size 2 --tp_size 2 --max_prompt_embedding_table_size 576

Build the visual engine:

python build_visual_engine.py --model_name llava-v1.5-7b --model_path /models/llava-llama-2-finetune_full-mmcm-2023-11-01-03-15-20

Run:

mpirun -n 2 --allow-run-as-root python run.py --max_new_tokens 512 --input_text "Question: which city is this? Answer:" --hf_model_dir /models/llava-llama-2-finetune_full-mmcm-2023-11-01-03-15-20 --visual_engine_dir visual_engines/llava-v1.5-7b --llm_engine_dir /models/llava_trt/1.0/fp32/2-gpu --decoder_llm
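Before digging into engine problems, it can help to confirm that each MPI rank launched by `mpirun -n 2` actually binds to its own GPU. A minimal sanity-check sketch (not part of TensorRT-LLM; assumes `mpi4py` and `torch` are installed, and the file name `check_ranks.py` is arbitrary):

```python
# check_ranks.py -- hypothetical sanity check, not part of TensorRT-LLM.
# Confirms that each MPI rank launched by mpirun sees its own GPU.
from mpi4py import MPI
import torch

rank = MPI.COMM_WORLD.Get_rank()
torch.cuda.set_device(rank)  # rank 0 -> GPU 0, rank 1 -> GPU 1
print(f"rank {rank} -> {torch.cuda.get_device_name(rank)}")
```

Launch it the same way as run.py, e.g. `mpirun -n 2 --allow-run-as-root python check_ranks.py`.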

Get an error. (The original report attached a screenshot of the error; the image is not reproduced here.)

Expected behavior

LLaVA runs inference across multiple GPUs.

actual behavior

The run fails with the error shown in the screenshot above.

additional notes

None

kaiyux commented 9 months ago

Hi @xiaocaoxu, we pushed an update to the main branch that should contain the fix for this issue. Can you please verify on the latest main branch? Thank you.
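Before retrying, a quick way to confirm which TensorRT-LLM build is actually installed (a minimal sketch; `__version__` is the package's standard version attribute):

```python
# Print the installed TensorRT-LLM version before re-running the
# reproduction steps against the latest main branch.
import tensorrt_llm
print(tensorrt_llm.__version__)
```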

nv-guomingz commented 1 week ago

Hi @xiaocaoxu, do you still have any further issues or questions? If not, we'll close this soon.