xiaocaoxu opened this issue 10 months ago
Hi @xiaocaoxu, we pushed an update to the main branch which should contain the fix for this issue. Can you please verify it on the latest main branch? Thank you.
Hi @xiaocaoxu, do you still have any further issues or questions? If not, we'll close this soon.
System Info
GPU: NVIDIA RTX 3090, CUDA: 12.2
Who can help?
@ncomly-nvidia @symphonylyh
Information
Tasks
Reproduction
Build the LLM engine:

```
python ../llama/build.py --model_dir /models/llava-llama-2-finetune_full-mmcm-2023-11-01-03-15-20/ \
    --dtype float32 --remove_input_padding --use_gpt_attention_plugin float32 \
    --enable_context_fmha --use_gemm_plugin float32 \
    --output_dir /models/llava_trt/1.0/fp32/2-gpu/ --max_batch_size 1 \
    --world_size 2 --tp_size 2 --max_prompt_embedding_table_size 576
```

Build the visual engine:

```
python build_visual_engine.py --model_name llava-v1.5-7b \
    --model_path /models/llava-llama-2-finetune_full-mmcm-2023-11-01-03-15-20
```

Run:

```
mpirun -n 2 --allow-run-as-root python run.py --max_new_tokens 512 \
    --input_text "Question: which city is this? Answer:" \
    --hf_model_dir /models/llava-llama-2-finetune_full-mmcm-2023-11-01-03-15-20 \
    --visual_engine_dir visual_engines/llava-v1.5-7b \
    --llm_engine_dir /models/llava_trt/1.0/fp32/2-gpu --decoder_llm
```
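For context, the value 576 passed to `--max_prompt_embedding_table_size` matches the number of visual tokens LLaVA-1.5 feeds into the LLM. Assuming this finetuned checkpoint keeps the standard CLIP ViT-L/14 vision tower at 336x336 input (the stock LLaVA-1.5 configuration), the count can be derived as:

```python
# Sanity check for --max_prompt_embedding_table_size: LLaVA-1.5's vision
# encoder emits one embedding per image patch.
image_size = 336   # input resolution of the CLIP vision tower (assumed stock LLaVA-1.5)
patch_size = 14    # ViT-L/14 patch size
num_visual_tokens = (image_size // patch_size) ** 2
print(num_visual_tokens)  # 576
```

If the finetuned model uses a different vision encoder or resolution, the table size would need to match its actual patch count instead.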
This fails with an error:
Expected behavior
LLaVA runs on multiple GPUs (tensor parallelism with tp_size=2).
Actual behavior
The run fails with an error.
Additional notes
None