NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

An error occurred in MPI_Init_thread when running sqlcoder #995

Open 2496289471 opened 5 months ago

2496289471 commented 5 months ago

System Info

NVIDIA A100 80G, CentOS 7, x86_64

Who can help?

@ncomly-nvidia @kaiyux @juney-nvidia

Reproduction

python hf_gpt_convert.py --model starcoder -i ./sqlcoder -o ./c-model/sqlcoder --tensor-parallelism 1 --storage-type float16

python3 build.py \
    --model_dir ./c-model/sqlcoder/1-gpu \
    --remove_input_padding \
    --use_gpt_attention_plugin \
    --enable_context_fmha \
    --use_gemm_plugin \
    --parallel_build \
    --output_dir sqlcoder_outputs_tp1

python ../run.py --engine_dir sqlcoder_outputs_tp1 --tokenizer_dir ./sqlcoder --input_text "input text" --max_output_len 200 --no_add_special_tokens

Expected behavior

The model generates SQL output.

Actual behavior

[Screenshot 1706490601024 (image not preserved): an error occurred in MPI_Init_thread]

Additional notes

Does TensorRT-LLM support the sqlcoder series of models? vLLM can run sqlcoder directly as a StarCoder model. I'm wondering whether this error is related to the model itself.

byshiue commented 5 months ago

The error happens during MPI initialization. Are you using the Docker image described in the documentation? If not, could you give it a try?
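
To tell an environment problem apart from a model problem, MPI initialization can be tested on its own, without loading any engine. The following is a minimal sketch, assuming mpi4py is installed in the same Python environment used for run.py. The names `PROBE` and `check_mpi_init` are hypothetical helpers for illustration, not part of TensorRT-LLM. With mpi4py's default settings, importing `mpi4py.MPI` initializes MPI, so a failure here would point at the MPI installation rather than at sqlcoder.

```python
import subprocess
import sys

# Importing mpi4py.MPI initializes MPI with mpi4py's default settings,
# so this one-liner exercises the same initialization step that fails
# in the report, with no model or engine involved.
PROBE = (
    "from mpi4py import MPI; "
    "print(MPI.Get_library_version().splitlines()[0])"
)


def check_mpi_init(python=sys.executable, timeout=60):
    """Return (ok, detail) for an MPI initialization probe.

    The probe runs in a child process so that an MPI abort
    cannot take down the calling interpreter.
    """
    try:
        proc = subprocess.run(
            [python, "-c", PROBE],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "probe timed out"
    ok = proc.returncode == 0
    detail = (proc.stdout if ok else proc.stderr).strip()
    return ok, detail


if __name__ == "__main__":
    ok, detail = check_mpi_init()
    print("MPI init OK:" if ok else "MPI init failed:", detail)
```

If the probe fails on the host but succeeds inside the documented Docker image, the host's MPI setup is the likely cause and the model itself is probably not involved.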