sherlcok314159 opened 1 week ago
I am getting the same error trying to build Mistral for ChatRTX on Linux using:

```shell
python build.py --model_dir './model/mistral/mistral7b_hf' \
    --quant_ckpt_path './model/mistral/mistral7b_int4_quant_weights/mistral_tp1_rank0.npz' \
    --dtype float16 \
    --remove_input_padding \
    --use_gpt_attention_plugin float16 \
    --enable_context_fmha \
    --use_gemm_plugin float16 \
    --use_weight_only \
    --weight_only_precision int4_awq \
    --per_group \
    --output_dir './model/mistral/mistral7b_int4_engine' \
    --world_size 1 \
    --tp_size 1 \
    --parallel_build \
    --max_input_len 7168 \
    --max_batch_size 1 \
    --max_output_len 1024
```
According to this.
I cannot reproduce this issue locally. Could you try the latest main branch, and follow the install doc to make sure TensorRT-LLM is installed correctly?
Did you use a local PC or a remote server without a screen? Is there any command to check whether TensorRT-LLM is correctly installed?
> Did you use the local PC or the remote server without screen? Is there any command to check whether the TRT-LLM is correctly installed.
Remote server.
To check the installation:

```shell
python3 -c "import tensorrt_llm"
```
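Beyond the bare import, a slightly more defensive check can use `importlib` so a missing package is reported cleanly instead of raising a traceback. This is a generic sketch of my own, not a TensorRT-LLM utility:

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True when the module can be located in this environment."""
    return importlib.util.find_spec(module_name) is not None

# Example: verify tensorrt_llm is importable before running build.py.
if is_installed("tensorrt_llm"):
    print("tensorrt_llm found")
else:
    print("tensorrt_llm missing -- follow the install doc first")
```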
System Info
Who can help?
@byshiue @ncomly-nvidia
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
I run the following build script in a terminal on Ubuntu 20.04 (connected via SSH; the machine has a virtual screen provided by Xorg).
And the log:
I searched the web extensively; the problem appears to be caused by MPI. But why would converting a checkpoint require a screen?
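Checkpoint conversion itself should not need an X display, so the screen-related failure is more likely an environment issue. As a quick sanity check (my own generic sketch, not part of TensorRT-LLM), you can confirm whether the SSH session is actually headless, i.e. whether the Xorg virtual screen is exported to the session:

```python
import os

def is_headless() -> bool:
    """A session is considered headless when no X11 DISPLAY is exported."""
    return not os.environ.get("DISPLAY")

if is_headless():
    print("headless session: no DISPLAY variable set")
else:
    print("DISPLAY =", os.environ["DISPLAY"])
```

If `DISPLAY` is unset even though Xorg is running, the virtual screen is not visible to the SSH session, which can explain discrepancies between local and remote runs.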
Expected behavior
The build completes successfully.
Actual behavior
See the log above.
Additional notes
None