wohushihaoren opened 7 months ago
I was following the script to convert Gemma and see the same error with TensorRT-LLM version 0.9.0.dev2024022000.
I built chatglm-6b and see the same error with TensorRT-LLM version: 0.9.0.dev2024022700
@syuoni Could you please take a look? Thanks
Hi @wohushihaoren, I tried to reproduce the issue with your commands on the current TensorRT-LLM main, but everything went well on my side. Could you please try with the current main branch?
According to the error information, the gemm_plug is None (see https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/layers/linear.py#L63). It seems that something broke in the package build process. Did you install TensorRT-LLM with pip install, or build it from source (https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/build_from_source.md)? If the latter, please try re-building first.
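For illustration, here is a minimal sketch of the failure mode (a hypothetical helper, not TensorRT-LLM's actual code): add_plugin_v2 rejects a None plugin object with the cryptic binding TypeError reported in this issue, so guarding the call makes the root cause explicit.

```python
# Hypothetical sketch of the failure mode, not TensorRT-LLM's actual code.
# add_plugin_v2 raises a TypeError when handed None instead of a plugin;
# checking first yields a clearer, actionable error message.
def add_gemm_plugin_layer(network, plug_inputs, gemm_plug):
    if gemm_plug is None:
        # A None plugin usually means the TensorRT-LLM plugin library
        # failed to load -- often a broken install or a version mismatch.
        raise RuntimeError(
            "GEMM plugin is None: the TensorRT-LLM plugin library is "
            "likely missing or mismatched; try re-installing or re-building."
        )
    return network.add_plugin_v2(plug_inputs, gemm_plug)
```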
Also, since the error is raised at gemm_plugin creation, could you try building the engine without the gemm_plugin? i.e.,
trtllm-build --checkpoint_dir trt_ckpt/chatglm3_6b/fp16/1-gpu \
--output_dir trt_engines/chatglm3_6b/fp16/1-gpu
Please try to install the requirements.txt in the chatglm example first:
pip install -r examples/chatglm/requirements.txt
I had the same problem when I used trtllm-build for Baichuan-13B-Chat.
System Info
Traceback (most recent call last):
File "/home/powerop/.conda/envs/bamboo/bin/trtllm-build", line 8, in <module>
sys.exit(main())
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/commands/build.py", line 489, in main
parallel_build(source, build_config, args.output_dir, workers,
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/commands/build.py", line 413, in parallel_build
passed = build_and_save(rank, rank % workers, ckpt_dir,
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/commands/build.py", line 385, in build_and_save
engine = build(build_config,
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/commands/build.py", line 276, in build
return build_model(model, build_config)
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/commands/build.py", line 193, in build_model
model(**inputs)
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/module.py", line 40, in __call__
output = self.forward(*args, **kwargs)
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/models/modeling_utils.py", line 498, in forward
hidden_states = self.transformer.forward(**kwargs)
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/models/chatglm/model.py", line 253, in forward
layer_output = layer(
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/module.py", line 40, in __call__
output = self.forward(*args, **kwargs)
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/models/chatglm/model.py", line 117, in forward
attention_output = self.attention(
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/module.py", line 40, in __call__
output = self.forward(*args, **kwargs)
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/layers/attention.py", line 648, in forward
qkv = self.qkv(hidden_states, qkv_lora_params)
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/module.py", line 40, in __call__
output = self.forward(*args, **kwargs)
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/layers/linear.py", line 139, in forward
return self.multiply_gather(x,
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/layers/linear.py", line 115, in multiply_gather
x = _gemm_plugin(x,
File "/home/powerop/.conda/envs/bamboo/lib/python3.10/site-packages/tensorrt_llm/layers/linear.py", line 59, in _gemm_plugin
layer = default_trtnet().add_plugin_v2(plug_inputs, gemm_plug)
TypeError: add_plugin_v2(): incompatible function arguments. The following argument types are supported:
Invoked with: <tensorrt_bindings.tensorrt.INetworkDefinition object at 0x7f99758bf530>, [<tensorrt_bindings.tensorrt.ITensor object at 0x7f9975a77db0>, <tensorrt_bindings.tensorrt.ITensor object at 0x7f99758bf870>], None
Who can help?
No response
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)

Reproduction
Follow the chatglm example instructions to build and then run.
Expected behavior
The chatglm3-6b engine builds successfully.
Actual behavior
Engine build failed with an unknown error.
Additional notes
I have run
python3 convert_checkpoint.py --model_dir chatglm3_6b --output_dir trt_ckpt/chatglm3_6b/fp16/1-gpu
successfully. However, when running trtllm-build, an error occurred like the traceback above.
Please help me, thanks a lot.
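As a quick diagnostic for errors like the one above (where add_plugin_v2 receives None for the plugin argument), one can probe whether a GEMM plugin creator is visible in TensorRT's plugin registry at all. The plugin name, version, and namespace below are assumptions based on TensorRT-LLM's usual plugin registration, not confirmed values from this thread, so adjust as needed:

```python
# Hedged diagnostic sketch: probe TensorRT's plugin registry for a GEMM
# plugin creator. The name "Gemm", version "1", and the "tensorrt_llm"
# namespace are assumptions, not values confirmed in this issue.
def find_gemm_creator():
    try:
        import tensorrt as trt
    except ImportError:
        return None  # TensorRT is not installed in this environment
    registry = trt.get_plugin_registry()
    return registry.get_plugin_creator("Gemm", "1", "tensorrt_llm")

if __name__ == "__main__":
    # A None result here matches the failure seen in the traceback above.
    print("Gemm plugin creator:", find_gemm_creator())
```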