Open Tchaikovic opened 3 months ago
Yes, there is not enough VRAM. Generally, any error that mentions a "meta tensor" means the tensor has been offloaded to RAM and is therefore not in VRAM.
Try the --load_model_on_cpu flag.
@nv-guomingz Thank you! Can I run inference on gpu if I use that flag?
Yes. Please have a try to see if the issue still exists.
@nv-guomingz that issue is now resolved. I had to switch to TensorRT-LLM version 0.11.0.dev2024052800
and TensorRT-LLM commit f430a4b447ef4cba22698902d43eae0debf08594.
However, I now get another error at the second step, when I run:
trtllm-build \
--checkpoint_dir models/trt_${MODEL_NAME}/fp16/1-gpu \
--output_dir trt_engines/${MODEL_NAME}/int4_weightonly/1-gpu \
--gpt_attention_plugin float16 \
--gemm_plugin float16 \
--max_batch_size 1 \
--max_input_len 924 \
--max_output_len 100 \
--max_multimodal_len 576
what(): [TensorRT-LLM][ERROR] Assertion failed: Can't free tmp workspace for GEMM tactics profiling. (/home/jenkins/agent/workspace/LLM/main/L0_PostMerge/tensorrt_llm/cpp/tensorrt_llm/plugins/common/gemmPluginProfiler.cpp:190)
1 0x7f4273f4224f /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so(+0x7724f) [0x7f4273f4224f]
2 0x7f4273ff8cb6 tensorrt_llm::plugins::GemmPluginProfiler<tensorrt_llm::cutlass_extensions::CutlassGemmConfig, std::shared_ptr<tensorrt_llm::kernels::cutlass_kernels::CutlassFpAIntBGemmRunnerInterface>, tensorrt_llm::plugins::GemmIdCore, tensorrt_llm::plugins::GemmIdCoreHash>::freeTmpData() + 70
3 0x7f42740036d3 tensorrt_llm::plugins::GemmPluginProfiler<tensorrt_llm::cutlass_extensions::CutlassGemmConfig, std::shared_ptr<tensorrt_llm::kernels::cutlass_kernels::CutlassFpAIntBGemmRunnerInterface>, tensorrt_llm::plugins::GemmIdCore, tensorrt_llm::plugins::GemmIdCoreHash>::profileTactics(std::shared_ptr<tensorrt_llm::kernels::cutlass_kernels::CutlassFpAIntBGemmRunnerInterface> const&, nvinfer1::DataType const&, tensorrt_llm::plugins::GemmDims const&, tensorrt_llm::plugins::GemmIdCore const&) + 1363
4 0x7f4273fd8949 tensorrt_llm::plugins::WeightOnlyQuantMatmulPlugin::initialize() + 9
5 0x7f443f591a25 /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0x1065a25) [0x7f443f591a25]
6 0x7f443f51e0aa /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xff20aa) [0x7f443f51e0aa]
7 0x7f443f30afcf /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xddefcf) [0x7f443f30afcf]
8 0x7f443f30d07c /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xde107c) [0x7f443f30d07c]
9 0x7f443f30f071 /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xde3071) [0x7f443f30f071]
10 0x7f443ef5461c /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xa2861c) [0x7f443ef5461c]
11 0x7f443ef59837 /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xa2d837) [0x7f443ef59837]
12 0x7f443ef5a1af /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xa2e1af) [0x7f443ef5a1af]
13 0x7f43ea0a6478 /usr/local/lib/python3.10/dist-packages/tensorrt_bindings/tensorrt.so(+0xa6478) [0x7f43ea0a6478]
14 0x7f43ea0457a3 /usr/local/lib/python3.10/dist-packages/tensorrt_bindings/tensorrt.so(+0x457a3) [0x7f43ea0457a3]
15 0x55919ee8f10e /usr/bin/python3(+0x15a10e) [0x55919ee8f10e]
16 0x55919ee85a7b _PyObject_MakeTpCall + 603
17 0x55919ee9dacb /usr/bin/python3(+0x168acb) [0x55919ee9dacb]
18 0x55919ee7dcfa _PyEval_EvalFrameDefault + 24906
19 0x55919ee8f9fc _PyFunction_Vectorcall + 124
20 0x55919ee7a5d7 _PyEval_EvalFrameDefault + 10791
21 0x55919ee8f9fc _PyFunction_Vectorcall + 124
22 0x55919ee7845c _PyEval_EvalFrameDefault + 2220
23 0x55919ee8f9fc _PyFunction_Vectorcall + 124
24 0x55919ee7826d _PyEval_EvalFrameDefault + 1725
25 0x55919ee8f9fc _PyFunction_Vectorcall + 124
26 0x55919ee9e492 PyObject_Call + 290
27 0x55919ee7a5d7 _PyEval_EvalFrameDefault + 10791
28 0x55919ee8f9fc _PyFunction_Vectorcall + 124
29 0x55919ee9e492 PyObject_Call + 290
30 0x55919ee7a5d7 _PyEval_EvalFrameDefault + 10791
31 0x55919ee8f9fc _PyFunction_Vectorcall + 124
32 0x55919ee9e492 PyObject_Call + 290
33 0x55919ee7a5d7 _PyEval_EvalFrameDefault + 10791
34 0x55919ee8f9fc _PyFunction_Vectorcall + 124
35 0x55919ee7826d _PyEval_EvalFrameDefault + 1725
36 0x55919ee749c6 /usr/bin/python3(+0x13f9c6) [0x55919ee749c6]
37 0x55919ef6a256 PyEval_EvalCode + 134
38 0x55919ef95108 /usr/bin/python3(+0x260108) [0x55919ef95108]
39 0x55919ef8e9cb /usr/bin/python3(+0x2599cb) [0x55919ef8e9cb]
40 0x55919ef94e55 /usr/bin/python3(+0x25fe55) [0x55919ef94e55]
41 0x55919ef94338 _PyRun_SimpleFileObject + 424
42 0x55919ef93f83 _PyRun_AnyFileObject + 67
43 0x55919ef86a5e Py_RunMain + 702
44 0x55919ef5d02d Py_BytesMain + 45
45 0x7f4467e70d90 /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f4467e70d90]
46 0x7f4467e70e40 __libc_start_main + 128
47 0x55919ef5cf25 _start + 37
[178bb4bafa40:01791] *** Process received signal ***
[178bb4bafa40:01791] Signal: Aborted (6)
[178bb4bafa40:01791] Signal code: (-6)
[178bb4bafa40:01791] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f4467e89520]
[178bb4bafa40:01791] [ 1] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f4467edd9fc]
[178bb4bafa40:01791] [ 2] /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f4467e89476]
[178bb4bafa40:01791] [ 3] /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f4467e6f7f3]
[178bb4bafa40:01791] [ 4] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa2b9e)[0x7f4450876b9e]
[178bb4bafa40:01791] [ 5] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae20c)[0x7f445088220c]
[178bb4bafa40:01791] [ 6] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xad1e9)[0x7f44508811e9]
[178bb4bafa40:01791] [ 7] /lib/x86_64-linux-gnu/libstdc++.so.6(__gxx_personality_v0+0x99)[0x7f4450881959]
[178bb4bafa40:01791] [ 8] /lib/x86_64-linux-gnu/libgcc_s.so.1(+0x16884)[0x7f4467b79884]
[178bb4bafa40:01791] [ 9] /lib/x86_64-linux-gnu/libgcc_s.so.1(_Unwind_Resume+0x12d)[0x7f4467b7a2dd]
[178bb4bafa40:01791] [10] /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so(_ZN12tensorrt_llm7plugins18GemmPluginProfilerINS_18cutlass_extensions17CutlassGemmConfigESt10shared_ptrINS_7kernels15cutlass_kernels33CutlassFpAIntBGemmRunnerInterfaceEENS0_10GemmIdCoreENS0_14GemmIdCoreHashEE14profileTacticsERKS8_RKN8nvinfer18DataTypeERKNS0_8GemmDimsERKS9_+0xd04)[0x7f4274003e84]
[178bb4bafa40:01791] [11] /usr/local/lib/python3.10/dist-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so(_ZN12tensorrt_llm7plugins27WeightOnlyQuantMatmulPlugin10initializeEv+0x9)[0x7f4273fd8949]
[178bb4bafa40:01791] [12] /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0x1065a25)[0x7f443f591a25]
[178bb4bafa40:01791] [13] /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xff20aa)[0x7f443f51e0aa]
[178bb4bafa40:01791] [14] /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xddefcf)[0x7f443f30afcf]
[178bb4bafa40:01791] [15] /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xde107c)[0x7f443f30d07c]
[178bb4bafa40:01791] [16] /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xde3071)[0x7f443f30f071]
[178bb4bafa40:01791] [17] /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xa2861c)[0x7f443ef5461c]
[178bb4bafa40:01791] [18] /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xa2d837)[0x7f443ef59837]
[178bb4bafa40:01791] [19] /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so.10(+0xa2e1af)[0x7f443ef5a1af]
[178bb4bafa40:01791] [20] /usr/local/lib/python3.10/dist-packages/tensorrt_bindings/tensorrt.so(+0xa6478)[0x7f43ea0a6478]
[178bb4bafa40:01791] [21] /usr/local/lib/python3.10/dist-packages/tensorrt_bindings/tensorrt.so(+0x457a3)[0x7f43ea0457a3]
[178bb4bafa40:01791] [22] /usr/bin/python3(+0x15a10e)[0x55919ee8f10e]
[178bb4bafa40:01791] [23] /usr/bin/python3(_PyObject_MakeTpCall+0x25b)[0x55919ee85a7b]
[178bb4bafa40:01791] [24] /usr/bin/python3(+0x168acb)[0x55919ee9dacb]
[178bb4bafa40:01791] [25] /usr/bin/python3(_PyEval_EvalFrameDefault+0x614a)[0x55919ee7dcfa]
[178bb4bafa40:01791] [26] /usr/bin/python3(_PyFunction_Vectorcall+0x7c)[0x55919ee8f9fc]
[178bb4bafa40:01791] [27] /usr/bin/python3(_PyEval_EvalFrameDefault+0x2a27)[0x55919ee7a5d7]
[178bb4bafa40:01791] [28] /usr/bin/python3(_PyFunction_Vectorcall+0x7c)[0x55919ee8f9fc]
[178bb4bafa40:01791] [29] /usr/bin/python3(_PyEval_EvalFrameDefault+0x8ac)[0x55919ee7845c]
[178bb4bafa40:01791] *** End of error message ***
Aborted (core dumped)
Did you rebuild TensorRT-LLM with the latest code base?
No, I pip installed it.
OK, let me try to reproduce it on my side first.
Confirmed: T4 doesn't support weight-only mode at the moment. Please try a Volta+ architecture.
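Since the trtllm-build failure above turned out to depend on the GPU architecture rather than on available memory, it can help to check each GPU's compute capability before building. The following is a minimal sketch, assuming a driver recent enough that nvidia-smi supports the compute_cap query field. The cc_at_least helper and the MIN_CC threshold are hypothetical names introduced for illustration and are not part of TensorRT-LLM; the default of 8.0 is only a placeholder, so set MIN_CC from the support matrix of the TensorRT-LLM version you use.

```shell
#!/bin/sh
# Sketch: report each GPU's compute capability before running trtllm-build.
# MIN_CC is an assumed placeholder, NOT a documented TensorRT-LLM requirement.
MIN_CC="${MIN_CC:-8.0}"

# Succeeds when compute capability $1 (e.g. "7.5") is >= $2 (e.g. "8.0").
# The body runs in a subshell, so its variables do not leak to the caller.
cc_at_least() (
  have_major="${1%%.*}"; have_minor="${1#*.}"
  want_major="${2%%.*}"; want_minor="${2#*.}"
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
  else
    [ "$have_minor" -ge "$want_minor" ]
  fi
)

# Query the driver only when nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
  while IFS=, read -r name cc; do
    cc=$(printf '%s' "$cc" | tr -d ' ')   # strip the space after the comma
    if cc_at_least "$cc" "$MIN_CC"; then
      echo "$name (compute capability $cc): meets MIN_CC=$MIN_CC"
    else
      echo "$name (compute capability $cc): below MIN_CC=$MIN_CC"
    fi
  done
fi
```

A T4 reports compute capability 7.5, so it would be flagged as below the placeholder threshold. The check only reads what the driver reports; whether a given plugin or quantization mode works on that architecture still has to be confirmed against the TensorRT-LLM documentation.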
These are the commands I run:
I get this error
Is this because there is not enough GPU memory? Any workaround for this?