Closed: chrjxj closed this issue 3 months ago.
Thanks for reporting the issue.
If not using the default docker, please check that groupNormPlugin.so is built correctly and all dependent libraries are on the correct path, e.g.:
- Follow the readme: before building, update the paths of "TRT" and "CUDA" in plugins/Makefile.config according to your environment.
- Check TRT_LIBPATH, e.g., in our dockerfile, TRT_LIBPATH=/usr/local/lib/python3.10/dist-packages/tensorrt_libs (see the sketch after this list).
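A minimal sketch of both checks, assuming the default docker layout (the keys shown for plugins/Makefile.config are paraphrased; check your copy for the exact names):

```sh
# Per the readme, plugins/Makefile.config must point at your installs, e.g.:
#   TRT  = /usr/local/lib/python3.10/dist-packages/tensorrt_libs
#   CUDA = /usr/local/cuda

# Make the TensorRT libraries visible at build time and at load time
export TRT_LIBPATH=/usr/local/lib/python3.10/dist-packages/tensorrt_libs
export LD_LIBRARY_PATH=$TRT_LIBPATH:$LD_LIBRARY_PATH
```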
Thanks for the note; it's useful.
The path /usr/local/lib/python3.10/dist-packages/tensorrt_libs and its libs came from the installation of TensorRT-LLM (pip install tensorrt-llm~=0.10 -U), though the SD workflow doesn't depend on tensorrt-llm.
...
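As an aside, one way to confirm where pip placed those libraries (a sketch, assuming the tensorrt_libs package is importable, which the __init__.py in the directory listing further down suggests):

```sh
# Print the directory of the installed tensorrt_libs package
python3 -c "import tensorrt_libs, os; print(os.path.dirname(tensorrt_libs.__file__))"
```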
Used https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/docker/build.sh to build the docker image, and inside that container (docker.io/library/modelopt_examples:latest) the error is still the same:
[07/04/2024-01:11:32] [I] TensorRT version: 10.0.1
[07/04/2024-01:11:32] [I] Loading standard plugins
[07/04/2024-01:11:32] [I] Loading supplied plugin library: /workspace/examples/plugins/bin/groupNormPlugin.so
trtexec: symbol lookup error: /workspace/examples/plugins/bin/groupNormPlugin.so: undefined symbol: getPluginRegistry
Some environment details:
xxx:/local/TensorRT-Model-Optimizer/diffusers/quantization# echo $TRT_LIBPATH
/usr/local/lib/python3.10/dist-packages/tensorrt_libs
xxx:/local/TensorRT-Model-Optimizer/diffusers/quantization# echo $LD_LIBRARY_PATH
/usr/local/lib/python3.10/dist-packages/tensorrt_libs:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
xxx:/local/TensorRT-Model-Optimizer/diffusers/quantization# ls /usr/local/lib/python3.10/dist-packages/tensorrt_libs
__init__.py __pycache__ libfp8convkernel.so libnvinfer.so libnvinfer.so.10 libnvinfer_builder_resource.so.10.1.0 libnvinfer_plugin.so.10 libnvonnxparser.so.10
xxx:/local/TensorRT-Model-Optimizer/diffusers/quantization# ls /workspace/examples/plugins/bin/groupNormPlugin.so
/workspace/examples/plugins/bin/groupNormPlugin.so
xxx:/local/TensorRT-Model-Optimizer/diffusers/quantization# grep -n -R --include="*.so" "getPluginRegistry" /workspace/examples/plugins/bin/
grep: /workspace/examples/plugins/bin/groupNormPlugin.so: binary file matches
grep: /workspace/examples/plugins/bin/FP8Conv2DPlugin.so: binary file matches
xxx:/local/TensorRT-Model-Optimizer/diffusers/quantization# grep -n -R --include="*.so" "getPluginRegistry" /usr/local/lib/python3.10/dist-packages/tensorrt_libs
grep: /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so: binary file matches
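The grep above only shows that the symbol string occurs in both binaries; nm and ldd make the relationship explicit (a diagnostic sketch using the paths above):

```sh
# The plugin should list getPluginRegistry as undefined (U) ...
nm -D /workspace/examples/plugins/bin/groupNormPlugin.so | grep getPluginRegistry
# ... and libnvinfer.so should define it (T)
nm -D /usr/local/lib/python3.10/dist-packages/tensorrt_libs/libnvinfer.so | grep getPluginRegistry

# If this prints nothing, the plugin was built without a recorded dependency
# on libnvinfer, so the loader cannot resolve the symbol at dlopen time,
# which matches the trtexec error above.
ldd /workspace/examples/plugins/bin/groupNormPlugin.so | grep nvinfer
```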
Unfortunately we cannot reproduce this error. We will have a new release next Wednesday. Please stay tuned for the new version.
> If not using the default docker, please check that groupNormPlugin.so is built correctly and all dependent libraries are on the correct path ... Check TRT_LIBPATH, e.g., in our dockerfile, TRT_LIBPATH=/usr/local/lib/python3.10/dist-packages/tensorrt_libs
I had the same problem, and setting TRT_LIBPATH solved it. Please close.
My guess at the root cause: in the prebuilt docker image, FP8Conv2DPlugin.so and groupNormPlugin.so in the /workspace/examples/plugins/bin/ folder were missing the symbol (they weren't linked against the library) at build time. After I set TRT_LIBPATH=/usr/local/lib/python3.10/dist-packages/tensorrt_libs and rebuilt those plugins, it works fine.
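In shell form, the fix amounts to roughly this (a sketch; the "make clean" target and the build directory are assumptions based on the container layout shown above):

```sh
# Point the build at the TensorRT libs installed under dist-packages
export TRT_LIBPATH=/usr/local/lib/python3.10/dist-packages/tensorrt_libs
export LD_LIBRARY_PATH=$TRT_LIBPATH:$LD_LIBRARY_PATH

# Rebuild the plugins so they link against libnvinfer this time
cd /workspace/examples/plugins
make clean && make

# Verify: this should now print a libnvinfer dependency
ldd bin/groupNormPlugin.so | grep nvinfer
```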
Env

nvcr.io/nvidia/pytorch:24.06-py3

Steps

1. make plugins and copy the plugins folder to /workspace/examples/plugins
2. adjust the script
3. run the FP8 workflow: ./build_sdxl_8bit_engine.sh --format fp8 (see the command sketch below)

Issue - undefined symbol: getPluginRegistry
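As commands, the reproduction roughly looks like this (a sketch; everything except build_sdxl_8bit_engine.sh and the image name is an assumption inferred from the steps above):

```sh
# Inside the nvcr.io/nvidia/pytorch:24.06-py3 container
cd plugins && make && cd ..                 # step 1: build the plugins
cp -r plugins /workspace/examples/plugins   # step 1: copy them into place
# step 2: adjust the script (details elided in the original report)
./build_sdxl_8bit_engine.sh --format fp8    # step 3: run the FP8 workflow
# fails with: undefined symbol: getPluginRegistry
```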