Open airalcorn2 opened 10 months ago
Re: this error:
/usr/bin/ld: cannot find -ltorchtrt
The failing command was:
aarch64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 /TensorRT/build/temp.linux-aarch64-cpython-38/py/torch_tensorrt/csrc/register_tensorrt_classes.o /TensorRT/build/temp.linux-aarch64-cpython-38/py/torch_tensorrt/csrc/tensorrt_backend.o /TensorRT/build/temp.linux-aarch64-cpython-38/py/torch_tensorrt/csrc/tensorrt_classes.o /TensorRT/build/temp.linux-aarch64-cpython-38/py/torch_tensorrt/csrc/torch_tensorrt_py.o -L/TensorRT/py/torch_tensorrt/lib/ -L/opt/conda/lib/python3.6/config-3.6m-x86_64-linux-gnu -L/usr/local/lib/python3.8/dist-packages/torch/lib -L/usr/local/cuda/lib64 -L/usr/lib -ltorchtrt -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-aarch64-cpython-38/torch_tensorrt/_C.cpython-38-aarch64-linux-gnu.so -Wno-deprecated -Wno-deprecated-declarations -Wl,--no-as-needed -ltorchtrt -Wl,-rpath,$ORIGIN/lib -lpthread -ldl -lutil -lrt -lm -Xlinker -export-dynamic -D_GLIBCXX_USE_CXX11_ABI=1
I did:
cp bazel-bin/libtorchtrt.tar.gz .
tar -xzvf libtorchtrt.tar.gz
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/TensorRT/torch_tensorrt/lib
and then ran:
aarch64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 /TensorRT/build/temp.linux-aarch64-cpython-38/py/torch_tensorrt/csrc/register_tensorrt_classes.o /TensorRT/build/temp.linux-aarch64-cpython-38/py/torch_tensorrt/csrc/tensorrt_backend.o /TensorRT/build/temp.linux-aarch64-cpython-38/py/torch_tensorrt/csrc/tensorrt_classes.o /TensorRT/build/temp.linux-aarch64-cpython-38/py/torch_tensorrt/csrc/torch_tensorrt_py.o -L/TensorRT/py/torch_tensorrt/lib/ -L/opt/conda/lib/python3.6/config-3.6m-x86_64-linux-gnu -L/usr/local/lib/python3.8/dist-packages/torch/lib -L/usr/local/cuda/lib64 -L/usr/lib -L/TensorRT/torch_tensorrt/lib -ltorchtrt -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-aarch64-cpython-38/torch_tensorrt/_C.cpython-38-aarch64-linux-gnu.so -Wno-deprecated -Wno-deprecated-declarations -Wl,--no-as-needed -ltorchtrt -Wl,-rpath,$ORIGIN/lib -lpthread -ldl -lutil -lrt -lm -Xlinker -export-dynamic -D_GLIBCXX_USE_CXX11_ABI=1
and that command ran successfully (note the addition of -L/TensorRT/torch_tensorrt/lib), but I don't know where to go next.
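For anyone hitting the same linker error: as far as I understand, the link step resolves -ltorchtrt by searching the -L directories (plus the toolchain defaults), while LD_LIBRARY_PATH is consulted by the dynamic loader at run time. That would explain why exporting it did not help but adding -L did. Below is a minimal sketch for checking which directories actually contain the library. find_lib_dirs is a hypothetical helper name, and it assumes the file is named libtorchtrt.so or libtorchtrt.a.

```shell
# Hypothetical helper (not part of Torch-TensorRT): print each directory
# that contains lib<name>.so or lib<name>.a.
find_lib_dirs() {
    name="$1"
    shift
    for dir in "$@"; do
        if [ -e "$dir/lib$name.so" ] || [ -e "$dir/lib$name.a" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Check the two torch_tensorrt lib directories from the commands above.
find_lib_dirs torchtrt /TensorRT/py/torch_tensorrt/lib /TensorRT/torch_tensorrt/lib
```

Any directory printed here is one that should appear as a -L flag in the link command.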
I went back to using the current version of WORKSPACE.jp50 and made a couple of changes, including commenting out the pip_install as suggested in the comment above that step. Using the attached Dockerfile and WORKSPACE.jp50 files, I was able to build torch_tensorrt successfully. However, when trying to import torch_tensorrt, I received a new error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/site-packages/torch_tensorrt/__init__.py", line 84, in <module>
    from torch_tensorrt._compile import *  # noqa: F403
  File "/usr/lib/python3.8/site-packages/torch_tensorrt/_compile.py", line 24, in <module>
    from torch._export import ExportedProgram
ImportError: cannot import name 'ExportedProgram' from 'torch._export' (/usr/local/lib/python3.8/dist-packages/torch/_export/__init__.py)
so it appears my hack in the Dockerfile of:
RUN sed -i 's/2.1.dev/2.1/g' py/torch_tensorrt/__init__.py
was ill-advised. Is there a way to make Torch-TensorRT work with PyTorch 2.1.0? PyTorch 2.2 is currently only available for JetPack 6.0.
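To illustrate why the hack could not help: the sed substitution only rewrites text in the file, so it cannot make torch._export provide ExportedProgram, which the traceback shows is missing from the installed PyTorch. A small sketch of what the substitution does to a line follows. The sample line is hypothetical, since the real contents of py/torch_tensorrt/__init__.py are not shown in this thread, and relax_version is a name made up for this example.

```shell
# Hypothetical sample line standing in for the real file contents.
sample='required_torch = "2.1.dev"'

# Original pattern: each unescaped "." matches any single character.
printf '%s\n' "$sample" | sed 's/2.1.dev/2.1/g'

# Stricter variant with the dots escaped so they match only literal dots.
relax_version() {
    sed 's/2\.1\.dev/2.1/g'
}
printf '%s\n' "$sample" | relax_version
```

Both commands produce the same edited line for this sample. The escaped form just avoids accidental matches elsewhere in the file.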
I followed the instructions under "Build from Source" here and that seemed to work. I used the attached Dockerfile and WORKSPACE. Once torch_tensorrt is installed, you have to use:
export PYTHONPATH=${PYTHONPATH}:/usr/lib/python3.8/site-packages
to be able to import it. I'm leaving this issue open because of the WORKSPACE.jp50 bug.
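A slightly more defensive way to set the variable, as a sketch: it avoids a leading colon when PYTHONPATH is empty and skips the append if the directory is already listed. pythonpath_with is a hypothetical helper name. The site-packages path is the one from the export above.

```shell
# Hypothetical helper: print a PYTHONPATH value with directory $1 appended
# to the current value $2 (which may be empty), unless $1 is already listed.
pythonpath_with() {
    dir="$1"
    current="$2"
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;
        *) printf '%s\n' "${current:+$current:}$dir" ;;
    esac
}

export PYTHONPATH="$(pythonpath_with /usr/lib/python3.8/site-packages "${PYTHONPATH:-}")"
```

Running the export twice leaves PYTHONPATH unchanged the second time.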
Bug Description
When trying to build Torch-TensorRT on a Jetson following the instructions here, I get errors that seem to be related to changes made to WORKSPACE.jp50 in this commit. The first error I get is caused by this line. When I remove that line, I get a second error. You can see in the commit that some code was removed, but pip_install is still used in the current WORKSPACE.jp50 (here). In contrast, WORKSPACE uses pip_parse (here). At this point, I just replaced WORKSPACE.jp50 with the older version, which got me much further, but I eventually hit a different error.
To Reproduce
Follow the Torch-TensorRT building instructions under "Building Natively on aarch64 (Jetson)" here.
Expected behavior
No errors related to the WORKSPACE.jp50 file.
Environment
A Jetson Xavier AGX and the dustynv/ros:iron-pytorch-l4t-r35.3.1 container base image.
Additional context