pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

"libtorch.so: error adding symbols: File in wrong format" when trying to compile torchvision on Linux on IBM z machine #4476

Open seafolk-cn opened 2 years ago

seafolk-cn commented 2 years ago

Hello Guys,

I'm getting the error below while trying to compile torchvision on Linux on an IBM z14 machine. Could anyone take a look at this?

My env:

Red Hat 8.4 on IBM LinuxONE
gcc (GCC) 8.4.1 20200928 (Red Hat 8.4.1-1)

command sequence to compile:

wget https://download.pytorch.org/libtorch/nightly/cpu/libtorch-shared-with-deps-latest.zip
unzip libtorch-shared-with-deps-latest.zip

git clone git://github.com/pytorch/vision.git
cd vision
mkdir build
cd build

# Add -DWITH_CUDA=on support for the CUDA if needed
cmake -DCMAKE_PREFIX_PATH=$PWD/../../libtorch ..
make
make install

cmake and make logs:



(base) [root@qiskit build]# cmake -DCMAKE_PREFIX_PATH=$PWD/../../libtorch .. 
-- Found PNG: /usr/lib64/libpng.so (found version "1.6.34") 
-- Found JPEG: /usr/lib64/libjpeg.so (found version "62") 
-- Configuring done
CMake Warning at CMakeLists.txt:73 (add_library):
  Cannot generate a safe runtime search path for target torchvision because
  files in some directories may conflict with libraries in implicit
  directories:

    runtime library [libpng16.so.16] in /usr/lib64 may be hidden by files in:
      /opt/anaconda/lib

  Some of these libraries may not be found correctly.

-- Generating done
-- Build files have been written to: /root/vision/build
(base) [root@qiskit build]# 
(base) [root@qiskit build]# make 
Scanning dependencies of target torchvision
[  2%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/common_jpeg.cpp.o
[  5%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/decode_image.cpp.o
[  7%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/decode_jpeg.cpp.o
[ 10%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/decode_png.cpp.o
[ 12%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/encode_jpeg.cpp.o
[ 15%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/encode_png.cpp.o
[ 17%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cpu/read_write_file.cpp.o
[ 20%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/cuda/decode_jpeg_cuda.cpp.o
[ 22%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/io/image/image.cpp.o
[ 25%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/models/alexnet.cpp.o
[ 27%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/models/densenet.cpp.o
[ 30%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/models/googlenet.cpp.o
[ 32%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/models/inception.cpp.o
[ 35%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/models/mnasnet.cpp.o
[ 37%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/models/mobilenet.cpp.o
[ 40%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/models/resnet.cpp.o
[ 42%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/models/shufflenetv2.cpp.o
[ 45%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/models/squeezenet.cpp.o
[ 47%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/models/vgg.cpp.o
[ 50%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autograd/deform_conv2d_kernel.cpp.o
[ 52%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autograd/ps_roi_align_kernel.cpp.o
[ 55%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autograd/ps_roi_pool_kernel.cpp.o
[ 57%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autograd/roi_align_kernel.cpp.o
[ 60%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/autograd/roi_pool_kernel.cpp.o
[ 62%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/cpu/deform_conv2d_kernel.cpp.o
[ 65%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/cpu/interpolate_aa_kernels.cpp.o
[ 67%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/cpu/nms_kernel.cpp.o
[ 70%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/cpu/ps_roi_align_kernel.cpp.o
[ 72%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/cpu/ps_roi_pool_kernel.cpp.o
[ 75%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/cpu/roi_align_kernel.cpp.o
[ 77%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/cpu/roi_pool_kernel.cpp.o
[ 80%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/deform_conv2d.cpp.o
[ 82%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/interpolate_aa.cpp.o
[ 85%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/nms.cpp.o
[ 87%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/ps_roi_align.cpp.o
[ 90%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/ps_roi_pool.cpp.o
[ 92%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/roi_align.cpp.o
[ 95%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/ops/roi_pool.cpp.o
[ 97%] Building CXX object CMakeFiles/torchvision.dir/torchvision/csrc/vision.cpp.o
In file included from /root/vision/torchvision/csrc/vision.cpp:1:
/root/vision/torchvision/csrc/vision.h:10:40: warning: ‘_register_ops’ initialized and declared ‘extern’
 extern "C" VISION_INLINE_VARIABLE auto _register_ops = &cuda_version;
                                        ^~~~~~~~~~~~~
[100%] Linking CXX shared library libtorchvision.so
/root/libtorch/lib/libtorch.so: error adding symbols: File in wrong format
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/torchvision.dir/build.make:680: libtorchvision.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:95: CMakeFiles/torchvision.dir/all] Error 2
make: *** [Makefile:149: all] Error 2
(base) [root@qiskit build]# 

Thanks
Regards

vfdev-5 commented 2 years ago

@seafolk-cn most probably the prebuilt libtorch binaries are not suitable for your architecture (the prebuilt Linux downloads target x86_64, whereas IBM Z is s390x). As a solution, you can try to build pytorch from source, which also builds the libtorch binaries, and then build the vision binaries against them. Here is a pointer on how to build pytorch from source:

HTH
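
One way to confirm an architecture mismatch is to read the machine type recorded in the ELF header of libtorch.so and compare it with what `uname -m` reports for the host. The sketch below does this with POSIX `od`. The helper name `elf_machine` and the `LIBTORCH_SO` variable are hypothetical, the default path is the one from the linker error above, and the table of machine codes lists only a few common values.

```shell
#!/bin/sh
# elf_machine is a hypothetical helper: it prints a name for the e_machine
# field of an ELF file (header bytes 18-19; byte order is given by header byte 5).
# Assumption: the argument is an ELF file; the magic number is not verified.
elf_machine() {
    elf_path=$1
    # EI_DATA byte: 1 = little-endian, 2 = big-endian
    endian=$(od -An -tu1 -j5 -N1 "$elf_path" | tr -d ' \n')
    # Split the two e_machine bytes into $1 and $2
    set -- $(od -An -tu1 -j18 -N2 "$elf_path")
    # Fewer than two bytes means the file is too short to hold an ELF header
    [ "$#" -eq 2 ] || { echo "too-short"; return 1; }
    if [ "$endian" = "2" ]; then
        code=$(( $1 * 256 + $2 ))
    else
        code=$(( $2 * 256 + $1 ))
    fi
    case "$code" in
        62)  echo "x86_64" ;;
        183) echo "aarch64" ;;
        22)  echo "s390" ;;
        *)   echo "unknown($code)" ;;
    esac
}

# LIBTORCH_SO defaults to the path shown in the linker error (adjust as needed).
LIBTORCH_SO=${LIBTORCH_SO:-/root/libtorch/lib/libtorch.so}
if [ -f "$LIBTORCH_SO" ]; then
    echo "host:     $(uname -m)"
    echo "libtorch: $(elf_machine "$LIBTORCH_SO")"
fi
```

If the two lines disagree (for example `s390x` for the host and `x86_64` for the library), the linker cannot use that libtorch.so, which is consistent with the "File in wrong format" message.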

anand97 commented 2 years ago

Hello! I am facing the same error on my Jetson Nano. I already have precompiled Libtorch binaries, which I've tested and found to work with a sample program I wrote to run Yolov5 on my Nano. Unfortunately, while trying to build the torchvision library, I get a linking error that says:

/home/user/pytorch/torch/lib/libtorch.so: error adding symbols: File in wrong format

My understanding is that if the Libtorch binaries are able to run on the device, then the architecture must be correct. Is that right? Please let me know if you'd like any additional information/logs.

I also tried compiling with -DWITH_CUDA=on.
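
If the binaries that run on the device are the same ones the linker sees, an architecture mismatch would be surprising, so it may be worth checking which libtorch the torchvision build actually resolved. A minimal sketch, assuming it is run from the torchvision build directory after `cmake` has finished: `find_package(Torch)` records the package location as `Torch_DIR` in `CMakeCache.txt`. The helper name `torch_dir_from_cache` and the `CACHE_FILE` variable are hypothetical.

```shell
#!/bin/sh
# torch_dir_from_cache is a hypothetical helper: it prints the value of the
# Torch_DIR entry that find_package(Torch) stores in a CMakeCache.txt file.
torch_dir_from_cache() {
    sed -n 's/^Torch_DIR:[A-Z]*=//p' "$1"
}

# CACHE_FILE is an assumed location: the cache in the current build directory.
CACHE_FILE=${CACHE_FILE:-CMakeCache.txt}
if [ -f "$CACHE_FILE" ]; then
    echo "Torch package config dir: $(torch_dir_from_cache "$CACHE_FILE")"
    echo "Host architecture:        $(uname -m)"
fi
```

The printed directory normally sits under the libtorch prefix (`<prefix>/share/cmake/Torch`), so the `libtorch.so` under that prefix can then be inspected with `file` or `readelf -h` and compared against the host architecture.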

LongchaoDa commented 12 months ago

Hi pal, I faced the same problem. Have you solved it?