intel / intel-extension-for-pytorch

A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
Apache License 2.0

Linking error when compiling v1.10.100 using Debian 11 #229

Open rafariossaa opened 2 years ago

rafariossaa commented 2 years ago

Hi, I am trying to compile this extension on Debian 11. Compilation succeeds, but the linking stage fails with the following error:

...
[100%] Linking CXX shared library packages/intel_extension_for_pytorch/lib/libintel-ext-pt-cpu.so
/usr/bin/ld: CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_conv_transpose.cpp.o:(.bss+0x0): multiple definition of `torch::jit::graph_rewrite::fuse_add_filter_v2'; CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_conv.cpp.o:(.bss+0x0): first defined here
/usr/bin/ld: CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_conv_transpose.cpp.o:(.bss+0x1): multiple definition of `torch::jit::graph_rewrite::fuse_add_filter_v1'; CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_conv.cpp.o:(.bss+0x1): first defined here
/usr/bin/ld: CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_conv_transpose.cpp.o:(.bss+0x2): multiple definition of `torch::jit::graph_rewrite::accumu_use_check'; CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_conv.cpp.o:(.bss+0x2): first defined here
/usr/bin/ld: CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_linear.cpp.o:(.bss+0x0): multiple definition of `torch::jit::graph_rewrite::fuse_add_filter_v2'; CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_conv.cpp.o:(.bss+0x0): first defined here
/usr/bin/ld: CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_linear.cpp.o:(.bss+0x1): multiple definition of `torch::jit::graph_rewrite::fuse_add_filter_v1'; CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_conv.cpp.o:(.bss+0x1): first defined here
/usr/bin/ld: CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_linear.cpp.o:(.bss+0x2): multiple definition of `torch::jit::graph_rewrite::accumu_use_check'; CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/graph_rewrite_conv.cpp.o:(.bss+0x2): first defined here
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/intel-ext-pt-cpu.dir/build.make:1262: packages/intel_extension_for_pytorch/lib/libintel-ext-pt-cpu.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:549: CMakeFiles/intel-ext-pt-cpu.dir/all] Error 2
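The six linker messages above are easier to read when grouped by symbol. The following is a minimal sketch, not part of the project: the `summarize_duplicates` helper and the `PREFIX`, `NS`, and `SAMPLE` names are hypothetical, and the sample text is rebuilt from the log lines above. It shows that three symbols are each defined in the same three object files. That pattern is consistent with a non-inline variable defined in a header shared by those files, but the thread does not confirm this cause.

```python
import re
from collections import defaultdict

# Matches GNU ld diagnostics of the form seen in the log above:
#   ld: <obj>:(<section>): multiple definition of `<sym>'; <obj>:(<section>): first defined here
DUP_RE = re.compile(
    r"ld: (?P<dup>\S+?):\([^)]*\): multiple definition of "
    r"[`'](?P<sym>[^']+)'; (?P<first>\S+?):\([^)]*\): first defined here"
)


def summarize_duplicates(log_text):
    """Map each duplicated symbol to the sorted object files that define it."""
    objects_by_symbol = defaultdict(set)
    for line in log_text.splitlines():
        match = DUP_RE.search(line)
        if match:
            objects_by_symbol[match["sym"]].update({match["dup"], match["first"]})
    return {sym: sorted(objs) for sym, objs in objects_by_symbol.items()}


# Sample input rebuilt from the six error lines in the report (hypothetical names).
PREFIX = "CMakeFiles/intel-ext-pt-cpu.dir/torch_ipex/csrc/jit/"
NS = "torch::jit::graph_rewrite::"
SYMBOLS = ("fuse_add_filter_v2", "fuse_add_filter_v1", "accumu_use_check")
SAMPLE = "\n".join(
    f"/usr/bin/ld: {PREFIX}{dup}.cpp.o:(.bss+0x{offset}): multiple definition of "
    f"`{NS}{sym}'; {PREFIX}graph_rewrite_conv.cpp.o:(.bss+0x{offset}): first defined here"
    for dup in ("graph_rewrite_conv_transpose", "graph_rewrite_linear")
    for offset, sym in enumerate(SYMBOLS)
)

if __name__ == "__main__":
    for symbol, objects in summarize_duplicates(SAMPLE).items():
        print(symbol)
        for obj in objects:
            print("   ", obj)
```

Piping a full build log through the same function would list any other duplicated symbols, if there are any beyond the three shown here.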

I compiled it inside a Debian 11 container, following these steps:

$ docker run -it debian:11.3 bash

root@c99c4eaad248:/# apt-get update
root@c99c4eaad248:/# apt-get install git build-essential less vim python3 python3-pip

root@c99c4eaad248:/# git clone --recursive https://github.com/intel/intel-extension-for-pytorch
root@c99c4eaad248:/# cd intel-extension-for-pytorch
root@c99c4eaad248:/intel-extension-for-pytorch# git checkout v1.10.100
root@c99c4eaad248:/intel-extension-for-pytorch# git submodule sync
root@c99c4eaad248:/intel-extension-for-pytorch# git submodule update --init --recursive

root@c99c4eaad248:/intel-extension-for-pytorch# pip3 install numpy torch==1.10.2+cpu  -f https://download.pytorch.org/whl/torch_stable.html

root@c99c4eaad248:/intel-extension-for-pytorch# export USE_MKLDNN=ON
root@c99c4eaad248:/intel-extension-for-pytorch# python3 setup.py install
The extension will be built with AVX512.
Building Intel Extension for PyTorch. Version: 1.10.100+cpu
running install
running build
running build_py
...
[100%] Linking CXX shared library packages/intel_extension_for_pytorch/lib/libintel-ext-pt-cpu.so
[same linker errors as above]

I saw that PyTorch issues use the collect_env.py script to gather environment information:

# python3 collect_env.py

Collecting environment information...
PyTorch version: 1.10.2+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Debian GNU/Linux 11 (bullseye) (x86_64)
GCC version: (Debian 10.2.1-6) 10.2.1 20210110
Clang version: Could not collect
CMake version: version 3.22.4
Libc version: glibc-2.31

Python version: 3.9.2 (default, Feb 28 2021, 17:03:44)  [GCC 10.2.1 20210110] (64-bit runtime)
Python platform: Linux-4.19.0-20-cloud-amd64-x86_64-with-glibc2.31
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.22.3
[pip3] torch==1.10.2+cpu
[conda] Could not collect

ashahba commented 2 years ago

Hi @rafariossaa, thanks for filing this issue. I'll look into it.

ashahba commented 2 years ago

Hi @rafariossaa, here's how I got the wheel to build successfully. From there you can take it to any OS (either Ubuntu or Debian) and install it.

FROM intel/intel-optimized-pytorch:1.10.0-conda as dev-base

ARG IPEX_BRANCH=v1.10.100
# Export the variable so that setup.py sees it; a bare assignment stays local to the shell.
RUN export USE_MKLDNN=ON && \
    git clone --recursive https://github.com/intel/intel-extension-for-pytorch -b ${IPEX_BRANCH} && \
    cd intel-extension-for-pytorch && \
    pip install --no-cache-dir numpy torch==1.10.2+cpu -f https://download.pytorch.org/whl/torch_stable.html && \
    python setup.py bdist_wheel && \
    pip install dist/*1.10.100*.whl

RUN python -c "import torch; import intel_extension_for_pytorch as ipex; print('torch:', torch.__version__,' ipex:',ipex.__version__)"

You should see this output when the last RUN command is executed:

torch: 1.10.2+cpu  ipex: 1.10.100+cpu

Then, in a multi-stage build, you can take this same wheel from dist and install it in the environment or Docker image of your choice.

I hope that helps.

rafariossaa commented 2 years ago

Hi @ashahba, the goal here is not so much to obtain an intel-extension package as to be able to compile it on Debian 11.

As far as I can see, intel/intel-optimized-pytorch:1.10.0-conda is based on Ubuntu 20.04 and uses GCC 9.3, whereas Debian 11 ships GCC 10.2.1.

$ docker run -it intel/intel-optimized-pytorch:1.10.0-conda bash

root@c24de7063da0:/# cat /etc/os-release 
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"

root@c24de7063da0:/# gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

$ docker run -it debian:11.3 bash

root@7104382938f4:/# apt-get update && apt-get install build-essential
...
root@7104382938f4:/# cat /etc/debian_version 
11.3

root@7104382938f4:/# gcc --version
gcc (Debian 10.2.1-6) 10.2.1 20210110

I think this issue originates in the code itself or in how the dependencies are compiled. Were you able to reproduce it on Debian 11 (GCC 10)?
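When comparing build environments like the two above, it can help to record the compiler's major version programmatically. This is a minimal sketch with a hypothetical `gcc_major` helper. It assumes the `gcc --version` banner has the shape shown in the two containers, with the distribution's package version in parentheses followed by the compiler version.

```python
import re
import shutil
import subprocess


def gcc_major(version_line):
    """Return the GCC major version from the first line of `gcc --version`."""
    # The distro package version sits inside the parentheses; the compiler
    # version is the dotted number right after the closing parenthesis.
    match = re.search(r"\)\s+(\d+)\.\d+", version_line)
    if match is None:
        raise ValueError(f"unrecognized gcc version line: {version_line!r}")
    return int(match.group(1))


# Banner lines copied from the two containers in this thread.
UBUNTU_BANNER = "gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0"
DEBIAN_BANNER = "gcc (Debian 10.2.1-6) 10.2.1 20210110"

if __name__ == "__main__":
    print("Ubuntu 20.04 image:", gcc_major(UBUNTU_BANNER))
    print("Debian 11 image:", gcc_major(DEBIAN_BANNER))
    # Query the local compiler only if one is installed.
    if shutil.which("gcc"):
        banner = subprocess.run(
            ["gcc", "--version"], capture_output=True, text=True, check=True
        ).stdout.splitlines()[0]
        print("local gcc:", gcc_major(banner))
```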

ashahba commented 2 years ago

@rafariossaa, let me get a fresh setup with Debian 11 + GCC 10 and I will update here.

Thanks.

EikanWang commented 2 years ago

@ashahba, will you submit a PR to the IPEX 1.12 release branch to address this issue?