zhanghang1989 / PyTorch-Encoding

A CV toolkit for my papers.
https://hangzhang.org/PyTorch-Encoding/
MIT License

Rebuild every time while importing #259

Open CorcovadoMing opened 4 years ago

CorcovadoMing commented 4 years ago

Hi,

I get the lib folder re-compiling every time I execute import encoding. What can I do to have a one-time installation without re-compiling all the CUDA code?

zhanghang1989 commented 4 years ago

They are cached on my machine; importing is fast the second time.

I don't have this issue. Are you modifying the .so, .o, or .ninja files by any chance? For example, an rsync or scp from another machine would overwrite the cached files.

CorcovadoMing commented 4 years ago

No, I've cloned a fresh copy of the repo again, and it still doesn't cache for me.

I do the following steps to install

git clone https://github.com/zhanghang1989/PyTorch-Encoding.git
cd PyTorch-Encoding
python setup.py install

I've also tried skipping python setup.py install and directly importing encoding from the pulled repo; still no cache.

I saw two directories generated after importing in the repo, dist and torch_encoding.egg-info, but they don't seem related to the compilation.

I do those steps in Docker with

zhanghang1989 commented 4 years ago

The .so, .o, and .ninja files should be generated the first time you import encoding. If not, maybe the Python path needs administrator access. I use Anaconda, which does not require admin access.
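
For reference, the extensions are JIT-built by PyTorch's torch.utils.cpp_extension.load from encoding/lib/__init__.py, so the cache lives next to the sources. Roughly (a simplified sketch; the exact source list is in the repo):

import os
from torch.utils.cpp_extension import load

gpu_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'gpu')
enclib_gpu = load(
    name='enclib_gpu',
    sources=[os.path.join(gpu_path, f) for f in (
        'operator.cpp', 'activation_kernel.cu', 'encoding_kernel.cu',
        'syncbn_kernel.cu', 'roi_align_kernel.cu', 'nms_kernel.cu',
        'rectify_cuda.cu')],
    build_directory=gpu_path,  # build artifacts are cached next to the sources
    verbose=False)             # set to True to print the ninja steps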

Could you try using python setup.py develop? This will generate a link to the source folder.

CorcovadoMing commented 4 years ago

I found it did generate the caches, but it still recompiles every time.

I've enabled verbose=True in __init__.py under lib to see what's going on:

>>> import encoding
Detected CUDA files, patching ldflags
Emitting ninja build file /root/PyTorch-Encoding/encoding/lib/gpu/build.ninja...
Building extension module enclib_gpu...
[1/7] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=enclib_gpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/lib/python3.6/site-packages/torch/include -isystem /opt/conda/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.6/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.6/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' --expt-extended-lambda -std=c++14 -c /root/PyTorch-Encoding/encoding/lib/gpu/roi_align_kernel.cu -o roi_align_kernel.cuda.o
...
[7/7] c++ operator.o activation_kernel.cuda.o encoding_kernel.cuda.o syncbn_kernel.cuda.o roi_align_kernel.cuda.o nms_kernel.cuda.o rectify_cuda.cuda.o -shared -L/opt/conda/lib/python3.6/site-packages/torch/lib -lc10 -lc10_cuda -ltorch_cpu -ltorch_cuda -ltorch -ltorch_python -L/usr/local/cuda/lib64 -lcudart -o enclib_gpu.so
Loading extension module enclib_gpu...

It ends up compiling successfully and the build artifacts are generated:

root@1132724:~/PyTorch-Encoding# ls encoding/lib/gpu/
activation_kernel.cu      common.h         encoding_kernel.cu      nms_kernel.cuda.o  operator.o           roi_align_kernel.cu      syncbn_kernel.cu
activation_kernel.cuda.o  device_tensor.h  encoding_kernel.cuda.o  operator.cpp       rectify_cuda.cu      roi_align_kernel.cuda.o  syncbn_kernel.cuda.o
build.ninja               enclib_gpu.so    nms_kernel.cu           operator.h         rectify_cuda.cuda.o  setup.py
root@1132724:~/PyTorch-Encoding# 

However, when I re-enter the Python interpreter and import encoding, it starts to re-compile again!
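
One thing I can check is timestamps: as far as I understand, ninja only rebuilds a target when one of its inputs looks newer than the output (or the build command changes), so a quick comparison like this (paths assumed from the listing above) should show whether something is touching the sources between imports:

import glob, os

gpu_dir = 'encoding/lib/gpu'  # assumed: run from the repo root
so_mtime = os.path.getmtime(os.path.join(gpu_dir, 'enclib_gpu.so'))
for pattern in ('*.cu', '*.cpp', '*.h'):
    for src in glob.glob(os.path.join(gpu_dir, pattern)):
        if os.path.getmtime(src) > so_mtime:
            print('newer than enclib_gpu.so:', src)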

Do you have any idea why this happens?

zhanghang1989 commented 4 years ago

Sorry, I don't have that issue on my machine. It looks like a PyTorch issue. Maybe you can try PyTorch 1.4.0, which is what I am using.
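
To rule out an environment mismatch, you could also print what the interpreter actually picks up (a minimal check; CUDA_HOME comes from the same cpp_extension module that does the JIT build):

import torch
from torch.utils.cpp_extension import CUDA_HOME

print(torch.__version__)   # I am on 1.4.0
print(torch.version.cuda)  # CUDA version PyTorch was built against
print(CUDA_HOME)           # toolkit used for the extension build (/usr/local/cuda in your log)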