zhanghang1989 / PyTorch-Encoding

A CV toolkit for my papers.
https://hangzhang.org/PyTorch-Encoding/
MIT License

ninja: build stopped: subcommand failed. #162

Closed OilGao closed 4 years ago

OilGao commented 5 years ago

/home/gaoxy/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h: In instantiation of ‘pybind11::object pybind11::detail::object_api::operator()(Args&& ...) const [with pybind11::return_value_policy policy = (pybind11::return_value_policy)1u; Args = {pybind11::handle&, pybind11::handle&}; Derived = pybind11::detail::accessor]’:
/home/gaoxy/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/pybind11/pytypes.h:884:27: required from ‘pybind11::str pybind11::str::format(Args&& ...) const [with Args = {pybind11::handle&, pybind11::handle&}]’
/home/gaoxy/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/pybind11/pybind11.h:749:72: required from here
/home/gaoxy/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h:2096:74: error: no matching function for call to ‘collect_arguments(pybind11::handle&, pybind11::handle&)’
return detail::collect_arguments(std::forward(args)...).call(derived().ptr());
^
/home/gaoxy/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h:2096:74: note: candidates are:
/home/gaoxy/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h:2075:1: note: template<pybind11::return_value_policy policy, class ... Args, class> pybind11::detail::simple_collector pybind11::detail::collect_arguments(Args&& ...)
simple_collector collect_arguments(Args &&...args) {
^
/home/gaoxy/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h:2075:1: note: template argument deduction/substitution failed:
/home/gaoxy/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h:2082:1: note: template<pybind11::return_value_policy policy, class ... Args, class> pybind11::detail::unpacking_collector pybind11::detail::collect_arguments(Args&& ...)
unpacking_collector collect_arguments(Args &&...args) {
^
/home/gaoxy/.conda/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/pybind11/cast.h:2082:1: note: template argument deduction/substitution failed:
ninja: build stopped: subcommand failed.

OilGao commented 5 years ago

Package Version

certifi 2018.11.29
cffi 1.11.5
chardet 3.0.4
idna 2.8
mkl-fft 1.0.6
mkl-random 1.0.2
nose 1.3.7
numpy 1.15.4
olefile 0.46
Pillow 5.3.0
pip 18.1
pycparser 2.19
requests 2.21.0
scipy 1.2.0
setuptools 40.6.3
six 1.12.0
torch 1.0.0
torch-encoding 1.0.1
torchvision 0.2.1
tqdm 4.28.1
urllib3 1.24.1
wheel 0.32.3

zilongzhong commented 5 years ago

Using CUDA 9.2 fixes this bug. Also make sure your environment paths are correct.

cqq0505 commented 5 years ago

Hello, I ran into exactly the same problem as you. Have you solved it yet? I installed CUDA 9.2 in a virtual environment, but it doesn't work. (The CUDA version on my server is 9.0.)

cqq0505 commented 5 years ago

Using cuda9.2 can kill this bug and make sure your environment path is correct.

Can I install CUDA 9.2 only in my virtual environment with the command conda install cudatoolkit=9.2? (The server uses CUDA 9.0.)

zilongzhong commented 5 years ago

Try uninstalling CUDA 9.0 and installing CUDA 9.2 as follows:

dpkg -l | grep cuda- | awk '{print $2}' | xargs -n1 sudo dpkg --purge
dpkg --install cuda-repo-ubuntu-9.-local*.deb
sudo apt-get update
sudo apt-get install cuda-9-2

cqq0505 commented 5 years ago

Thank you so much for your reply! Can I install CUDA 9.2 only in my virtual environment? The server is shared, so I can't update it casually.

zilongzhong commented 5 years ago

I haven't tried that, but you may be able to achieve it by changing the environment variables accordingly.
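One way to try this without touching the system-wide installation is to point the build at a second toolkit through environment variables. The following is a minimal sketch, assuming CUDA 9.2 has been installed side by side at the hypothetical path `/usr/local/cuda-9.2`. `CUDA_HOME` is a variable that PyTorch's extension builder consults; the helper name `cuda_env` is made up for illustration.

```python
import os


def cuda_env(cuda_home, base_env=None):
    """Return a copy of the environment that prefers the toolkit at cuda_home."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_HOME"] = cuda_home
    # Put this toolkit's nvcc and libraries ahead of any other CUDA install.
    env["PATH"] = os.pathsep.join(
        p for p in (os.path.join(cuda_home, "bin"), env.get("PATH", "")) if p
    )
    env["LD_LIBRARY_PATH"] = os.pathsep.join(
        p
        for p in (os.path.join(cuda_home, "lib64"), env.get("LD_LIBRARY_PATH", ""))
        if p
    )
    return env


# Hypothetical install location; adjust to wherever CUDA 9.2 actually lives.
env = cuda_env("/usr/local/cuda-9.2", base_env={"PATH": "/usr/bin"})
print(env["CUDA_HOME"])
print(env["PATH"])
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching the training script, so only that process sees the alternative toolkit. Whether this is enough depends on the driver and on how PyTorch itself was built, which the thread does not settle.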

cqq0505 commented 5 years ago

Haven't try that, maybe you could change environment variables accordingly to achieve it.

Thanks for your help! I installed another CUDA version (9.2) on the server, and that fixed the bug.

saikatkundu36 commented 5 years ago

Hi,

I am also facing a similar issue. When I first ran train.py with custom parameters, it gave the error below:

Saikat@Ganga:~/doc_image/ShelfNet/experiments/segmentation$ python train.py --diflr False --backbone resnet50 --dataset citys --checkname ShelfNet50_citys_coarse --lr-schedule step --batch-size 1 --epochs 1
Traceback (most recent call last):
  File "train.py", line 20, in <module>
    import encoding.utils as utils
  File "../../encoding/__init__.py", line 12, in <module>
    from .version import __version__
ModuleNotFoundError: No module named 'encoding.version'

Then I commented out the "encoding.version" import line, and it gave the error below:

(shelfnet) Saikat@Ganga:~/doc_image/ShelfNet/experiments/segmentation$ python train.py --diflr False --backbone resnet50 --dataset citys --checkname ShelfNet50_citys_coarse --lr-schedule step --batch-size 1 --epochs 1
['../../', '/home/Saikat/doc_image/ShelfNet/experiments/segmentation', '/home/Saikat/miniconda3/envs/shelfnet/lib/python37.zip', '/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7', '/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/lib-dynload', '/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages', '/home/Saikat/doc_image/ShelfNet']
Traceback (most recent call last):
  File "train.py", line 23, in <module>
    import encoding.utils as utils
  File "../../encoding/__init__.py", line 12, in <module>
    from .version import __version__
ModuleNotFoundError: No module named 'encoding.version'
(shelfnet) Saikat@Ganga:~/doc_image/ShelfNet/experiments/segmentation$
(shelfnet) Saikat@Ganga:~/doc_image/ShelfNet/experiments/segmentation$
(shelfnet) Saikat@Ganga:~/doc_image/ShelfNet/experiments/segmentation$ python train.py --diflr False --backbone resnet50 --dataset citys --checkname ShelfNet50_citys_coarse --lr-schedule step --batch-size 1 --epochs 1
['../../', '/home/Saikat/doc_image/ShelfNet/experiments/segmentation', '/home/Saikat/miniconda3/envs/shelfnet/lib/python37.zip', '/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7', '/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/lib-dynload', '/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages', '/home/Saikat/doc_image/ShelfNet']
Traceback (most recent call last):
  File "/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 946, in _build_extension_module
    check=True)
  File "/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/subprocess.py", line 487, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 23, in <module>
    import encoding.utils as utils
  File "../../encoding/__init__.py", line 13, in <module>
    from . import nn, functions, dilated, parallel, utils, models, datasets
  File "../../encoding/nn/__init__.py", line 12, in <module>
    from .encoding import *
  File "../../encoding/nn/encoding.py", line 18, in <module>
    from ..functions import scaledL2, aggregate, pairwise_cosine
  File "../../encoding/functions/__init__.py", line 2, in <module>
    from .encoding import *
  File "../../encoding/functions/encoding.py", line 14, in <module>
    from .. import lib
  File "../../encoding/lib/__init__.py", line 12, in <module>
    ], build_directory=cpu_path, verbose=False)
  File "/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 645, in load
    is_python_module)
  File "/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 814, in _jit_compile
    with_cuda=with_cuda)
  File "/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 863, in _write_ninja_file_and_build
    _build_extension_module(name, build_directory, verbose)
  File "/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 959, in _build_extension_module
    raise RuntimeError(message)
RuntimeError: Error building extension 'enclib_cpu': [1/3] c++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/TH -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/THC -isystem /home/Saikat/miniconda3/envs/shelfnet/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o
FAILED: roi_align_cpu.o
c++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/TH -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/THC -isystem /home/Saikat/miniconda3/envs/shelfnet/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp: In function ‘at::Tensor ROIAlignForwardCPU(const at::Tensor&, const at::Tensor&, int64_t, int64_t, double, int64_t)’:
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:407:30: error: ‘struct at::Type’ has no member named ‘tensor’; did you mean ‘renorm’?
auto output = input.type().tensor({num_rois, channels, pooled_height, pooled_width});
^~
In file included from /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/ATen/ATen.h:9:0,
from /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:1:
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:425:27: error: expected primary-expression before ‘>’ token
output.data());
^
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:425:29: error: expected primary-expression before ‘)’ token
output.data());
^
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:425:27: error: expected primary-expression before ‘>’ token
output.data());
^
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:425:29: error: expected primary-expression before ‘)’ token
output.data());
^
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp: In function ‘at::Tensor ROIAlignBackwardCPU(const at::Tensor&, const at::Tensor&, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, double, int64_t)’:
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:454:37: error: ‘struct at::Type’ has no member named ‘tensor’; did you mean ‘renorm’?
auto grad_in = bottom_rois.type().tensor({bsize, channels, height, width}).zero();
^~
In file included from /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/ATen/ATen.h:9:0,
from /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:1:
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before ‘>’ token
grad_in.data(),
^
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ‘)’ token
grad_in.data(),
^
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function:
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before ‘>’ token
grad_in.data(),
^
/home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ‘)’ token
grad_in.data(),
^
[2/3] c++ -MMD -MF roi_align.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/TH -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/THC -isystem /home/Saikat/miniconda3/envs/shelfnet/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align.cpp -o roi_align.o
In file included from /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align.cpp:1:0:
/home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include/torch/torch.h:7:2: warning: #warning "Including torch/torch.h for C++ extensions is deprecated. Please include torch/extension.h" [-Wcpp]
#warning \
^~~
ninja: build stopped: subcommand failed.

The results look fine when I test some Cityscapes images with pretrained weights, so I wanted to train on Cityscapes images. I am using CUDA 10.0 with PyTorch version 1. Any help is appreciated. Thanks.
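The final "ninja: build stopped: subcommand failed." line is only a summary; the actual cause sits in the compiler diagnostics above it. As a rough sketch of how to pull those out of a long log, the hypothetical helper `compiler_errors` below keeps only lines shaped like GCC's `file:line:col: error: ...` diagnostics. It assumes the log contains real newlines and GCC-style messages; the sample text is shortened from the log in this comment.

```python
import re


def compiler_errors(build_output):
    """Return the lines of a compiler log that report an actual error."""
    if isinstance(build_output, bytes):
        build_output = build_output.decode("utf-8", errors="replace")
    # GCC-style diagnostics look like "<file>:<line>:<col>: error: <message>"
    # (or "fatal error:"); ninja's own summary line does not match.
    pattern = re.compile(r"^\S.*?:\d+:\d+: (?:fatal )?error: .*$", re.MULTILINE)
    return pattern.findall(build_output)


sample = (
    "FAILED: roi_align_cpu.o\n"
    "roi_align_cpu.cpp:407:30: error: 'struct at::Type' has no member named 'tensor'\n"
    "ninja: build stopped: subcommand failed.\n"
)
for line in compiler_errors(sample):
    print(line)
```

Applied to the log above, this would surface the `at::Type` has no member named `tensor` errors. Those point to a mismatch between the extension's C++ sources and the installed PyTorch headers, not to ninja itself.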

FabianIsensee commented 5 years ago

Hi, I have got the same problem:


In [1]: import encoding                                                                                                                                                                                                                       
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
~/dl_venv_python3/lib/python3.6/site-packages/torch/utils/cpp_extension.py in _build_extension_module(name, build_directory, verbose)
    945                 cwd=build_directory,
--> 946                 check=True)
    947         else:

/usr/lib/python3.6/subprocess.py in run(input, timeout, check, *popenargs, **kwargs)
    417             raise CalledProcessError(retcode, process.args,
--> 418                                      output=stdout, stderr=stderr)
    419     return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
<ipython-input-1-4f6e8fe7bd37> in <module>
----> 1 import encoding

~/dl_venv_python3/lib/python3.6/site-packages/encoding/__init__.py in <module>
     11 """An optimized PyTorch package with CUDA backend."""
     12 from .version import __version__
---> 13 from . import nn, functions, parallel, utils, models, datasets, transforms

~/dl_venv_python3/lib/python3.6/site-packages/encoding/nn/__init__.py in <module>
     10 
     11 """Encoding NN Modules"""
---> 12 from .encoding import *
     13 from .syncbn import *
     14 from .customize import *

~/dl_venv_python3/lib/python3.6/site-packages/encoding/nn/encoding.py in <module>
     16 from torch.nn.modules.utils import _pair
     17 
---> 18 from ..functions import scaled_l2, aggregate, pairwise_cosine
     19 
     20 __all__ = ['Encoding', 'EncodingDrop', 'Inspiration', 'UpsampleConv2d']

~/dl_venv_python3/lib/python3.6/site-packages/encoding/functions/__init__.py in <module>
      1 """Encoding Autograd Fuctions"""
----> 2 from .encoding import *
      3 from .syncbn import *
      4 from .customize import *

~/dl_venv_python3/lib/python3.6/site-packages/encoding/functions/encoding.py in <module>
     12 from torch.autograd import Function, Variable
     13 import torch.nn.functional as F
---> 14 from .. import lib
     15 
     16 __all__ = ['aggregate', 'scaled_l2', 'pairwise_cosine']

~/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/__init__.py in <module>
     13         os.path.join(cpu_path, 'roi_align_cpu.cpp'),
     14         os.path.join(cpu_path, 'nms_cpu.cpp'),
---> 15     ], build_directory=cpu_path, verbose=False)
     16 
     17 if torch.cuda.is_available():

~/dl_venv_python3/lib/python3.6/site-packages/torch/utils/cpp_extension.py in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module)
    643         verbose,
    644         with_cuda,
--> 645         is_python_module)
    646 
    647 

~/dl_venv_python3/lib/python3.6/site-packages/torch/utils/cpp_extension.py in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module)
    812                     build_directory=build_directory,
    813                     verbose=verbose,
--> 814                     with_cuda=with_cuda)
    815             finally:
    816                 baton.release()

~/dl_venv_python3/lib/python3.6/site-packages/torch/utils/cpp_extension.py in _write_ninja_file_and_build(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda)
    861     if verbose:
    862         print('Building extension module {}...'.format(name))
--> 863     _build_extension_module(name, build_directory, verbose)
    864 
    865 

~/dl_venv_python3/lib/python3.6/site-packages/torch/utils/cpp_extension.py in _build_extension_module(name, build_directory, verbose)
    957         if hasattr(error, 'output') and error.output:
    958             message += ": {}".format(str(error.output))
--> 959         raise RuntimeError(message)
    960 
    961 

RuntimeError: Error building extension 'enclib_cpu': b'[1/6] c++ -MMD -MF operator.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/operator.cpp -o operator.o\nFAILED: operator.o \nc++ -MMD -MF operator.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/operator.cpp -o operator.o\nIn file included from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/Device.h:3:0,\n                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,\n                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/extension.h:6,\n                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/torch.h:6,\n                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/operator.h:1,\n        
         from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/operator.cpp:1:\n/home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/python_headers.h:9:20: fatal error: Python.h: No such file or directory\ncompilation terminated.\n[2/6] c++ -MMD -MF syncbn_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/syncbn_cpu.cpp -o syncbn_cpu.o\nFAILED: syncbn_cpu.o \nc++ -MMD -MF syncbn_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/syncbn_cpu.cpp -o syncbn_cpu.o\nIn file included from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/Device.h:3:0,\n                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,\n                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/extension.h:6,\n         
        from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/syncbn_cpu.cpp:1:\n/home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/python_headers.h:9:20: fatal error: Python.h: No such file or directory\ncompilation terminated.\n[3/6] c++ -MMD -MF nms_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/nms_cpu.cpp -o nms_cpu.o\nFAILED: nms_cpu.o \nc++ -MMD -MF nms_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/nms_cpu.cpp -o nms_cpu.o\nIn file included from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/Device.h:3:0,\n                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,\n                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/extension.h:6,\n                 from 
/home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/nms_cpu.cpp:1:\n/home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/python_headers.h:9:20: fatal error: Python.h: No such file or directory\ncompilation terminated.\n[4/6] c++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o\nFAILED: roi_align_cpu.o \nc++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o\nIn file included from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/Device.h:3:0,\n                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,\n                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/extension.h:6,\n   
              from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:1:\n/home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/python_headers.h:9:20: fatal error: Python.h: No such file or directory\ncompilation terminated.\n[5/6] c++ -MMD -MF encoding_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/encoding_cpu.cpp -o encoding_cpu.o\nninja: build stopped: subcommand failed.\n'

Any help would be greatly appreciated! (Ubuntu 16.04, CUDA 10.0, PyTorch nightly build from today.)

Best, Fabian
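Every failing step in this log stops at "fatal error: Python.h: No such file or directory", which suggests the compiler cannot see the CPython development headers. Below is a minimal check, assuming the directory reported by the standard `sysconfig` module is the relevant one. The directories the build actually searched are the `-isystem` flags in the log, and the helper name `python_header_status` is made up for illustration.

```python
import os
import sysconfig


def python_header_status():
    """Report where this interpreter expects its C headers and whether Python.h is there."""
    include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return include_dir, os.path.isfile(header)


include_dir, found = python_header_status()
print("include dir:", include_dir)
print("Python.h found:", found)
```

If the header is missing, installing the development package for the interpreter is the usual remedy on Ubuntu (for example `python3-dev`). Whether that resolves this particular virtualenv setup is not confirmed in the thread.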

zilongzhong commented 5 years ago

Make sure $CUDA_HOME points to your CUDA 9.2 installation, then use Python 3.6 instead of 3.7. For example: conda install python=3.6.7
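A quick way to confirm both conditions is to compare the toolkit release and the interpreter version against this advice. The sketch below parses the text printed by `nvcc --version`, whose last line has the form shown in `sample`. The function names are hypothetical, and the 9.2 / 3.6 targets simply mirror the suggestion in this comment.

```python
import re
import sys


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the text printed by `nvcc --version`, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def matches_advice(nvcc_output, python_version=sys.version_info):
    """True when the toolkit is CUDA 9.2 and the interpreter is Python 3.6."""
    return (
        parse_nvcc_release(nvcc_output) == (9, 2)
        and tuple(python_version[:2]) == (3, 6)
    )


sample = "Cuda compilation tools, release 9.2, V9.2.148"
print(parse_nvcc_release(sample))
print(matches_advice(sample, (3, 6, 7)))
```

In practice the text would come from running the `nvcc` binary under `$CUDA_HOME/bin`, for instance with `subprocess.run(..., stdout=subprocess.PIPE)`, so that the check reflects the toolkit the build will actually pick up.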

auto grad_in = bottom_rois.type().tensor({b_size, channels, height, width}).zero(); ^~ In file included from /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/ATen/ATen.h:9:0, from /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:1: /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function: /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before ‘>’ token grad_in.data(), ^ /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ‘)’ token grad_in.data(), ^ /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp: In lambda function: /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:470:28: error: expected primary-expression before ‘>’ token grad_in.data(), ^ /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align_cpu.cpp:470:30: error: expected primary-expression before ‘)’ token grad_in.data(), ^ [2/3] c++ -MMD -MF roi_align.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/TH -isystem /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/THC -isystem /home/Saikat/miniconda3/envs/shelfnet/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align.cpp -o roi_align.o In file included from /home/Saikat/doc_image/ShelfNet/encoding/lib/cpu/roi_align.cpp:1:0: /home/Saikat/miniconda3/envs/shelfnet/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include/torch/torch.h:7:2: warning: #warning "Including torch/torch.h for C++ 
extensions is deprecated. Please include torch/extension.h" [-Wcpp]

warning

^~~ ninja: build stopped: subcommand failed.


FabianIsensee commented 5 years ago

Hi, thanks for your reply! Unfortunately, this did not solve my problem:

(dl_venv_python3) fabian@E132-zelos:~$ ipython
Python 3.6.3 (default, Oct  6 2017, 08:44:35) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import encoding                                                                                                                                                                                                                       
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
~/dl_venv_python3/lib/python3.6/site-packages/torch/utils/cpp_extension.py in _build_extension_module(name, build_directory, verbose)
    945                 cwd=build_directory,
--> 946                 check=True)
    947         else:

/usr/lib/python3.6/subprocess.py in run(input, timeout, check, *popenargs, **kwargs)
    417             raise CalledProcessError(retcode, process.args,
--> 418                                      output=stdout, stderr=stderr)
    419     return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
<ipython-input-1-4f6e8fe7bd37> in <module>
----> 1 import encoding

~/dl_venv_python3/lib/python3.6/site-packages/encoding/__init__.py in <module>
     11 """An optimized PyTorch package with CUDA backend."""
     12 from .version import __version__
---> 13 from . import nn, functions, parallel, utils, models, datasets, transforms

~/dl_venv_python3/lib/python3.6/site-packages/encoding/nn/__init__.py in <module>
     10 
     11 """Encoding NN Modules"""
---> 12 from .encoding import *
     13 from .syncbn import *
     14 from .customize import *

~/dl_venv_python3/lib/python3.6/site-packages/encoding/nn/encoding.py in <module>
     16 from torch.nn.modules.utils import _pair
     17 
---> 18 from ..functions import scaled_l2, aggregate, pairwise_cosine
     19 
     20 __all__ = ['Encoding', 'EncodingDrop', 'Inspiration', 'UpsampleConv2d']

~/dl_venv_python3/lib/python3.6/site-packages/encoding/functions/__init__.py in <module>
      1 """Encoding Autograd Fuctions"""
----> 2 from .encoding import *
      3 from .syncbn import *
      4 from .customize import *

~/dl_venv_python3/lib/python3.6/site-packages/encoding/functions/encoding.py in <module>
     12 from torch.autograd import Function, Variable
     13 import torch.nn.functional as F
---> 14 from .. import lib
     15 
     16 __all__ = ['aggregate', 'scaled_l2', 'pairwise_cosine']

~/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/__init__.py in <module>
     13         os.path.join(cpu_path, 'roi_align_cpu.cpp'),
     14         os.path.join(cpu_path, 'nms_cpu.cpp'),
---> 15     ], build_directory=cpu_path, verbose=False)
     16 
     17 if torch.cuda.is_available():

~/dl_venv_python3/lib/python3.6/site-packages/torch/utils/cpp_extension.py in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module)
    643         verbose,
    644         with_cuda,
--> 645         is_python_module)
    646 
    647 

~/dl_venv_python3/lib/python3.6/site-packages/torch/utils/cpp_extension.py in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module)
    812                     build_directory=build_directory,
    813                     verbose=verbose,
--> 814                     with_cuda=with_cuda)
    815             finally:
    816                 baton.release()

~/dl_venv_python3/lib/python3.6/site-packages/torch/utils/cpp_extension.py in _write_ninja_file_and_build(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda)
    861     if verbose:
    862         print('Building extension module {}...'.format(name))
--> 863     _build_extension_module(name, build_directory, verbose)
    864 
    865 

~/dl_venv_python3/lib/python3.6/site-packages/torch/utils/cpp_extension.py in _build_extension_module(name, build_directory, verbose)
    957         if hasattr(error, 'output') and error.output:
    958             message += ": {}".format(error.output.decode())
--> 959         raise RuntimeError(message)
    960 
    961 

RuntimeError: Error building extension 'enclib_cpu': [1/5] c++ -MMD -MF syncbn_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/syncbn_cpu.cpp -o syncbn_cpu.o
FAILED: syncbn_cpu.o 
c++ -MMD -MF syncbn_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/syncbn_cpu.cpp -o syncbn_cpu.o
In file included from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/python.h:6:0,
                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/extension.h:6,
                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/syncbn_cpu.cpp:1:
/home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/python_headers.h:9:20: fatal error: Python.h: No such file or directory
compilation terminated.
[2/5] c++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o
FAILED: roi_align_cpu.o 
c++ -MMD -MF roi_align_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp -o roi_align_cpu.o
In file included from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/python.h:6:0,
                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/extension.h:6,
                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/roi_align_cpu.cpp:1:
/home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/python_headers.h:9:20: fatal error: Python.h: No such file or directory
compilation terminated.
[3/5] c++ -MMD -MF operator.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/operator.cpp -o operator.o
FAILED: operator.o 
c++ -MMD -MF operator.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/operator.cpp -o operator.o
In file included from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/python.h:6:0,
                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/extension.h:6,
                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/torch.h:6,
                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/operator.h:1,
                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/operator.cpp:1:
/home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/python_headers.h:9:20: fatal error: Python.h: No such file or directory
compilation terminated.
[4/5] c++ -MMD -MF nms_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/nms_cpu.cpp -o nms_cpu.o
FAILED: nms_cpu.o 
c++ -MMD -MF nms_cpu.o.d -DTORCH_EXTENSION_NAME=enclib_cpu -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/TH -isystem /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/THC -isystem /home/fabian/dl_venv_python3/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/nms_cpu.cpp -o nms_cpu.o
In file included from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include/torch/python.h:6:0,
                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/extension.h:6,
                 from /home/fabian/dl_venv_python3/lib/python3.6/site-packages/encoding/lib/cpu/nms_cpu.cpp:1:
/home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/python_headers.h:9:20: fatal error: Python.h: No such file or directory
compilation terminated.
ninja: build stopped: subcommand failed.

In [2]:    
(dl_venv_python3) fabian@E132-zelos:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148
(dl_venv_python3) fabian@E132-zelos:~$

I am on cuda-9.2 now. Python is 3.6.3.

The problem seems to be that for whatever reason it does not find Python.h: /home/fabian/dl_venv_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/python_headers.h:9:20: fatal error: Python.h: No such file or directory

I have the python-dev packages installed, and my Python.h is located at /usr/include/python3.6m/Python.h. Do I need to add this folder to some environment variable?

Best, Fabian
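As a rough way to narrow this down, the sketch below checks which candidate directories actually contain Python.h. The directory list is an assumption taken from the log above (the `-isystem` path inside the virtualenv and the system path mentioned in the comment); `find_python_header` is a hypothetical helper, not part of PyTorch or PyTorch-Encoding. If the header only exists in the system directory, GCC's standard `CPATH` or `CPLUS_INCLUDE_PATH` environment variables are one possible way to add that directory to the search path, though whether that is the right fix here is not confirmed in this thread.

```python
import os
import sysconfig


def find_python_header(candidate_dirs):
    """Map each candidate directory to True/False depending on whether
    it contains a Python.h file."""
    return {d: os.path.isfile(os.path.join(d, "Python.h")) for d in candidate_dirs}


if __name__ == "__main__":
    # Candidate directories are assumptions based on the build log above.
    candidates = [
        sysconfig.get_paths()["include"],           # where this interpreter expects headers
        os.path.expanduser("~/dl_venv_python3/include/python3.6m"),
        "/usr/include/python3.6m",
    ]
    for directory, found in find_python_header(candidates).items():
        print("{}: {}".format(directory, "Python.h found" if found else "no Python.h"))
```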

zilongzhong commented 5 years ago

Can you try python3.6.7 and show me the result of $CUDA_HOME?
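For anyone collecting the same information, here is a minimal sketch that reports `CUDA_HOME` and where `nvcc` resolves on the `PATH`. It uses only the standard library; `cuda_env_report` is a hypothetical helper name, and checking `CUDA_PATH` as well is an assumption, since some setups use that variable instead.

```python
import os
import shutil


def cuda_env_report(environ=None):
    """Collect CUDA-related environment settings from a mapping
    (defaults to the current process environment)."""
    env = os.environ if environ is None else environ
    return {
        "CUDA_HOME": env.get("CUDA_HOME"),
        "CUDA_PATH": env.get("CUDA_PATH"),
        # Location of the nvcc executable found on PATH, or None if absent.
        "nvcc": shutil.which("nvcc", path=env.get("PATH")),
    }


if __name__ == "__main__":
    for key, value in cuda_env_report().items():
        print("{} = {}".format(key, value))
```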

zhanghang1989 commented 5 years ago

I found that some recent PyTorch updates broke this package. Please use PyTorch version 1.0.0.

conda uninstall pytorch
conda install pytorch==1.0.0 -c pytorch
zhanghang1989 commented 5 years ago

Just an update: CUDA toolkit 10.0 fixes the problem. The cause is the newer pre-built PyTorch binary.

LiuPearl1 commented 5 years ago

Haven't try that, maybe you could change environment variables accordingly to achieve it.

thanks for your help! I install another cuda version(9.2) on the server, which can kill this bug.

Hi, I would like to know how you finally solved this problem. Did you reinstall CUDA 9.2 system-wide or in a virtual environment? I have been stuck on this for a long time. Thanks.

zilongzhong commented 5 years ago

In an Anaconda virtual environment.

Best regards, Zilong

zihao-lu commented 5 years ago

Using cuda9.2 can kill this bug and make sure your environment path is correct.

This solution worked for me. Can you tell me why it worked? It seems that CUDA 9.0 was mismatched with pybind11 and encoding. Thx~

zihao-lu commented 5 years ago

And I'd like to share my environment; maybe this can help someone:

Ubuntu 16.04
Anaconda3-5.2.0
python 3.6.7
cuda 9.2
cudnn 7.3.1
pytorch v1.0.0 from source

install dependencies followed this page

git checkout v1.0.0
git submodule update --init --recursive
cd third_party/ideep/mkl-dnn
git checkout v0.18.1
cp mkldnn_version.h.in mkldnn_version.h
cd -
python setup.py install

zilongzhong commented 5 years ago

In my experience, the likely reason is package compatibility. For example, CUDA 9.0 works well with PyTorch 0.4, and CUDA 9.2/10 work with PyTorch 1.0.

Best regards, Zilong
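
The pairings mentioned in this thread can be written down as a small lookup. This only summarizes what commenters reported, not an official compatibility matrix; the names `REPORTED_PAIRINGS` and `reported_pytorch_for` are hypothetical.

```python
# CUDA release -> PyTorch versions that commenters in this thread reported
# as working together. Anecdotal, not an official compatibility table.
REPORTED_PAIRINGS = {
    "9.0": ["0.4"],
    "9.2": ["1.0"],
    "10.0": ["1.0"],
}


def reported_pytorch_for(cuda_release):
    """Return the PyTorch versions reported to work with a CUDA release."""
    return REPORTED_PAIRINGS.get(cuda_release, [])


print(reported_pytorch_for("9.2"))
```

An empty list means nobody in the thread reported on that CUDA release, not that the combination fails.
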

henryace commented 5 years ago

Thank you all. I finally solved this problem with the following setup: Python 3.6.7, CUDA 10.0, PyTorch 1.0.0.