pytorch / extension-cpp

C++ extensions in PyTorch

Compilation issues on OS X #1

Closed tsrxq closed 6 years ago

tsrxq commented 6 years ago

Setup: Latest OS X and pytorch built from master (no GPU).

The first error comes from torch.cuda not being present: https://github.com/pytorch/pytorch/blob/abd8501020d16e9aa12fa60dfd38ed70b8d7b71e/torch/utils/cpp_extension.py#L45. I manually set it to None.

The next one is related to compiler flags. If I run python setup.py install, I get the following error:

fatal error: 'atomic' file not found
#include <atomic>
         ^~~~~~~~
1 error generated.
error: command 'gcc' failed with exit status 1

That can be fixed by passing CFLAGS='-stdlib=libc++'.
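For reference, a minimal sketch of that invocation (assuming a standard setup.py in the extension's source directory; on some toolchains CXXFLAGS needs the same flag as well):

```shell
# Pass the flag on the same command line so the compiler calls made by
# setup.py inherit it. The file check just makes this a no-op outside a
# project directory.
if [ -f setup.py ]; then
    CFLAGS='-stdlib=libc++' python setup.py install
fi
```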

The next problem comes when I try to import the extension:

ImportError: dlopen(/Users/michael/miniconda3/lib/python3.6/site-packages/lltm_cpp-0.0.0-py3.6-macosx-10.7-x86_64.egg/lltm_cpp.cpython-36m-darwin.so, 2): Symbol not found: _THPVariableClass
  Referenced from: /Users/michael/miniconda3/lib/python3.6/site-packages/lltm_cpp-0.0.0-py3.6-macosx-10.7-x86_64.egg/lltm_cpp.cpython-36m-darwin.so
  Expected in: flat namespace
 in /Users/michael/miniconda3/lib/python3.6/site-packages/lltm_cpp-0.0.0-py3.6-macosx-10.7-x86_64.egg/lltm_cpp.cpython-36m-darwin.so
tsrxq commented 6 years ago

The last error can be fixed by importing torch first:

import torch
import lltm_cpp
goldsborough commented 6 years ago
  1. Where does the first error appear? I'm not sure how torch.cuda could not be present; it's a subpackage of the torch package and is always present.
  2. I have successfully built these extensions on macOS with system clang and with gcc-7, so I cannot reproduce your compiler errors. It seems like an issue with your gcc.
  3. This is described in the tutorial, once it gets published.
tsrxq commented 6 years ago
  1. My branch got heavily messed up; synchronising with master fixed the issue.
  2. Switching to gcc/g++ helps. It turns out gcc/g++ is advised in the tutorial.

Closing it, thanks for help!

rusty1s commented 6 years ago

Sorry for bothering you again and for reopening this ticket.

I tried to get CUDAExtension to run on macOS and ran into some trouble, so maybe you can help me (it seems that you are also on a mac?).

You wrote in your tutorial:

On MacOS, you will have to download GCC (e.g. brew install gcc will give you GCC 7 at the time of this writing). In the worst case, you can build PyTorch from source with your compiler and then build the extension with that same compiler.

Compiling PyTorch with clang and clang++ works fine; however, it fails when using gcc-7 and g++-7, printing the following error message:

nvcc fatal: GNU C/C++ compiler is no longer supported as a host compiler on Mac OS X.

After browsing the internet, this error message makes sense to me: it seems that nvcc does not support gcc/g++ as a host compiler.

In addition, using clang/clang++ or gcc-7/g++-7 to compile the CUDAExtension does not work either.

Using clang/clang++, I get the error:

fatal error: 'atomic' file not found

Using gcc-7/g++-7, I get the error:

/usr/local/cuda/include/crt/math_functions.h(9457): error: namespace "__gnu_cxx" has no member "__promote_2"

Any idea on how to solve this problem?

apaszke commented 6 years ago

For the first error: you simply can't use gcc-7 to compile PyTorch on Mac, because the CUDA toolkit on macOS doesn't support it as a host compiler.

The second one looks like a bug, cc @goldsborough.

goldsborough commented 6 years ago

Yeah, we actually build PyTorch with clang on Mac, so my tutorial is wrong here. As for the issue: apparently we don't support CUDA on Mac, so all bets are off here, unfortunately :/

rusty1s commented 6 years ago

Just wanted to let you know that I fixed the problem. The actual problem lies not with PyTorch and CUDA on macOS, but with distutils (https://github.com/cudamat/cudamat/issues/39).

I noticed that the nvcc call works fine when I type it manually into the console, but fails when called via spawn. I fixed the problem by replacing

def spawn(self, cmd):
    spawn(cmd, dry_run=self.dry_run)

with

def spawn(self, cmd):
    subprocess.call(cmd)

in python3.6/distutils/ccompiler.py. If you know of a more elegant fix, please let me know.
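A less invasive variant of the same workaround (a sketch, not an official API: the spawn replacement below mirrors the subprocess.call fix above, with a return-code check added, and could be monkey-patched onto distutils' CCompiler from setup.py instead of editing the installed ccompiler.py):

```python
import subprocess

def spawn(self, cmd, **kwargs):
    """Drop-in replacement for distutils' CCompiler.spawn that runs the
    command directly via subprocess, mirroring the fix described above."""
    ret = subprocess.call(cmd)
    if ret != 0:
        raise RuntimeError("command {!r} failed with exit code {}".format(cmd, ret))

# Hypothetical usage from setup.py, before calling setup():
#   from distutils.ccompiler import CCompiler
#   CCompiler.spawn = spawn
```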