pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

Error building from source on macOS #5244

Closed rudedogg closed 4 years ago

rudedogg commented 6 years ago

What I'm trying to do

Build PyTorch from source in a conda env so that I have CUDA support on macOS. I'm using the fish shell.

conda activate fastai
xcode-select --switch /Library/Developer/CommandLineTools
env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install

What goes wrong

clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/usr/local/anaconda3/envs/fastai/include -arch x86_64 -I/usr/local/anaconda3/envs/fastai/include -arch x86_64 -I/usr/local/cuda/include -I/Users/rudedogg/Development/Contrib/pytorch -I/Users/rudedogg/Development/Contrib/pytorch/torch/csrc -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/pybind11/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/TH -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THNN -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/ATen -I/usr/local/anaconda3/envs/fastai/lib/python3.6/site-packages/numpy/core/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THD -I/usr/local/cuda/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THCUNN -I/usr/local/anaconda3/envs/fastai/include/python3.6m -c /Users/rudedogg/Development/Contrib/pytorch/torch/csrc/generated/cuda_TensorDouble.cpp -o build/temp.macosx-10.7-x86_64-3.6/Users/rudedogg/Development/Contrib/pytorch/torch/csrc/generated/cuda_TensorDouble.o -D_THP_CORE -std=c++11 -Wno-write-strings -fno-strict-aliasing -Wno-missing-braces -DWITH_NUMPY -DWITH_DISTRIBUTED -DWITH_CUDA -DCUDA_LIB_PATH=/usr/local/cuda/lib -DWITH_CUDNN -DWITH_SCALARS
clang: error: unable to execute command: Segmentation fault: 11
clang: error: clang frontend command failed due to signal (use -v to see invocation)
Apple LLVM version 8.1.0 (clang-802.0.42)
Target: x86_64-apple-darwin17.4.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Let me know if I can provide more information.
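The compile line above is long enough that the file being compiled is easy to miss. The following is a minimal sketch, not part of the original report, that pulls the `-c` source and `-o` output out of such a command line. The helper name `compile_unit` is hypothetical, and the sample command is an abbreviated copy of the one above.

```python
import shlex


def compile_unit(command):
    """Return (source, output) taken from the -c and -o flags of a compiler command.

    Either element is None when the corresponding flag is missing.
    """
    args = shlex.split(command)

    def value_after(flag):
        # The value is the argument that directly follows the flag.
        if flag in args and args.index(flag) + 1 < len(args):
            return args[args.index(flag) + 1]
        return None

    return value_after("-c"), value_after("-o")


# Abbreviated version of the failing command (most -I and -D flags dropped).
cmd = (
    "clang -O3 -Wall -I/usr/local/cuda/include "
    "-c /Users/rudedogg/Development/Contrib/pytorch/torch/csrc/generated/cuda_TensorDouble.cpp "
    "-o build/temp.macosx-10.7-x86_64-3.6/Users/rudedogg/Development/Contrib/pytorch/torch/csrc/generated/cuda_TensorDouble.o "
    "-std=c++11 -DWITH_CUDA"
)
source, output = compile_unit(cmd)
print("source:", source)
print("output:", output)
```

Running the same helper over each failing command makes it easier to compare which translation units crash from one attempt to the next.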

rudedogg commented 6 years ago

Made a few more attempts with some config changes.

Command

Using Xcode 8 build tools

env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.13 CC=clang CXX=clang++ CXXFLAGS=-stdlib=libc++ python setup.py install

Error

clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/usr/local/anaconda3/envs/fastai/include -arch x86_64 -I/usr/local/anaconda3/envs/fastai/include -arch x86_64 -I/usr/local/cuda/include -I/Users/rudedogg/Development/Contrib/pytorch -I/Users/rudedogg/Development/Contrib/pytorch/torch/csrc -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/pybind11/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/TH -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THNN -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/ATen -I/usr/local/anaconda3/envs/fastai/lib/python3.6/site-packages/numpy/core/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THD -I/usr/local/cuda/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THCUNN -I/usr/local/anaconda3/envs/fastai/include/python3.6m -c torch/csrc/cuda/AutoGPU.cpp -o build/temp.macosx-10.7-x86_64-3.6/torch/csrc/cuda/AutoGPU.o -D_THP_CORE -std=c++11 -Wno-write-strings -fno-strict-aliasing -Wno-missing-braces -DWITH_NUMPY -DWITH_DISTRIBUTED -DWITH_CUDA -DCUDA_LIB_PATH=/usr/local/cuda/lib -DWITH_CUDNN -DWITH_SCALARS
clang: error: unable to execute command: Segmentation fault: 11
clang: error: clang frontend command failed due to signal (use -v to see invocation)
Apple LLVM version 8.1.0 (clang-802.0.42)
Target: x86_64-apple-darwin17.4.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
clang: note: diagnostic msg: PLEASE submit a bug report to http://developer.apple.com/bugreporter/ and include the crash backtrace, preprocessed source, and associated run script.
clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/usr/local/anaconda3/envs/fastai/include -arch x86_64 -I/usr/local/anaconda3/envs/fastai/include -arch x86_64 -I/usr/local/cuda/include -I/Users/rudedogg/Development/Contrib/pytorch -I/Users/rudedogg/Development/Contrib/pytorch/torch/csrc -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/pybind11/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/TH -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THNN -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/ATen -I/usr/local/anaconda3/envs/fastai/lib/python3.6/site-packages/numpy/core/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THD -I/usr/local/cuda/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THCUNN -I/usr/local/anaconda3/envs/fastai/include/python3.6m -c torch/csrc/cuda/python_comm.cpp -o build/temp.macosx-10.7-x86_64-3.6/torch/csrc/cuda/python_comm.o -D_THP_CORE -std=c++11 -Wno-write-strings -fno-strict-aliasing -Wno-missing-braces -DWITH_NUMPY -DWITH_DISTRIBUTED -DWITH_CUDA -DCUDA_LIB_PATH=/usr/local/cuda/lib -DWITH_CUDNN -DWITH_SCALARS
clang: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /var/folders/b7/fqvwqsh503zbn1_xj3qlwh0h0000gn/T/python_nn_functions-5cabcd.cpp
clang: note: diagnostic msg: /var/folders/b7/fqvwqsh503zbn1_xj3qlwh0h0000gn/T/python_nn_functions-5cabcd.sh
clang: note: diagnostic msg: Crash backtrace is located in
clang: note: diagnostic msg: /Users/rudedogg/Library/Logs/DiagnosticReports/clang_<YYYY-MM-DD-HHMMSS>_<hostname>.crash
clang: note: diagnostic msg: (choose the .crash file that corresponds to your crash)
clang: note: diagnostic msg:

Crash Log

Process:               clang [21312]
Path:                  /Library/Developer/CommandLineTools/usr/bin/clang
Identifier:            clang
Version:               8.1.0 (802.0.42)
Code Type:             X86-64 (Native)
Parent Process:        clang [21311]
Responsible:           clang [21312]
User ID:               501

Date/Time:             2018-02-14 13:21:34.467 -0700
OS Version:            Mac OS X 10.13.3 (17D47)
Report Version:        12
Anonymous UUID:        C9E0E8D3-1D83-174F-5E70-ABDE4079ED63

Time Awake Since Boot: 2500 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       EXC_I386_GPFLT
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [0]

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   clang                           0x0000000104423388 unsigned int llvm::DFSPass<llvm::GraphTraits<llvm::MachineBasicBlock*> >(llvm::DominatorTreeBase<llvm::GraphTraits<llvm::MachineBasicBlock*>::NodeType>&, llvm::GraphTraits<llvm::MachineBasicBlock*>::NodeType*, unsigned int) + 312
1   clang                           0x00000001044226d7 void llvm::Calculate<llvm::MachineFunction, llvm::MachineBasicBlock*>(llvm::DominatorTreeBase<llvm::GraphTraits<llvm::MachineBasicBlock*>::NodeType>&, llvm::MachineFunction&) + 2167
2   clang                           0x0000000104421b12 void llvm::DominatorTreeBase<llvm::MachineBasicBlock>::recalculate<llvm::MachineFunction>(llvm::MachineFunction&) + 114
3   clang                           0x0000000104421a4f llvm::MachineDominatorTree::runOnMachineFunction(llvm::MachineFunction&) + 63
4   clang                           0x0000000103e833f0 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 112
5   clang                           0x0000000103e4e04e llvm::FPPassManager::runOnFunction(llvm::Function&) + 574
6   clang                           0x0000000103e6cc73 llvm::FPPassManager::runOnModule(llvm::Module&) + 51
7   clang                           0x0000000103e52709 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 489
8   clang                           0x0000000103e2799d clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) + 5261
9   clang                           0x0000000103dff9b9 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) + 393
10  clang                           0x0000000103b9b279 clang::ParseAST(clang::Sema&, bool, bool) + 265
11  clang                           0x0000000103b98845 clang::FrontendAction::Execute() + 37
12  clang                           0x0000000103b5c844 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 292
13  clang                           0x0000000103b5a913 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 3107
14  clang                           0x0000000103b29196 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) + 3510
15  clang                           0x0000000103b24c5b main + 9051
16  libdyld.dylib                   0x00007fff7bf93115 start + 1

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x00007fd25695d780  rbx: 0x0000000000000000  rcx: 0xf26b5b00007fd259  rdx: 0x0000000000000089
  rdi: 0x0000000000000600  rsi: 0x0000000000000200  rbp: 0x00007ffeec0ea420  rsp: 0x00007ffeec0ea1a0
   r8: 0x0000000000000001   r9: 0x00007fd25695be00  r10: 0x00007ffeec0ea1c0  r11: 0x0000000000000000
  r12: 0x00007fd2590c16d0  r13: 0x00007fd253ae3f9b  r14: 0x0000000000000088  r15: 0xf26b5b00007fd259
  rip: 0x0000000104423388  rfl: 0x0000000000010246  cr2: 0x0000000105282190

Logical CPU:     8
Error Code:      0x00000000
Trap Number:     13

Binary Images:
       0x103b12000 -        0x1069f5ffb +clang (8.1.0 - 802.0.42) <CB7A9FD1-3417-3B9F-93BC-9D06FCC0DB00> /Library/Developer/CommandLineTools/usr/bin/clang
       0x10bca1000 -        0x10bceb98f  dyld (519.2.2) <6695F30B-4E88-3C0B-9867-7D738C44A3E6> /usr/lib/dyld
    0x7fff7987c000 -     0x7fff798affff  libclosured.dylib (519.2.2) <48051216-5647-3643-B979-B77D0FD20011> /usr/lib/closure/libclosured.dylib
    0x7fff79d8e000 -     0x7fff79d8fff3  libSystem.B.dylib (1252) <47329E26-DC23-3EBA-9461-37755368327D> /usr/lib/libSystem.B.dylib
    0x7fff79fc2000 -     0x7fff7a018fff  libc++.1.dylib (400.9) <FCF5E1F6-2B04-3545-8004-F3AB32FED172> /usr/lib/libc++.1.dylib
    0x7fff7a019000 -     0x7fff7a03dff7  libc++abi.dylib (400.7) <217656D5-BC40-37FF-B322-91CB2AAD4F34> /usr/lib/libc++abi.dylib
    0x7fff7b06f000 -     0x7fff7b09fffb  libncurses.5.4.dylib (53) <030DF747-F71B-367A-83EE-2F30B7947929> /usr/lib/libncurses.5.4.dylib
    0x7fff7b38f000 -     0x7fff7b77d7e7  libobjc.A.dylib (723) <93A92316-DE1E-378C-8891-99720B50D075> /usr/lib/libobjc.A.dylib
    0x7fff7bd7a000 -     0x7fff7bd8cffb  libz.1.dylib (70) <48C67CFC-940D-3857-8DAD-857774605352> /usr/lib/libz.1.dylib
    0x7fff7be2a000 -     0x7fff7be2eff7  libcache.dylib (80) <354F3B7D-404E-3398-9EBF-65CA2CE65211> /usr/lib/system/libcache.dylib
    0x7fff7be2f000 -     0x7fff7be39ff3  libcommonCrypto.dylib (60118.30.2) <674286D3-7744-36A3-9AAA-49DFCD97A986> /usr/lib/system/libcommonCrypto.dylib
    0x7fff7be3a000 -     0x7fff7be41fff  libcompiler_rt.dylib (62) <4487CFBA-A5D7-3282-9E6B-94CAD7BE507E> /usr/lib/system/libcompiler_rt.dylib
    0x7fff7be42000 -     0x7fff7be4affb  libcopyfile.dylib (146.30.2) <2C7C67D7-562B-3FFA-973D-BACF4C10E1EC> /usr/lib/system/libcopyfile.dylib
    0x7fff7be4b000 -     0x7fff7bed0fff  libcorecrypto.dylib (562.30.10) <8A53EFE1-AFCA-3676-BEE1-FA5ED9F0E222> /usr/lib/system/libcorecrypto.dylib
    0x7fff7bf58000 -     0x7fff7bf91ff7  libdispatch.dylib (913.30.4) <7D0E3183-282B-3FEE-A734-2C0ADC092084> /usr/lib/system/libdispatch.dylib
    0x7fff7bf92000 -     0x7fff7bfafff7  libdyld.dylib (519.2.2) <C50D02BC-A333-3313-B787-02F255A6783F> /usr/lib/system/libdyld.dylib
    0x7fff7bfb0000 -     0x7fff7bfb0ffb  libkeymgr.dylib (28) <6D84A96F-C65B-38EC-BDB5-21FD2C97E7B2> /usr/lib/system/libkeymgr.dylib
    0x7fff7bfbe000 -     0x7fff7bfbeff7  liblaunch.dylib (1205.30.29) <E66F58ED-C15E-3DFB-BC22-A861E13918C6> /usr/lib/system/liblaunch.dylib
    0x7fff7bfbf000 -     0x7fff7bfc3ffb  libmacho.dylib (900.0.1) <756F2553-07B6-3B42-ACEA-2F0F1A5E8D0F> /usr/lib/system/libmacho.dylib
    0x7fff7bfc4000 -     0x7fff7bfc6ff3  libquarantine.dylib (86) <6AC8773F-3817-3D82-99C2-01BABB9C3CBB> /usr/lib/system/libquarantine.dylib
    0x7fff7bfc7000 -     0x7fff7bfc8ff3  libremovefile.dylib (45) <912FA211-DD8C-3C92-8424-21B89F8B10FD> /usr/lib/system/libremovefile.dylib
    0x7fff7bfc9000 -     0x7fff7bfe0fff  libsystem_asl.dylib (356.1.1) <94972913-9DF0-3C78-847C-43E58919E3DA> /usr/lib/system/libsystem_asl.dylib
    0x7fff7bfe1000 -     0x7fff7bfe1fff  libsystem_blocks.dylib (67) <F2493BB5-B1C6-3C4D-9F1F-1B402E0F1DB7> /usr/lib/system/libsystem_blocks.dylib
    0x7fff7bfe2000 -     0x7fff7c06bff7  libsystem_c.dylib (1244.30.3) <E0136C71-0648-36F0-9F84-82EA2748A8D7> /usr/lib/system/libsystem_c.dylib
    0x7fff7c06c000 -     0x7fff7c06fffb  libsystem_configuration.dylib (963.30.1) <0F8D0B76-4F7D-34EC-AB6C-50F9465809DA> /usr/lib/system/libsystem_configuration.dylib
    0x7fff7c070000 -     0x7fff7c073ffb  libsystem_coreservices.dylib (51) <21A488D0-2D07-344E-8631-CC8B2A246F35> /usr/lib/system/libsystem_coreservices.dylib
    0x7fff7c074000 -     0x7fff7c075fff  libsystem_darwin.dylib (1244.30.3) <2F750CB1-BC26-3FA3-AE59-553EE30D451B> /usr/lib/system/libsystem_darwin.dylib
    0x7fff7c076000 -     0x7fff7c07cff7  libsystem_dnssd.dylib (878.30.4) <EB9BB165-45A4-367C-B33A-688D4F383A95> /usr/lib/system/libsystem_dnssd.dylib
    0x7fff7c07d000 -     0x7fff7c0c6ff7  libsystem_info.dylib (517.30.1) <7D79E167-4B5C-3833-81EE-3AF3FB53616D> /usr/lib/system/libsystem_info.dylib
    0x7fff7c0c7000 -     0x7fff7c0ecff7  libsystem_kernel.dylib (4570.41.2) <5155A4C3-825B-3178-AC51-0D2D2F2A6618> /usr/lib/system/libsystem_kernel.dylib
    0x7fff7c0ed000 -     0x7fff7c138fcb  libsystem_m.dylib (3146) <ABB1B85F-9FFE-31B8-AD4F-E39A30794A93> /usr/lib/system/libsystem_m.dylib
    0x7fff7c139000 -     0x7fff7c158fff  libsystem_malloc.dylib (140.40.1) <36B22C99-D772-3039-9A4C-AA31389965E1> /usr/lib/system/libsystem_malloc.dylib
    0x7fff7c159000 -     0x7fff7c1fdff3  libsystem_network.dylib (1229.30.11) <40BAD301-8744-3AD8-A688-E7925C587B00> /usr/lib/system/libsystem_network.dylib
    0x7fff7c1fe000 -     0x7fff7c208ffb  libsystem_networkextension.dylib (767.40.1) <CEDC330D-28F0-3902-BEB0-10B92ACEC69F> /usr/lib/system/libsystem_networkextension.dylib
    0x7fff7c209000 -     0x7fff7c212ff3  libsystem_notify.dylib (172) <98EA3D62-7C86-30DE-8261-D020D2F1EFF3> /usr/lib/system/libsystem_notify.dylib
    0x7fff7c213000 -     0x7fff7c21aff7  libsystem_platform.dylib (161.20.1) <C049250F-8C35-314D-810F-4E28AEAED983> /usr/lib/system/libsystem_platform.dylib
    0x7fff7c21b000 -     0x7fff7c226fff  libsystem_pthread.dylib (301.30.1) <ABA848E1-6978-3B42-A3A7-608B2C36FA93> /usr/lib/system/libsystem_pthread.dylib
    0x7fff7c227000 -     0x7fff7c22aff3  libsystem_sandbox.dylib (765.40.2) <922D3D15-AB4C-3F1A-A94F-39214AF1ADB3> /usr/lib/system/libsystem_sandbox.dylib
    0x7fff7c22b000 -     0x7fff7c22cff3  libsystem_secinit.dylib (30) <F06ADB8F-9E94-34A7-B3C9-2C22FDD14BAD> /usr/lib/system/libsystem_secinit.dylib
    0x7fff7c22d000 -     0x7fff7c234ff7  libsystem_symptoms.dylib (820.30.7) <DC3586C2-AA56-3419-88D3-FC0DBF08E3C0> /usr/lib/system/libsystem_symptoms.dylib
    0x7fff7c235000 -     0x7fff7c248ff7  libsystem_trace.dylib (829.30.14) <69EBF017-D40F-30D7-9B0B-BFC862D761A5> /usr/lib/system/libsystem_trace.dylib
    0x7fff7c24a000 -     0x7fff7c24fff7  libunwind.dylib (35.3) <6D4FCD49-D2A9-3233-95C7-A7635CE265F2> /usr/lib/system/libunwind.dylib
    0x7fff7c250000 -     0x7fff7c27cff7  libxpc.dylib (1205.30.29) <F7E5F1BC-614B-39CB-B6CE-92A9C7B7EC0B> /usr/lib/system/libxpc.dylib

External Modification Summary:
  Calls made by other processes targeting this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by all processes on this machine:
    task_for_pid: 1581
    thread_create: 0
    thread_set_state: 0

VM Region Summary:
ReadOnly portion of Libraries: Total=250.3M resident=0K(0%) swapped_out_or_unallocated=250.3M(100%)
Writable regions: Total=250.2M written=0K(0%) resident=0K(0%) swapped_out=0K(0%) unallocated=250.2M(100%)

                                VIRTUAL   REGION 
REGION TYPE                        SIZE    COUNT (non-coalesced) 
===========                     =======  ======= 
Kernel Alloc Once                    8K        2 
MALLOC                           241.9M       51 
MALLOC guard page                   16K        5 
STACK GUARD                       56.0M        2 
Stack                             8192K        2 
__DATA                            4892K       45 
__LINKEDIT                       194.5M        4 
__TEXT                            55.8M       44 
mapped file                       7692K       69 
shared memory                        8K        3 
===========                     =======  ======= 
TOTAL                            568.6M      217 
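To compare crash reports like the one above, it can help to reduce the crashed-thread backtrace to a list of frames. This is a rough sketch, assuming the frame layout shown in this report (index, image, address, symbol, `+ offset`). The names `FRAME_RE` and `parse_frames` are made up for illustration.

```python
import re

# One backtrace frame: "<index> <image> <address> <symbol> + <offset>"
FRAME_RE = re.compile(
    r"^(?P<index>\d+)\s+(?P<image>\S+)\s+(?P<address>0x[0-9a-fA-F]+)\s+"
    r"(?P<symbol>.+?)\s\+\s(?P<offset>\d+)$"
)


def parse_frames(report):
    """Return (index, image, symbol) for every backtrace frame found in the text."""
    frames = []
    for line in report.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match["index"]), match["image"], match["symbol"]))
    return frames


# Two frames copied from the report above, plus one line that is not a frame.
sample = """\
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
3   clang                           0x0000000104421a4f llvm::MachineDominatorTree::runOnMachineFunction(llvm::MachineFunction&) + 63
16  libdyld.dylib                   0x00007fff7bf93115 start + 1
"""
frames = parse_frames(sample)
for index, image, symbol in frames:
    print(index, image, symbol)
```

Lines from the other sections of the report (registers, binary images, VM regions) do not fit the frame pattern and are skipped.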

Command

Switched back to the Xcode 9 build tools and to the v0.3.1 tag instead of master.

sudo xcode-select --switch /Applications/Xcode.app
env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.13 CC=clang CXX=clang++ CXXFLAGS=-stdlib=libc++ python setup.py clean
env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.13 CC=clang CXX=clang++ CXXFLAGS=-stdlib=libc++ python setup.py install

Error

[ 42%] Building NVCC (Device) object CMakeFiles/THC.dir/generated/THC_generated_THCTensorMathReduceChar.cu.o
nvcc error   : 'ptxas' died due to signal 11 (Invalid memory reference)
CMake Error at THC_generated_THCTensorMathCompareTChar.cu.o.cmake:267 (message):
  Error generating file
  /Users/rudedogg/Development/Contrib/pytorch/torch/lib/build/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMathCompareTChar.cu.o

make[2]: *** [CMakeFiles/THC.dir/generated/THC_generated_THCTensorMathCompareTChar.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/THC.dir/all] Error 2
make: *** [all] Error 2
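Both clang and ptxas have now died with signal 11 in this thread, so one way to keep track across attempts is to scan each build log for the tool that crashed. The sketch below is an illustration only. It assumes the two message formats seen in the logs here, and the names `CLANG_RE`, `NVCC_RE` and `crashed_tools` are hypothetical.

```python
import re

# "clang: error: unable to execute command: Segmentation fault: 11"
CLANG_RE = re.compile(r"^(\S+): error: unable to execute command: (.+)$")
# "nvcc error   : 'ptxas' died due to signal 11 (Invalid memory reference)"
NVCC_RE = re.compile(r"^nvcc error\s*:\s*'([^']+)' died due to (.+)$")


def crashed_tools(log):
    """Return (tool, reason) pairs for every tool crash reported in a build log."""
    hits = []
    for line in log.splitlines():
        line = line.strip()
        match = CLANG_RE.match(line) or NVCC_RE.match(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits


# Lines copied from the build output in this thread.
sample = """\
[ 42%] Building NVCC (Device) object CMakeFiles/THC.dir/generated/THC_generated_THCTensorMathReduceChar.cu.o
nvcc error   : 'ptxas' died due to signal 11 (Invalid memory reference)
clang: error: unable to execute command: Segmentation fault: 11
clang: error: clang frontend command failed due to signal (use -v to see invocation)
"""
hits = crashed_tools(sample)
for tool, reason in hits:
    print(tool, "->", reason)
```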

Command

Xcode 9 build tools, v0.3.1 tag. Switched to using the 10.9 deployment target.

sudo xcode-select --switch /Applications/Xcode.app
env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ CXXFLAGS=-stdlib=libc++ python setup.py clean
env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ CXXFLAGS=-stdlib=libc++ python setup.py install
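Since the same variables are repeated on every command, a shell-independent way to keep them consistent between the clean and install steps is to build the environment once in Python. This is only a sketch of that idea, using the values from the commands above. The helper `build_env` is hypothetical, and the build itself is left in comments so the block has no side effects.

```python
import os


def build_env(deployment_target, base=None):
    """Return a copy of the environment with the build variables from this thread set."""
    env = dict(os.environ if base is None else base)
    env.update(
        {
            "CMAKE_PREFIX_PATH": "/usr/local/anaconda3",
            "MACOSX_DEPLOYMENT_TARGET": deployment_target,
            "CC": "clang",
            "CXX": "clang++",
            "CXXFLAGS": "-stdlib=libc++",
        }
    )
    return env


env = build_env("10.9", base={"PATH": "/usr/bin"})
print(env["MACOSX_DEPLOYMENT_TARGET"])

# To run the two steps from the PyTorch checkout (not executed here):
# import subprocess
# for step in ("clean", "install"):
#     subprocess.run(["python", "setup.py", step], env=build_env("10.9"), check=True)
```

Passing an explicit `env` avoids differences in how fish and other shells handle inline variable assignments.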

Error

clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/usr/local/anaconda3/envs/fastai/include -arch x86_64 -I/usr/local/anaconda3/envs/fastai/include -arch x86_64 -I/usr/local/cuda/include -I/Users/rudedogg/Development/Contrib/pytorch -I/Users/rudedogg/Development/Contrib/pytorch/torch/csrc -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/pybind11/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/TH -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THNN -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/ATen -I/usr/local/anaconda3/envs/fastai/lib/python3.6/site-packages/numpy/core/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THD -I/usr/local/cuda/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THCUNN -I/usr/local/anaconda3/envs/fastai/include/python3.6m -c torch/csrc/autograd/function.cpp -o build/temp.macosx-10.7-x86_64-3.6/torch/csrc/autograd/function.o -D_THP_CORE -std=c++11 -Wno-write-strings -fno-strict-aliasing -Wno-missing-braces -DWITH_NUMPY -DWITH_DISTRIBUTED -DWITH_CUDA -DCUDA_LIB_PATH=/usr/local/cuda/lib -DWITH_CUDNN
clang: error: unable to execute command: Segmentation fault: 11
clang: error: clang frontend command failed due to signal (use -v to see invocation)
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
clang: note: diagnostic msg: PLEASE submit a bug report to http://developer.apple.com/bugreporter/ and include the crash backtrace, preprocessed source, and associated run script.
clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/usr/local/anaconda3/envs/fastai/include -arch x86_64 -I/usr/local/anaconda3/envs/fastai/include -arch x86_64 -I/usr/local/cuda/include -I/Users/rudedogg/Development/Contrib/pytorch -I/Users/rudedogg/Development/Contrib/pytorch/torch/csrc -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/pybind11/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/TH -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THNN -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/ATen -I/usr/local/anaconda3/envs/fastai/lib/python3.6/site-packages/numpy/core/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THD -I/usr/local/cuda/include -I/Users/rudedogg/Development/Contrib/pytorch/torch/lib/tmp_install/include/THCUNN -I/usr/local/anaconda3/envs/fastai/include/python3.6m -c torch/csrc/autograd/profiler.cpp -o build/temp.macosx-10.7-x86_64-3.6/torch/csrc/autograd/profiler.o -D_THP_CORE -std=c++11 -Wno-write-strings -fno-strict-aliasing -Wno-missing-braces -DWITH_NUMPY -DWITH_DISTRIBUTED -DWITH_CUDA -DCUDA_LIB_PATH=/usr/local/cuda/lib -DWITH_CUDNN
clang: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /var/folders/b7/fqvwqsh503zbn1_xj3qlwh0h0000gn/T/python_variable-64f2f3.cpp
clang: note: diagnostic msg: /var/folders/b7/fqvwqsh503zbn1_xj3qlwh0h0000gn/T/python_variable-64f2f3.sh
clang: note: diagnostic msg: Crash backtrace is located in
clang: note: diagnostic msg: /Users/rudedogg/Library/Logs/DiagnosticReports/clang_<YYYY-MM-DD-HHMMSS>_<hostname>.crash
clang: note: diagnostic msg: (choose the .crash file that corresponds to your crash)
clang: note: diagnostic msg:

********************

Crash Report

Process:               clang [47212]
Path:                  /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
Identifier:            clang
Version:               9.0.0 (900.0.39)
Code Type:             X86-64 (Native)
Parent Process:        clang [47210]
Responsible:           clang [47212]
User ID:               501

Date/Time:             2018-02-14 14:07:48.952 -0700
OS Version:            Mac OS X 10.13.3 (17D47)
Report Version:        12
Anonymous UUID:        C9E0E8D3-1D83-174F-5E70-ABDE4079ED63

Time Awake Since Boot: 5200 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       EXC_I386_GPFLT
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [0]

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   clang                           0x0000000108bd3ac1 (anonymous namespace)::ConstantFoldConstantImpl(llvm::Constant const*, llvm::DataLayout const&, llvm::TargetLibraryInfo const*, llvm::SmallDenseMap<llvm::Constant*, llvm::Constant*, 4u, llvm::DenseMapInfo<llvm::Constant*>, llvm::detail::DenseMapPair<llvm::Constant*, llvm::Constant*> >&) (.llvm.77EEC815) + 161
1   clang                           0x0000000107bd53db combineInstructionsOverFunction(llvm::Function&, llvm::InstCombineWorklist&, llvm::AAResults*, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::DominatorTree&, bool, llvm::LoopInfo*) + 1019
2   clang                           0x0000000107bd4f5d llvm::InstructionCombiningPass::runOnFunction(llvm::Function&) + 445
3   clang                           0x0000000107855ca2 llvm::FPPassManager::runOnFunction(llvm::Function&) + 498
4   clang                           0x000000010787f0a3 llvm::FPPassManager::runOnModule(llvm::Module&) + 67
5   clang                           0x0000000107859f35 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 693
6   clang                           0x0000000107831559 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) + 2681
7   clang                           0x00000001078024da clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) + 410
8   clang                           0x00000001074de749 clang::ParseAST(clang::Sema&, bool, bool) + 249
9   clang                           0x00000001074db7bc clang::FrontendAction::Execute() + 44
10  clang                           0x0000000107491b56 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 294
11  clang                           0x000000010748fca9 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 2329
12  clang                           0x0000000107459e64 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) + 1492
13  clang                           0x000000010745599c main + 13820
14  libdyld.dylib                   0x00007fff7bf93115 start + 1

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000009  rbx: 0x00007ffee87b04b0  rcx: 0x00007ffba9b14508  rdx: 0x00007ffee87b1410
  rdi: 0x00007ffee87b13d0  rsi: 0x00007ffba7417e78  rbp: 0x00007ffee87b0560  rsp: 0x00007ffee87b0490
   r8: 0x0000000000000001   r9: 0x00007ffbad9df930  r10: 0x00007ffba7417e78  r11: 0x00007ffbade00000
  r12: 0x00007ffba9b144c0  r13: 0x00007ffee87b13c8  r14: 0x04007ffba9cec8b8  r15: 0x00007ffba9b14508
  rip: 0x0000000108bd3ac1  rfl: 0x0000000000010206  cr2: 0x000000011529a000

Logical CPU:     3
Error Code:      0x00000000
Trap Number:     13

Binary Images:
       0x10744b000 -        0x10ac36ff7 +clang (9.0.0 - 900.0.39) <8A7F67A3-26BA-32A4-BB1D-2880F066E18B> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
       0x11643c000 -        0x11648698f  dyld (519.2.2) <6695F30B-4E88-3C0B-9867-7D738C44A3E6> /usr/lib/dyld
    0x7fff7987c000 -     0x7fff798affff  libclosured.dylib (519.2.2) <48051216-5647-3643-B979-B77D0FD20011> /usr/lib/closure/libclosured.dylib
    0x7fff79d8e000 -     0x7fff79d8fff3  libSystem.B.dylib (1252) <47329E26-DC23-3EBA-9461-37755368327D> /usr/lib/libSystem.B.dylib
    0x7fff79fc2000 -     0x7fff7a018fff  libc++.1.dylib (400.9) <FCF5E1F6-2B04-3545-8004-F3AB32FED172> /usr/lib/libc++.1.dylib
    0x7fff7a019000 -     0x7fff7a03dff7  libc++abi.dylib (400.7) <217656D5-BC40-37FF-B322-91CB2AAD4F34> /usr/lib/libc++abi.dylib
    0x7fff7b06f000 -     0x7fff7b09fffb  libncurses.5.4.dylib (53) <030DF747-F71B-367A-83EE-2F30B7947929> /usr/lib/libncurses.5.4.dylib
    0x7fff7b38f000 -     0x7fff7b77d7e7  libobjc.A.dylib (723) <93A92316-DE1E-378C-8891-99720B50D075> /usr/lib/libobjc.A.dylib
    0x7fff7bd7a000 -     0x7fff7bd8cffb  libz.1.dylib (70) <48C67CFC-940D-3857-8DAD-857774605352> /usr/lib/libz.1.dylib
    0x7fff7be2a000 -     0x7fff7be2eff7  libcache.dylib (80) <354F3B7D-404E-3398-9EBF-65CA2CE65211> /usr/lib/system/libcache.dylib
    0x7fff7be2f000 -     0x7fff7be39ff3  libcommonCrypto.dylib (60118.30.2) <674286D3-7744-36A3-9AAA-49DFCD97A986> /usr/lib/system/libcommonCrypto.dylib
    0x7fff7be3a000 -     0x7fff7be41fff  libcompiler_rt.dylib (62) <4487CFBA-A5D7-3282-9E6B-94CAD7BE507E> /usr/lib/system/libcompiler_rt.dylib
    0x7fff7be42000 -     0x7fff7be4affb  libcopyfile.dylib (146.30.2) <2C7C67D7-562B-3FFA-973D-BACF4C10E1EC> /usr/lib/system/libcopyfile.dylib
    0x7fff7be4b000 -     0x7fff7bed0fff  libcorecrypto.dylib (562.30.10) <8A53EFE1-AFCA-3676-BEE1-FA5ED9F0E222> /usr/lib/system/libcorecrypto.dylib
    0x7fff7bf58000 -     0x7fff7bf91ff7  libdispatch.dylib (913.30.4) <7D0E3183-282B-3FEE-A734-2C0ADC092084> /usr/lib/system/libdispatch.dylib
    0x7fff7bf92000 -     0x7fff7bfafff7  libdyld.dylib (519.2.2) <C50D02BC-A333-3313-B787-02F255A6783F> /usr/lib/system/libdyld.dylib
    0x7fff7bfb0000 -     0x7fff7bfb0ffb  libkeymgr.dylib (28) <6D84A96F-C65B-38EC-BDB5-21FD2C97E7B2> /usr/lib/system/libkeymgr.dylib
    0x7fff7bfbe000 -     0x7fff7bfbeff7  liblaunch.dylib (1205.30.29) <E66F58ED-C15E-3DFB-BC22-A861E13918C6> /usr/lib/system/liblaunch.dylib
    0x7fff7bfbf000 -     0x7fff7bfc3ffb  libmacho.dylib (900.0.1) <756F2553-07B6-3B42-ACEA-2F0F1A5E8D0F> /usr/lib/system/libmacho.dylib
    0x7fff7bfc4000 -     0x7fff7bfc6ff3  libquarantine.dylib (86) <6AC8773F-3817-3D82-99C2-01BABB9C3CBB> /usr/lib/system/libquarantine.dylib
    0x7fff7bfc7000 -     0x7fff7bfc8ff3  libremovefile.dylib (45) <912FA211-DD8C-3C92-8424-21B89F8B10FD> /usr/lib/system/libremovefile.dylib
    0x7fff7bfc9000 -     0x7fff7bfe0fff  libsystem_asl.dylib (356.1.1) <94972913-9DF0-3C78-847C-43E58919E3DA> /usr/lib/system/libsystem_asl.dylib
    0x7fff7bfe1000 -     0x7fff7bfe1fff  libsystem_blocks.dylib (67) <F2493BB5-B1C6-3C4D-9F1F-1B402E0F1DB7> /usr/lib/system/libsystem_blocks.dylib
    0x7fff7bfe2000 -     0x7fff7c06bff7  libsystem_c.dylib (1244.30.3) <E0136C71-0648-36F0-9F84-82EA2748A8D7> /usr/lib/system/libsystem_c.dylib
    0x7fff7c06c000 -     0x7fff7c06fffb  libsystem_configuration.dylib (963.30.1) <0F8D0B76-4F7D-34EC-AB6C-50F9465809DA> /usr/lib/system/libsystem_configuration.dylib
    0x7fff7c070000 -     0x7fff7c073ffb  libsystem_coreservices.dylib (51) <21A488D0-2D07-344E-8631-CC8B2A246F35> /usr/lib/system/libsystem_coreservices.dylib
    0x7fff7c074000 -     0x7fff7c075fff  libsystem_darwin.dylib (1244.30.3) <2F750CB1-BC26-3FA3-AE59-553EE30D451B> /usr/lib/system/libsystem_darwin.dylib
    0x7fff7c076000 -     0x7fff7c07cff7  libsystem_dnssd.dylib (878.30.4) <EB9BB165-45A4-367C-B33A-688D4F383A95> /usr/lib/system/libsystem_dnssd.dylib
    0x7fff7c07d000 -     0x7fff7c0c6ff7  libsystem_info.dylib (517.30.1) <7D79E167-4B5C-3833-81EE-3AF3FB53616D> /usr/lib/system/libsystem_info.dylib
    0x7fff7c0c7000 -     0x7fff7c0ecff7  libsystem_kernel.dylib (4570.41.2) <5155A4C3-825B-3178-AC51-0D2D2F2A6618> /usr/lib/system/libsystem_kernel.dylib
    0x7fff7c0ed000 -     0x7fff7c138fcb  libsystem_m.dylib (3146) <ABB1B85F-9FFE-31B8-AD4F-E39A30794A93> /usr/lib/system/libsystem_m.dylib
    0x7fff7c139000 -     0x7fff7c158fff  libsystem_malloc.dylib (140.40.1) <36B22C99-D772-3039-9A4C-AA31389965E1> /usr/lib/system/libsystem_malloc.dylib
    0x7fff7c159000 -     0x7fff7c1fdff3  libsystem_network.dylib (1229.30.11) <40BAD301-8744-3AD8-A688-E7925C587B00> /usr/lib/system/libsystem_network.dylib
    0x7fff7c1fe000 -     0x7fff7c208ffb  libsystem_networkextension.dylib (767.40.1) <CEDC330D-28F0-3902-BEB0-10B92ACEC69F> /usr/lib/system/libsystem_networkextension.dylib
    0x7fff7c209000 -     0x7fff7c212ff3  libsystem_notify.dylib (172) <98EA3D62-7C86-30DE-8261-D020D2F1EFF3> /usr/lib/system/libsystem_notify.dylib
    0x7fff7c213000 -     0x7fff7c21aff7  libsystem_platform.dylib (161.20.1) <C049250F-8C35-314D-810F-4E28AEAED983> /usr/lib/system/libsystem_platform.dylib
    0x7fff7c21b000 -     0x7fff7c226fff  libsystem_pthread.dylib (301.30.1) <ABA848E1-6978-3B42-A3A7-608B2C36FA93> /usr/lib/system/libsystem_pthread.dylib
    0x7fff7c227000 -     0x7fff7c22aff3  libsystem_sandbox.dylib (765.40.2) <922D3D15-AB4C-3F1A-A94F-39214AF1ADB3> /usr/lib/system/libsystem_sandbox.dylib
    0x7fff7c22b000 -     0x7fff7c22cff3  libsystem_secinit.dylib (30) <F06ADB8F-9E94-34A7-B3C9-2C22FDD14BAD> /usr/lib/system/libsystem_secinit.dylib
    0x7fff7c22d000 -     0x7fff7c234ff7  libsystem_symptoms.dylib (820.30.7) <DC3586C2-AA56-3419-88D3-FC0DBF08E3C0> /usr/lib/system/libsystem_symptoms.dylib
    0x7fff7c235000 -     0x7fff7c248ff7  libsystem_trace.dylib (829.30.14) <69EBF017-D40F-30D7-9B0B-BFC862D761A5> /usr/lib/system/libsystem_trace.dylib
    0x7fff7c24a000 -     0x7fff7c24fff7  libunwind.dylib (35.3) <6D4FCD49-D2A9-3233-95C7-A7635CE265F2> /usr/lib/system/libunwind.dylib
    0x7fff7c250000 -     0x7fff7c27cff7  libxpc.dylib (1205.30.29) <F7E5F1BC-614B-39CB-B6CE-92A9C7B7EC0B> /usr/lib/system/libxpc.dylib

External Modification Summary:
  Calls made by other processes targeting this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by all processes on this machine:
    task_for_pid: 2737
    thread_create: 0
    thread_set_state: 0

VM Region Summary:
ReadOnly portion of Libraries: Total=260.4M resident=0K(0%) swapped_out_or_unallocated=260.4M(100%)
Writable regions: Total=209.0M written=0K(0%) resident=0K(0%) swapped_out=0K(0%) unallocated=209.0M(100%)

                                VIRTUAL   REGION 
REGION TYPE                        SIZE    COUNT (non-coalesced) 
===========                     =======  ======= 
Kernel Alloc Once                    8K        2 
MALLOC                           200.6M       43 
MALLOC guard page                   16K        5 
STACK GUARD                       56.0M        2 
Stack                             8192K        2 
__DATA                            4956K       45 
__LINKEDIT                       195.6M        4 
__TEXT                            64.9M       44 
mapped file                       7596K       68 
shared memory                        8K        3 
===========                     =======  ======= 
TOTAL                            537.3M      208 

ssnl commented 6 years ago

We've also seen clang segfault a few times on CI builds. Unfortunately, we have no clue what causes that, but switching to the clang that ships with Xcode 9 seems to solve the issue. Could you try that?

rudedogg commented 6 years ago

Thanks @SsnL, I gave that a try but no luck:

My output from gcc --version:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

My build command:

env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install

Error output:

[ 39%] Building NVCC (Device) object CMakeFiles/THC.dir/generated/THC_generated_THCTensorMathCompareTChar.cu.o
nvcc error   : 'ptxas' died due to signal 11 (Invalid memory reference)
CMake Error at THC_generated_THCTensorMathCompareTByte.cu.o.cmake:267 (message):
  Error generating file
  /Users/rudedogg/Development/Contrib/pytorch/torch/lib/build/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMathCompareTByte.cu.o

make[2]: *** [CMakeFiles/THC.dir/generated/THC_generated_THCTensorMathCompareTByte.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
ptxas /var/folders/b7/fqvwqsh503zbn1_xj3qlwh0h0000gn/T//tmpxft_00011800_00000000-5_THCTensorIndex.ptx, line 402647; error   : Unknown modifier '.pavam'
ptxas fatal   : Ptx assembly aborted due to errors
CMake Error at THC_generated_THCTensorIndex.cu.o.cmake:267 (message):
  Error generating file
  /Users/rudedogg/Development/Contrib/pytorch/torch/lib/build/THC/CMakeFiles/THC.dir//./THC_generated_THCTensorIndex.cu.o

make[2]: *** [CMakeFiles/THC.dir/THC_generated_THCTensorIndex.cu.o] Error 1
make[1]: *** [CMakeFiles/THC.dir/all] Error 2
make: *** [all] Error 2

I also tried a few other build options, but ran into errors with them as well:

env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ CXXFLAGS=-stdlib=libc++ python setup.py install
env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.13 CC=clang CXX=clang++ CXXFLAGS=-stdlib=libc++ python setup.py install
env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.13 CC=clang CXX=clang++ python setup.py install
env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install

Is there anywhere I can get a binary of pytorch with cuda support for macOS?

ssnl commented 6 years ago

cc @yf225 Will, do you know more about what's going wrong here?

yf225 commented 6 years ago

@rudedogg I suspect that the second issue is caused by nvcc and not clang. Do you have the output from nvcc --version?

rudedogg commented 6 years ago

@yf225 Here's what I get from /usr/local/cuda/bin/nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Tue_Dec_19_21:36:29_CST_2017
Cuda compilation tools, release 9.1, V9.1.128
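
When comparing toolchains across reports like this one, the release can be pulled out of the nvcc --version text mechanically. This is a minimal Python sketch, assuming the output keeps the "release X.Y, VX.Y.Z" shape shown above. The function name parse_nvcc_release is hypothetical and not part of any tool mentioned in this thread.

```python
import re

# Matches the "release 9.1, V9.1.128" part of the nvcc --version output.
_RELEASE = re.compile(r"release (\d+)\.(\d+), V([\d.]+)")


def parse_nvcc_release(version_text):
    """Return (major, minor, full_version), or None if no release line is found."""
    match = _RELEASE.search(version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)
```

The text passed in could come from subprocess.run(["/usr/local/cuda/bin/nvcc", "--version"], capture_output=True, text=True).stdout on Python 3.7 or newer.
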

yf225 commented 6 years ago

@rudedogg Hmm it should work on CUDA 9.1. Could you try running the install script with MAX_JOBS=1, and see if it still fails on the same file?
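
One way to try the single-job suggestion reproducibly is to build the environment in a small script and keep the full log. This is only a sketch: the variable values are copied from the build command earlier in this thread, MAX_JOBS is the variable named in the comment above, and the helper names build_env and run_build and the build.log file name are hypothetical.

```python
import os
import subprocess
import sys


def build_env(max_jobs=1, base=None):
    """Return a copy of the environment with the build variables from this thread."""
    env = dict(os.environ if base is None else base)
    env.update({
        "CMAKE_PREFIX_PATH": "/usr/local/anaconda3",
        "MACOSX_DEPLOYMENT_TARGET": "10.9",
        "CC": "clang",
        "CXX": "clang++",
        # Limits the build to the given number of parallel jobs.
        "MAX_JOBS": str(max_jobs),
    })
    return env


def run_build(max_jobs=1, log_path="build.log"):
    """Run `python setup.py install` in the current directory and log its output.

    Returns the exit code of the build process.
    """
    with open(log_path, "w") as log:
        proc = subprocess.run(
            [sys.executable, "setup.py", "install"],
            env=build_env(max_jobs),
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return proc.returncode
```

Calling run_build(max_jobs=1) from the pytorch checkout, with the conda env active, would run the build serially, which makes it easier to see whether the failure lands on the same file each time.
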

rudedogg commented 6 years ago

@yf225 I managed to get a working build once. The command was one I tried earlier:

env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.13 CC=clang CXX=clang++ CXXFLAGS=-stdlib=libc++ python setup.py install

I tried it again and got nvcc error : 'cicc' died due to signal 11 (Invalid memory reference), so the failure appears to be pretty random.

I also tried with MAX_JOBS=1 added to the env variables. Here's the output:

[ 42%] Building NVCC (Device) object CMakeFiles/THC.dir/generated/THC_generated_THCTensorMathReduceChar.cu.o
clang: error: unable to execute command: Segmentation fault: 11
clang: error: clang frontend command failed due to signal (use -v to see invocation)
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
clang: note: diagnostic msg: PLEASE submit a bug report to http://developer.apple.com/bugreporter/ and include the crash backtrace, preprocessed source, and associated run script.
clang: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /var/folders/b7/fqvwqsh503zbn1_xj3qlwh0h0000gn/T/tmpxft_000035cb_00000000-5_THCTensorMathCompareChar-d4ae6d.cpp
clang: note: diagnostic msg: /var/folders/b7/fqvwqsh503zbn1_xj3qlwh0h0000gn/T/tmpxft_000035cb_00000000-5_THCTensorMathCompareChar-d4ae6d.sh
clang: note: diagnostic msg: Crash backtrace is located in
clang: note: diagnostic msg: /Users/rudedogg/Library/Logs/DiagnosticReports/clang_<YYYY-MM-DD-HHMMSS>_<hostname>.crash
clang: note: diagnostic msg: (choose the .crash file that corresponds to your crash)
clang: note: diagnostic msg:

********************
[ 43%] Building NVCC (Device) object CMakeFiles/THC.dir/generated/THC_generated_THCTensorMaskedChar.cu.o
CMake Error at THC_generated_THCTensorMathCompareChar.cu.o.cmake:267 (message):
  Error generating file
  /Users/rudedogg/Development/Contrib/pytorch/torch/lib/build/THC/CMakeFiles/THC.dir/generated/./THC_generated_THCTensorMathCompareChar.cu.o

make[2]: *** [CMakeFiles/THC.dir/generated/THC_generated_THCTensorMathCompareChar.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/THC.dir/all] Error 2

And the .crash output:

Process:               clang [13820]
Path:                  /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
Identifier:            clang
Version:               9.0.0 (900.0.39)
Code Type:             X86-64 (Native)
Parent Process:        clang [13819]
Responsible:           clang [13820]
User ID:               501

Date/Time:             2018-02-14 16:38:08.398 -0700
OS Version:            Mac OS X 10.13.3 (17D47)
Report Version:        12
Anonymous UUID:        C9E0E8D3-1D83-174F-5E70-ABDE4079ED63

Time Awake Since Boot: 1800 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       EXC_I386_GPFLT
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [0]

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   clang                           0x000000010dc8a290 llvm::SelectionDAG::LegalizeTypes() + 2560
1   clang                           0x000000010dbe7c6d llvm::SelectionDAGISel::CodeGenAndEmitDAG() + 333
2   clang                           0x000000010db34eb6 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) + 6118
3   clang                           0x000000010db2cfd6 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) + 998
4   clang                           0x000000010e4c0404 (anonymous namespace)::X86DAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) + 20
5   clang                           0x000000010db2b09d llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 125
6   clang                           0x000000010dae7ca2 llvm::FPPassManager::runOnFunction(llvm::Function&) + 498
7   clang                           0x000000010db110a3 llvm::FPPassManager::runOnModule(llvm::Module&) + 67
8   clang                           0x000000010daebf35 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 693
9   clang                           0x000000010dac356c clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::__1::unique_ptr<llvm::raw_pwrite_stream, std::__1::default_delete<llvm::raw_pwrite_stream> >) + 2700
10  clang                           0x000000010da944da clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) + 410
11  clang                           0x000000010d770749 clang::ParseAST(clang::Sema&, bool, bool) + 249
12  clang                           0x000000010d76d7bc clang::FrontendAction::Execute() + 44
13  clang                           0x000000010d723b56 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 294
14  clang                           0x000000010d721ca9 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 2329
15  clang                           0x000000010d6ebe64 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) + 1492
16  clang                           0x000000010d6e799c main + 13820
17  libdyld.dylib                   0x00007fff5309c115 start + 1

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000004  rbx: 0x0000000000000000  rcx: 0x0000000000000003  rdx: 0x00007ffbe440cff0
  rdi: 0x00007ffee251e418  rsi: 0x00007ffbe69badd0  rbp: 0x00007ffee251e4e0  rsp: 0x00007ffee251d4e0
   r8: 0x0000000000000000   r9: 0x0000000000008082  r10: 0x0000000000000000  r11: 0x00000000000002e0
  r12: 0x00007ffee251d4e0  r13: 0x0000000000000001  r14: 0x0400000000000000  r15: 0x00007ffbeb00fbb0
  rip: 0x000000010dc8a290  rfl: 0x0000000000010206  cr2: 0x00007ffbeb08e000

Logical CPU:     1
Error Code:      0x00000000
Trap Number:     13

Binary Images:
       0x10d6dd000 -        0x110ec8ff7 +clang (9.0.0 - 900.0.39) <8A7F67A3-26BA-32A4-BB1D-2880F066E18B> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
       0x11669e000 -        0x1166e898f  dyld (519.2.2) <6695F30B-4E88-3C0B-9867-7D738C44A3E6> /usr/lib/dyld
    0x7fff50985000 -     0x7fff509b8fff  libclosured.dylib (519.2.2) <48051216-5647-3643-B979-B77D0FD20011> /usr/lib/closure/libclosured.dylib
    0x7fff50e97000 -     0x7fff50e98ff3  libSystem.B.dylib (1252) <47329E26-DC23-3EBA-9461-37755368327D> /usr/lib/libSystem.B.dylib
    0x7fff510cb000 -     0x7fff51121fff  libc++.1.dylib (400.9) <FCF5E1F6-2B04-3545-8004-F3AB32FED172> /usr/lib/libc++.1.dylib
    0x7fff51122000 -     0x7fff51146ff7  libc++abi.dylib (400.7) <217656D5-BC40-37FF-B322-91CB2AAD4F34> /usr/lib/libc++abi.dylib
    0x7fff52178000 -     0x7fff521a8ffb  libncurses.5.4.dylib (53) <030DF747-F71B-367A-83EE-2F30B7947929> /usr/lib/libncurses.5.4.dylib
    0x7fff52498000 -     0x7fff528867e7  libobjc.A.dylib (723) <93A92316-DE1E-378C-8891-99720B50D075> /usr/lib/libobjc.A.dylib
    0x7fff52e83000 -     0x7fff52e95ffb  libz.1.dylib (70) <48C67CFC-940D-3857-8DAD-857774605352> /usr/lib/libz.1.dylib
    0x7fff52f33000 -     0x7fff52f37ff7  libcache.dylib (80) <354F3B7D-404E-3398-9EBF-65CA2CE65211> /usr/lib/system/libcache.dylib
    0x7fff52f38000 -     0x7fff52f42ff3  libcommonCrypto.dylib (60118.30.2) <674286D3-7744-36A3-9AAA-49DFCD97A986> /usr/lib/system/libcommonCrypto.dylib
    0x7fff52f43000 -     0x7fff52f4afff  libcompiler_rt.dylib (62) <4487CFBA-A5D7-3282-9E6B-94CAD7BE507E> /usr/lib/system/libcompiler_rt.dylib
    0x7fff52f4b000 -     0x7fff52f53ffb  libcopyfile.dylib (146.30.2) <2C7C67D7-562B-3FFA-973D-BACF4C10E1EC> /usr/lib/system/libcopyfile.dylib
    0x7fff52f54000 -     0x7fff52fd9fff  libcorecrypto.dylib (562.30.10) <8A53EFE1-AFCA-3676-BEE1-FA5ED9F0E222> /usr/lib/system/libcorecrypto.dylib
    0x7fff53061000 -     0x7fff5309aff7  libdispatch.dylib (913.30.4) <7D0E3183-282B-3FEE-A734-2C0ADC092084> /usr/lib/system/libdispatch.dylib
    0x7fff5309b000 -     0x7fff530b8ff7  libdyld.dylib (519.2.2) <C50D02BC-A333-3313-B787-02F255A6783F> /usr/lib/system/libdyld.dylib
    0x7fff530b9000 -     0x7fff530b9ffb  libkeymgr.dylib (28) <6D84A96F-C65B-38EC-BDB5-21FD2C97E7B2> /usr/lib/system/libkeymgr.dylib
    0x7fff530c7000 -     0x7fff530c7ff7  liblaunch.dylib (1205.30.29) <E66F58ED-C15E-3DFB-BC22-A861E13918C6> /usr/lib/system/liblaunch.dylib
    0x7fff530c8000 -     0x7fff530ccffb  libmacho.dylib (900.0.1) <756F2553-07B6-3B42-ACEA-2F0F1A5E8D0F> /usr/lib/system/libmacho.dylib
    0x7fff530cd000 -     0x7fff530cfff3  libquarantine.dylib (86) <6AC8773F-3817-3D82-99C2-01BABB9C3CBB> /usr/lib/system/libquarantine.dylib
    0x7fff530d0000 -     0x7fff530d1ff3  libremovefile.dylib (45) <912FA211-DD8C-3C92-8424-21B89F8B10FD> /usr/lib/system/libremovefile.dylib
    0x7fff530d2000 -     0x7fff530e9fff  libsystem_asl.dylib (356.1.1) <94972913-9DF0-3C78-847C-43E58919E3DA> /usr/lib/system/libsystem_asl.dylib
    0x7fff530ea000 -     0x7fff530eafff  libsystem_blocks.dylib (67) <F2493BB5-B1C6-3C4D-9F1F-1B402E0F1DB7> /usr/lib/system/libsystem_blocks.dylib
    0x7fff530eb000 -     0x7fff53174ff7  libsystem_c.dylib (1244.30.3) <E0136C71-0648-36F0-9F84-82EA2748A8D7> /usr/lib/system/libsystem_c.dylib
    0x7fff53175000 -     0x7fff53178ffb  libsystem_configuration.dylib (963.30.1) <0F8D0B76-4F7D-34EC-AB6C-50F9465809DA> /usr/lib/system/libsystem_configuration.dylib
    0x7fff53179000 -     0x7fff5317cffb  libsystem_coreservices.dylib (51) <21A488D0-2D07-344E-8631-CC8B2A246F35> /usr/lib/system/libsystem_coreservices.dylib
    0x7fff5317d000 -     0x7fff5317efff  libsystem_darwin.dylib (1244.30.3) <2F750CB1-BC26-3FA3-AE59-553EE30D451B> /usr/lib/system/libsystem_darwin.dylib
    0x7fff5317f000 -     0x7fff53185ff7  libsystem_dnssd.dylib (878.30.4) <EB9BB165-45A4-367C-B33A-688D4F383A95> /usr/lib/system/libsystem_dnssd.dylib
    0x7fff53186000 -     0x7fff531cfff7  libsystem_info.dylib (517.30.1) <7D79E167-4B5C-3833-81EE-3AF3FB53616D> /usr/lib/system/libsystem_info.dylib
    0x7fff531d0000 -     0x7fff531f5ff7  libsystem_kernel.dylib (4570.41.2) <5155A4C3-825B-3178-AC51-0D2D2F2A6618> /usr/lib/system/libsystem_kernel.dylib
    0x7fff531f6000 -     0x7fff53241fcb  libsystem_m.dylib (3146) <ABB1B85F-9FFE-31B8-AD4F-E39A30794A93> /usr/lib/system/libsystem_m.dylib
    0x7fff53242000 -     0x7fff53261fff  libsystem_malloc.dylib (140.40.1) <36B22C99-D772-3039-9A4C-AA31389965E1> /usr/lib/system/libsystem_malloc.dylib
    0x7fff53262000 -     0x7fff53306ff3  libsystem_network.dylib (1229.30.11) <40BAD301-8744-3AD8-A688-E7925C587B00> /usr/lib/system/libsystem_network.dylib
    0x7fff53307000 -     0x7fff53311ffb  libsystem_networkextension.dylib (767.40.1) <CEDC330D-28F0-3902-BEB0-10B92ACEC69F> /usr/lib/system/libsystem_networkextension.dylib
    0x7fff53312000 -     0x7fff5331bff3  libsystem_notify.dylib (172) <98EA3D62-7C86-30DE-8261-D020D2F1EFF3> /usr/lib/system/libsystem_notify.dylib
    0x7fff5331c000 -     0x7fff53323ff7  libsystem_platform.dylib (161.20.1) <C049250F-8C35-314D-810F-4E28AEAED983> /usr/lib/system/libsystem_platform.dylib
    0x7fff53324000 -     0x7fff5332ffff  libsystem_pthread.dylib (301.30.1) <ABA848E1-6978-3B42-A3A7-608B2C36FA93> /usr/lib/system/libsystem_pthread.dylib
    0x7fff53330000 -     0x7fff53333ff3  libsystem_sandbox.dylib (765.40.2) <922D3D15-AB4C-3F1A-A94F-39214AF1ADB3> /usr/lib/system/libsystem_sandbox.dylib
    0x7fff53334000 -     0x7fff53335ff3  libsystem_secinit.dylib (30) <F06ADB8F-9E94-34A7-B3C9-2C22FDD14BAD> /usr/lib/system/libsystem_secinit.dylib
    0x7fff53336000 -     0x7fff5333dff7  libsystem_symptoms.dylib (820.30.7) <DC3586C2-AA56-3419-88D3-FC0DBF08E3C0> /usr/lib/system/libsystem_symptoms.dylib
    0x7fff5333e000 -     0x7fff53351ff7  libsystem_trace.dylib (829.30.14) <69EBF017-D40F-30D7-9B0B-BFC862D761A5> /usr/lib/system/libsystem_trace.dylib
    0x7fff53353000 -     0x7fff53358ff7  libunwind.dylib (35.3) <6D4FCD49-D2A9-3233-95C7-A7635CE265F2> /usr/lib/system/libunwind.dylib
    0x7fff53359000 -     0x7fff53385ff7  libxpc.dylib (1205.30.29) <F7E5F1BC-614B-39CB-B6CE-92A9C7B7EC0B> /usr/lib/system/libxpc.dylib

External Modification Summary:
  Calls made by other processes targeting this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by all processes on this machine:
    task_for_pid: 1288
    thread_create: 0
    thread_set_state: 0

VM Region Summary:
ReadOnly portion of Libraries: Total=260.4M resident=0K(0%) swapped_out_or_unallocated=260.4M(100%)
Writable regions: Total=204.4M written=0K(0%) resident=0K(0%) swapped_out=0K(0%) unallocated=204.4M(100%)

                                VIRTUAL   REGION 
REGION TYPE                        SIZE    COUNT (non-coalesced) 
===========                     =======  ======= 
Kernel Alloc Once                    8K        2 
MALLOC                           140.0M       19 
MALLOC guard page                   16K        5 
STACK GUARD                          4K        2 
Stack                             64.0M        2 
__DATA                            4956K       45 
__LINKEDIT                       195.6M        4 
__TEXT                            64.9M       44 
mapped file                       5764K        7 
shared memory                        8K        3 
===========                     =======  ======= 
TOTAL                            474.9M      123 

rudedogg commented 6 years ago

Running env CMAKE_PREFIX_PATH=/usr/local/anaconda3 MACOSX_DEPLOYMENT_TARGET=10.13 CC=clang CXX=clang++ CXXFLAGS=-stdlib=libc++ python setup.py install eventually resulted in a successful build/install for me. It took about 10 tries. I'm going to leave things as they are so I can take the fast.ai course.
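
Since the failure is intermittent and the build eventually succeeded after repeated attempts, the retrying can be automated. The following Python sketch only illustrates that workaround; it does not address the underlying compiler crash. The names retry_build and run_setup_install are hypothetical, and run_setup_install assumes the environment variables from the command above are already exported in the calling shell.

```python
import subprocess
import sys
import time


def retry_build(run_once, max_attempts=10, delay_seconds=0.0):
    """Call run_once() until it returns 0 or the attempts run out.

    Returns (attempts_used, last_return_code).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    returncode = None
    for attempt in range(1, max_attempts + 1):
        returncode = run_once()
        if returncode == 0:
            return attempt, returncode
        time.sleep(delay_seconds)
    return max_attempts, returncode


def run_setup_install():
    """Run `python setup.py install` in the current directory; return its exit code."""
    return subprocess.run([sys.executable, "setup.py", "install"]).returncode
```

Calling retry_build(run_setup_install, max_attempts=10) from the source checkout would stop at the first successful build and report how many attempts it took.
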

ezyang commented 6 years ago

@rudedogg I don't know if you still have the files, but if you could upload

clang: note: diagnostic msg: /var/folders/b7/fqvwqsh503zbn1_xj3qlwh0h0000gn/T/tmpxft_000035cb_00000000-5_THCTensorMathCompareChar-d4ae6d.cpp
clang: note: diagnostic msg: /var/folders/b7/fqvwqsh503zbn1_xj3qlwh0h0000gn/T/tmpxft_000035cb_00000000-5_THCTensorMathCompareChar-d4ae6d.sh

somewhere (maybe your favorite pastebin) that would be helpful. At the very least we can report this bug upstream.

rudedogg commented 6 years ago

@ezyang Sorry, I rebooted and the files are no longer there. I can revisit this in the future and upload them? I'm hesitant to try building again in case I can't get it to go through.

I'll let you decide whether to keep the issue open or not.
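
Because the preprocessed source and run script that clang mentions are written under a temporary directory, they can disappear before anyone asks for them. This is a rough Python sketch that copies them somewhere durable right after a failed build, assuming the build output was captured as text and that the "clang: note: diagnostic msg:" lines look like the ones quoted in this thread. The function names and the destination directory are hypothetical.

```python
import re
import shutil
from pathlib import Path

# Picks out the absolute .cpp / .sh paths that clang prints in its crash notes.
_DIAG_PATH = re.compile(
    r"clang: note: diagnostic msg: (/\S+\.(?:cpp|sh))[ \t]*$", re.MULTILINE
)


def find_diagnostic_files(log_text):
    """Return the diagnostic file paths mentioned in the build log, in order."""
    return [Path(p) for p in _DIAG_PATH.findall(log_text)]


def save_diagnostic_files(log_text, dest_dir):
    """Copy every diagnostic file that still exists into dest_dir.

    Returns the list of copied destination paths.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for src in find_diagnostic_files(log_text):
        if src.is_file():
            saved.append(Path(shutil.copy2(src, dest)))
    return saved
```

The .crash report under ~/Library/Logs/DiagnosticReports is not matched by this pattern, since clang prints it with a placeholder file name rather than a concrete path.
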

alejandrojapkin commented 6 years ago

@rudedogg Please participate in https://github.com/pytorch/pytorch/issues/3047

ngimel commented 4 years ago

Closing; please reopen if needed.