NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License

Installation Problem: RuntimeError: Error compiling objects for extension #1779

Open wwma opened 8 months ago

wwma commented 8 months ago

Describe the Bug

I followed the README and installed apex with pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./ but the build fails. The error message is:

RuntimeError: Error compiling objects for extension
  error: subprocess-exited-with-error

  × Building wheel for apex (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /home/myt/anaconda3/envs/graph/bin/python /home/myt/anaconda3/envs/graph/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmp6pp0x0mg
  cwd: /home/myt/GPTrans/apex
  Building wheel for apex (pyproject.toml) ... error
  ERROR: Failed building wheel for apex
Failed to build apex
ERROR: Could not build wheels for apex, which is required to install pyproject.toml-based projects
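
The lines above are only pip's summary; the actual compiler failure is in the output that "See above for output." points to. A minimal sketch for pulling the first compiler or ninja error lines out of a saved verbose build log follows. The function name, the patterns, and the sample log lines are illustrative assumptions, not output from this build.

```python
import re

# Assumed markers of a real failure in gcc, nvcc, and ninja output.
ERROR_PATTERN = re.compile(r"(\berror:|\bFAILED:)")
# pip's own wrapper line, which only says that a subprocess failed.
PIP_NOISE = ("error: subprocess-exited-with-error",)


def find_compiler_errors(log_text, limit=5):
    """Return up to `limit` (line_number, line) pairs that look like build errors."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(PIP_NOISE):
            continue  # skip pip's summary, it carries no diagnostic detail
        if ERROR_PATTERN.search(stripped):
            hits.append((number, stripped))
            if len(hits) == limit:
                break
    return hits


# Hypothetical log excerpt, made up for illustration only.
sample_log = """\
FAILED: /tmp/build/example_kernel.o
example_kernel.cu(42): error: identifier "foo" is undefined
RuntimeError: Error compiling objects for extension
  error: subprocess-exited-with-error
"""

for number, line in find_compiler_errors(sample_log):
    print(number, line)
```

To use it on a real build, redirect the pip command's output to a file and pass the file's contents to find_compiler_errors.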

torch.version.cuda matches the installed CUDA toolkit version, which is 11.3. How can I resolve this problem?
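
The version comparison described here can be sketched as a small check. The helper names below are hypothetical. The sample string is the release line from the nvcc -V output in this report, and "11.3" stands in for the value of torch.version.cuda.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return the 'major.minor' release from `nvcc -V` output, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def versions_match(torch_cuda, nvcc_output):
    """True when the CUDA version PyTorch was built with equals the nvcc release."""
    release = parse_nvcc_release(nvcc_output)
    return release is not None and release == torch_cuda


# Release line copied from the `nvcc -V` output in this report.
sample = "Cuda compilation tools, release 11.3, V11.3.58"
print(versions_match("11.3", sample))  # True
```

In a live environment, the first argument would be torch.version.cuda and the second the stdout of subprocess.run(["nvcc", "-V"], capture_output=True, text=True). A match only rules out one common cause; it does not show that the build will succeed.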

Environment

Collecting environment information...
PyTorch version: 1.12.0
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.8.18 (default, Sep 11 2023, 13:40:15)  [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.0-97-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 11.3.58
GPU models and configuration: 
GPU 0: NVIDIA GeForce RTX 4090
GPU 1: NVIDIA GeForce RTX 4090

Nvidia driver version: 535.161.07
cuDNN version: Probably one of the following:
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn.so.8.4.1
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8.4.1
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_adv_train.so.8.4.1
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8.4.1
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8.4.1
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8.4.1
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_ops_train.so.8.4.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.24.3
[pip3] torch==1.12.0
[pip3] torch-geometric==1.7.2
[pip3] torch-scatter==2.0.9
[pip3] torch-sparse==0.6.14
[pip3] torchaudio==0.12.0+cu113
[pip3] torchvision==0.13.0+cu113
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               11.3.1               h2bc3f7f_2  
[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch
[conda] mkl                       2023.1.0         h213fc3f_46344  
[conda] mkl-service               2.4.0            py38h5eee18b_1  
[conda] mkl_fft                   1.3.8            py38h5eee18b_0  
[conda] mkl_random                1.2.4            py38hdb19cb5_0  
[conda] numpy                     1.24.4                   pypi_0    pypi
[conda] numpy-base                1.24.3           py38h060ed82_1  
[conda] pytorch                   1.12.0          py3.8_cuda11.3_cudnn8.3.2_0    pytorch
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] torch-geometric           1.7.2                    pypi_0    pypi
[conda] torch-scatter             2.0.9                    pypi_0    pypi
[conda] torch-sparse              0.6.14                   pypi_0    pypi
[conda] torchaudio                0.12.0+cu113             pypi_0    pypi
[conda] torchvision               0.13.0+cu113             pypi_0    pypi

PyTorch

1.12.0

nvcc -V

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:15:46_PDT_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0

awer-A commented 8 months ago

@wwma Have you solved this problem now? I also encountered the same problem

wwma commented 8 months ago

> @wwma Have you solved this problem now? I also encountered the same problem

I solved it by following https://github.com/NVIDIA/apex/issues/1737#issuecomment-1762662648

draym28 commented 7 months ago

I solved this by following #1748.