NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License

Failing to install apex #1609

Open celsofranssa opened 1 year ago

celsofranssa commented 1 year ago
LightXML/apex$ pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
WARNING: Disabling all use of wheels due to the use of --build-option / --global-option / --install-option.
Using pip 22.0.2 from /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/pip (python 3.8)
Processing /home/celso/projects/LightXML/apex
  Running command python setup.py egg_info
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/home/celso/projects/LightXML/apex/setup.py", line 130, in <module>
      _, bare_metal_version = get_cuda_bare_metal_version(CUDA_HOME)
    File "/home/celso/projects/LightXML/apex/setup.py", line 17, in get_cuda_bare_metal_version
      raw_output = subprocess.check_output([cuda_dir + "/bin/nvcc", "-V"], universal_newlines=True)
  TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

  torch.__version__  = 1.5.1

  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /home/celso/projects/venvs/LightXML/bin/python -c '
  exec(compile('"'"''"'"''"'"'
  # This is <pip-setuptools-caller> -- a caller that pip uses to run setup.py
  #
  # - It imports setuptools before invoking setup.py, to enable projects that directly
  #   import from `distutils.core` to work with newer packaging standards.
  # - It provides a clear error message when setuptools is not installed.
  # - It sets `sys.argv[0]` to the underlying `setup.py`, when invoking `setup.py` so
  #   setuptools doesn'"'"'t think the script is `-c`. This avoids the following warning:
  #     manifest_maker: standard file '"'"'-c'"'"' not found".
  # - It generates a shim setup.py, for handling setup.cfg-only projects.
  import os, sys, tokenize

  try:
      import setuptools
  except ImportError as error:
      print(
          "ERROR: Can not execute `setup.py` since setuptools is not available in "
          "the build environment.",
          file=sys.stderr,
      )
      sys.exit(1)

  __file__ = %r
  sys.argv[0] = __file__

  if os.path.exists(__file__):
      filename = __file__
      with tokenize.open(__file__) as f:
          setup_py_code = f.read()
  else:
      filename = "<auto-generated setuptools caller>"
      setup_py_code = "from setuptools import setup; setup()"

  exec(compile(setup_py_code, filename, "exec"))
  '"'"''"'"''"'"' % ('"'"'/home/celso/projects/LightXML/apex/setup.py'"'"',), "<pip-setuptools-caller>", "exec"))' egg_info --egg-base /tmp/pip-pip-egg-info-_j2pmc0w
  cwd: /home/celso/projects/LightXML/apex/
  Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
crcrpar commented 1 year ago

torch.__version__ = 1.5.1

The installed PyTorch looks a bit too old.

celsofranssa commented 1 year ago

torch.__version__ = 1.5.1

The installed PyTorch looks a bit too old.

Which PyTorch version is recommended?

celsofranssa commented 1 year ago

Even with torch 1.13.1, apex failed to install.

crcrpar commented 1 year ago

Even with torch 1.13.1, apex failed to install.

Is it the same error message? Where is CUDA installed in your environment? If you are on Linux, could you try setting the CUDA_HOME environment variable to the CUDA installation path before installing apex?
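
For context, the TypeError in the first traceback comes from concatenating cuda_dir (which was None) with "/bin/nvcc". A minimal sketch of how a CUDA root could be resolved and guarded before that concatenation is shown below. The helper names find_cuda_home and nvcc_path are hypothetical, and this is neither apex's setup.py nor PyTorch's own lookup logic, only an illustration of the idea.

```python
import os
import shutil


def find_cuda_home(env=None, which=shutil.which):
    """Best-effort guess of the CUDA toolkit root; returns None if nothing is found.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    # 1. An explicitly set CUDA_HOME (or CUDA_PATH) takes priority.
    for var in ("CUDA_HOME", "CUDA_PATH"):
        value = env.get(var)
        if value:
            return value
    # 2. Otherwise derive the root from nvcc on PATH: <root>/bin/nvcc -> <root>.
    nvcc = which("nvcc")
    if nvcc:
        return os.path.dirname(os.path.dirname(os.path.realpath(nvcc)))
    return None


def nvcc_path(cuda_home):
    """Build the nvcc path, failing with a readable message instead of a TypeError."""
    if cuda_home is None:
        raise RuntimeError("CUDA_HOME is not set and nvcc was not found on PATH")
    return os.path.join(cuda_home, "bin", "nvcc")


if __name__ == "__main__":
    print("Detected CUDA root:", find_cuda_home())
```

If this prints None, exporting CUDA_HOME in the shell before running pip (for example to the directory that contains bin/nvcc) is the step suggested above.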

celsofranssa commented 1 year ago

The env:

$ echo $CUDA_HOME

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_

After updating torch to 1.13.1, I am still facing the following error:

$ pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
WARNING: Disabling all use of wheels due to the use of --build-option / --global-option / --install-option.
Using pip 22.0.2 from /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/pip (python 3.8)
Processing /home/celso/projects/LightXML/apex
  Running command python setup.py egg_info

  torch.__version__  = 1.13.1+cu117

  running egg_info
  creating /tmp/pip-pip-egg-info-0x15cma_/apex.egg-info
  writing /tmp/pip-pip-egg-info-0x15cma_/apex.egg-info/PKG-INFO
  writing dependency_links to /tmp/pip-pip-egg-info-0x15cma_/apex.egg-info/dependency_links.txt
  writing requirements to /tmp/pip-pip-egg-info-0x15cma_/apex.egg-info/requires.txt
  writing top-level names to /tmp/pip-pip-egg-info-0x15cma_/apex.egg-info/top_level.txt
  writing manifest file '/tmp/pip-pip-egg-info-0x15cma_/apex.egg-info/SOURCES.txt'
  reading manifest file '/tmp/pip-pip-egg-info-0x15cma_/apex.egg-info/SOURCES.txt'
  adding license file 'LICENSE'
  writing manifest file '/tmp/pip-pip-egg-info-0x15cma_/apex.egg-info/SOURCES.txt'
  Preparing metadata (setup.py) ... done
Requirement already satisfied: packaging>20.6 in /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages (from apex==0.1) (23.0)
Skipping wheel build for apex, due to binaries being disabled for it.
Installing collected packages: apex
  Running command Running setup.py install for apex

  torch.__version__  = 1.13.1+cu117

  Compiling cuda extensions with
  nvcc: NVIDIA (R) Cuda compiler driver
  Copyright (c) 2005-2022 NVIDIA Corporation
  Built on Wed_Jun__8_16:49:14_PDT_2022
  Cuda compilation tools, release 11.7, V11.7.99
  Build cuda_11.7.r11.7/compiler.31442593_0
  from /usr/local/cuda-11.7/bin

  running install
  /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
    warnings.warn(
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.8
  creating build/lib.linux-x86_64-3.8/apex
  copying apex/__init__.py -> build/lib.linux-x86_64-3.8/apex
  copying apex/_autocast_utils.py -> build/lib.linux-x86_64-3.8/apex
  creating build/lib.linux-x86_64-3.8/apex/parallel
  copying apex/parallel/multiproc.py -> build/lib.linux-x86_64-3.8/apex/parallel
  copying apex/parallel/__init__.py -> build/lib.linux-x86_64-3.8/apex/parallel
  copying apex/parallel/LARC.py -> build/lib.linux-x86_64-3.8/apex/parallel
  copying apex/parallel/distributed.py -> build/lib.linux-x86_64-3.8/apex/parallel
  copying apex/parallel/sync_batchnorm_kernel.py -> build/lib.linux-x86_64-3.8/apex/parallel
  copying apex/parallel/optimized_sync_batchnorm_kernel.py -> build/lib.linux-x86_64-3.8/apex/parallel
  copying apex/parallel/optimized_sync_batchnorm.py -> build/lib.linux-x86_64-3.8/apex/parallel
  copying apex/parallel/sync_batchnorm.py -> build/lib.linux-x86_64-3.8/apex/parallel
  creating build/lib.linux-x86_64-3.8/apex/normalization
  copying apex/normalization/__init__.py -> build/lib.linux-x86_64-3.8/apex/normalization
  copying apex/normalization/fused_layer_norm.py -> build/lib.linux-x86_64-3.8/apex/normalization
  creating build/lib.linux-x86_64-3.8/apex/fused_dense
  copying apex/fused_dense/fused_dense.py -> build/lib.linux-x86_64-3.8/apex/fused_dense
  copying apex/fused_dense/__init__.py -> build/lib.linux-x86_64-3.8/apex/fused_dense
  creating build/lib.linux-x86_64-3.8/apex/contrib
  copying apex/contrib/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib
  creating build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/compat.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/wrap.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/__init__.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/frontend.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/_amp_state.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/amp.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/_initialize.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/_process_optimizer.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/scaler.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/opt.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/utils.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/rnn_compat.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/__version__.py -> build/lib.linux-x86_64-3.8/apex/amp
  copying apex/amp/handle.py -> build/lib.linux-x86_64-3.8/apex/amp
  creating build/lib.linux-x86_64-3.8/apex/fp16_utils
  copying apex/fp16_utils/loss_scaler.py -> build/lib.linux-x86_64-3.8/apex/fp16_utils
  copying apex/fp16_utils/__init__.py -> build/lib.linux-x86_64-3.8/apex/fp16_utils
  copying apex/fp16_utils/fp16_optimizer.py -> build/lib.linux-x86_64-3.8/apex/fp16_utils
  copying apex/fp16_utils/fp16util.py -> build/lib.linux-x86_64-3.8/apex/fp16_utils
  creating build/lib.linux-x86_64-3.8/apex/optimizers
  copying apex/optimizers/fused_adam.py -> build/lib.linux-x86_64-3.8/apex/optimizers
  copying apex/optimizers/fused_lamb.py -> build/lib.linux-x86_64-3.8/apex/optimizers
  copying apex/optimizers/__init__.py -> build/lib.linux-x86_64-3.8/apex/optimizers
  copying apex/optimizers/fused_mixed_precision_lamb.py -> build/lib.linux-x86_64-3.8/apex/optimizers
  copying apex/optimizers/fused_novograd.py -> build/lib.linux-x86_64-3.8/apex/optimizers
  copying apex/optimizers/fused_sgd.py -> build/lib.linux-x86_64-3.8/apex/optimizers
  copying apex/optimizers/fused_adagrad.py -> build/lib.linux-x86_64-3.8/apex/optimizers
  creating build/lib.linux-x86_64-3.8/apex/RNN
  copying apex/RNN/__init__.py -> build/lib.linux-x86_64-3.8/apex/RNN
  copying apex/RNN/cells.py -> build/lib.linux-x86_64-3.8/apex/RNN
  copying apex/RNN/RNNBackend.py -> build/lib.linux-x86_64-3.8/apex/RNN
  copying apex/RNN/models.py -> build/lib.linux-x86_64-3.8/apex/RNN
  creating build/lib.linux-x86_64-3.8/apex/mlp
  copying apex/mlp/__init__.py -> build/lib.linux-x86_64-3.8/apex/mlp
  copying apex/mlp/mlp.py -> build/lib.linux-x86_64-3.8/apex/mlp
  creating build/lib.linux-x86_64-3.8/apex/transformer
  copying apex/transformer/__init__.py -> build/lib.linux-x86_64-3.8/apex/transformer
  copying apex/transformer/log_util.py -> build/lib.linux-x86_64-3.8/apex/transformer
  copying apex/transformer/microbatches.py -> build/lib.linux-x86_64-3.8/apex/transformer
  copying apex/transformer/_ucc_util.py -> build/lib.linux-x86_64-3.8/apex/transformer
  copying apex/transformer/utils.py -> build/lib.linux-x86_64-3.8/apex/transformer
  copying apex/transformer/enums.py -> build/lib.linux-x86_64-3.8/apex/transformer
  copying apex/transformer/parallel_state.py -> build/lib.linux-x86_64-3.8/apex/transformer
  creating build/lib.linux-x86_64-3.8/apex/multi_tensor_apply
  copying apex/multi_tensor_apply/multi_tensor_apply.py -> build/lib.linux-x86_64-3.8/apex/multi_tensor_apply
  copying apex/multi_tensor_apply/__init__.py -> build/lib.linux-x86_64-3.8/apex/multi_tensor_apply
  creating build/lib.linux-x86_64-3.8/apex/contrib/sparsity
  copying apex/contrib/sparsity/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/sparsity
  copying apex/contrib/sparsity/permutation_lib.py -> build/lib.linux-x86_64-3.8/apex/contrib/sparsity
  copying apex/contrib/sparsity/asp.py -> build/lib.linux-x86_64-3.8/apex/contrib/sparsity
  copying apex/contrib/sparsity/sparse_masklib.py -> build/lib.linux-x86_64-3.8/apex/contrib/sparsity
  creating build/lib.linux-x86_64-3.8/apex/contrib/clip_grad
  copying apex/contrib/clip_grad/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/clip_grad
  copying apex/contrib/clip_grad/clip_grad.py -> build/lib.linux-x86_64-3.8/apex/contrib/clip_grad
  creating build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/encdec_multihead_attn.py -> build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/mask_softmax_dropout_func.py -> build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/fast_self_multihead_attn_func.py -> build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/encdec_multihead_attn_func.py -> build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/self_multihead_attn_func.py -> build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/fast_encdec_multihead_attn_norm_add_func.py -> build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/self_multihead_attn.py -> build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/fast_encdec_multihead_attn_func.py -> build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/fast_self_multihead_attn_norm_add_func.py -> build/lib.linux-x86_64-3.8/apex/contrib/multihead_attn
  creating build/lib.linux-x86_64-3.8/apex/contrib/transducer
  copying apex/contrib/transducer/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/transducer
  copying apex/contrib/transducer/transducer.py -> build/lib.linux-x86_64-3.8/apex/contrib/transducer
  copying apex/contrib/transducer/_transducer_ref.py -> build/lib.linux-x86_64-3.8/apex/contrib/transducer
  creating build/lib.linux-x86_64-3.8/apex/contrib/conv_bias_relu
  copying apex/contrib/conv_bias_relu/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/conv_bias_relu
  copying apex/contrib/conv_bias_relu/conv_bias_relu.py -> build/lib.linux-x86_64-3.8/apex/contrib/conv_bias_relu
  creating build/lib.linux-x86_64-3.8/apex/contrib/bottleneck
  copying apex/contrib/bottleneck/bottleneck.py -> build/lib.linux-x86_64-3.8/apex/contrib/bottleneck
  copying apex/contrib/bottleneck/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/bottleneck
  copying apex/contrib/bottleneck/halo_exchangers.py -> build/lib.linux-x86_64-3.8/apex/contrib/bottleneck
  copying apex/contrib/bottleneck/test.py -> build/lib.linux-x86_64-3.8/apex/contrib/bottleneck
  creating build/lib.linux-x86_64-3.8/apex/contrib/xentropy
  copying apex/contrib/xentropy/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/xentropy
  copying apex/contrib/xentropy/softmax_xentropy.py -> build/lib.linux-x86_64-3.8/apex/contrib/xentropy
  creating build/lib.linux-x86_64-3.8/apex/contrib/focal_loss
  copying apex/contrib/focal_loss/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/focal_loss
  copying apex/contrib/focal_loss/focal_loss.py -> build/lib.linux-x86_64-3.8/apex/contrib/focal_loss
  creating build/lib.linux-x86_64-3.8/apex/contrib/test
  copying apex/contrib/test/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test
  creating build/lib.linux-x86_64-3.8/apex/contrib/optimizers
  copying apex/contrib/optimizers/fused_adam.py -> build/lib.linux-x86_64-3.8/apex/contrib/optimizers
  copying apex/contrib/optimizers/fused_lamb.py -> build/lib.linux-x86_64-3.8/apex/contrib/optimizers
  copying apex/contrib/optimizers/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/optimizers
  copying apex/contrib/optimizers/fp16_optimizer.py -> build/lib.linux-x86_64-3.8/apex/contrib/optimizers
  copying apex/contrib/optimizers/distributed_fused_lamb.py -> build/lib.linux-x86_64-3.8/apex/contrib/optimizers
  copying apex/contrib/optimizers/fused_sgd.py -> build/lib.linux-x86_64-3.8/apex/contrib/optimizers
  copying apex/contrib/optimizers/distributed_fused_adam.py -> build/lib.linux-x86_64-3.8/apex/contrib/optimizers
  creating build/lib.linux-x86_64-3.8/apex/contrib/index_mul_2d
  copying apex/contrib/index_mul_2d/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/index_mul_2d
  copying apex/contrib/index_mul_2d/index_mul_2d.py -> build/lib.linux-x86_64-3.8/apex/contrib/index_mul_2d
  creating build/lib.linux-x86_64-3.8/apex/contrib/layer_norm
  copying apex/contrib/layer_norm/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/layer_norm
  copying apex/contrib/layer_norm/layer_norm.py -> build/lib.linux-x86_64-3.8/apex/contrib/layer_norm
  creating build/lib.linux-x86_64-3.8/apex/contrib/cudnn_gbn
  copying apex/contrib/cudnn_gbn/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/cudnn_gbn
  copying apex/contrib/cudnn_gbn/batch_norm.py -> build/lib.linux-x86_64-3.8/apex/contrib/cudnn_gbn
  creating build/lib.linux-x86_64-3.8/apex/contrib/fmha
  copying apex/contrib/fmha/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/fmha
  copying apex/contrib/fmha/fmha.py -> build/lib.linux-x86_64-3.8/apex/contrib/fmha
  creating build/lib.linux-x86_64-3.8/apex/contrib/groupbn
  copying apex/contrib/groupbn/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/groupbn
  copying apex/contrib/groupbn/batch_norm.py -> build/lib.linux-x86_64-3.8/apex/contrib/groupbn
  creating build/lib.linux-x86_64-3.8/apex/contrib/peer_memory
  copying apex/contrib/peer_memory/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/peer_memory
  copying apex/contrib/peer_memory/peer_memory.py -> build/lib.linux-x86_64-3.8/apex/contrib/peer_memory
  copying apex/contrib/peer_memory/peer_halo_exchanger_1d.py -> build/lib.linux-x86_64-3.8/apex/contrib/peer_memory
  creating build/lib.linux-x86_64-3.8/apex/contrib/sparsity/permutation_search_kernels
  copying apex/contrib/sparsity/permutation_search_kernels/exhaustive_search.py -> build/lib.linux-x86_64-3.8/apex/contrib/sparsity/permutation_search_kernels
  copying apex/contrib/sparsity/permutation_search_kernels/permutation_utilities.py -> build/lib.linux-x86_64-3.8/apex/contrib/sparsity/permutation_search_kernels
  copying apex/contrib/sparsity/permutation_search_kernels/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/sparsity/permutation_search_kernels
  copying apex/contrib/sparsity/permutation_search_kernels/channel_swap.py -> build/lib.linux-x86_64-3.8/apex/contrib/sparsity/permutation_search_kernels
  copying apex/contrib/sparsity/permutation_search_kernels/call_permutation_search_kernels.py -> build/lib.linux-x86_64-3.8/apex/contrib/sparsity/permutation_search_kernels
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/clip_grad
  copying apex/contrib/test/clip_grad/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/clip_grad
  copying apex/contrib/test/clip_grad/test_clip_grad.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/clip_grad
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_self_multihead_attn.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_fast_self_multihead_attn_bias.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_self_multihead_attn_norm_add.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_encdec_multihead_attn.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_mha_fused_softmax.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_encdec_multihead_attn_norm_add.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/multihead_attn
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/transducer
  copying apex/contrib/test/transducer/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/transducer
  copying apex/contrib/test/transducer/test_transducer_loss.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/transducer
  copying apex/contrib/test/transducer/test_transducer_joint.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/transducer
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/conv_bias_relu
  copying apex/contrib/test/conv_bias_relu/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/conv_bias_relu
  copying apex/contrib/test/conv_bias_relu/test_conv_bias_relu.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/conv_bias_relu
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/bottleneck
  copying apex/contrib/test/bottleneck/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/bottleneck
  copying apex/contrib/test/bottleneck/test_bottleneck_module.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/bottleneck
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/xentropy
  copying apex/contrib/test/xentropy/test_label_smoothing.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/xentropy
  copying apex/contrib/test/xentropy/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/xentropy
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/focal_loss
  copying apex/contrib/test/focal_loss/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/focal_loss
  copying apex/contrib/test/focal_loss/test_focal_loss.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/focal_loss
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/optimizers
  copying apex/contrib/test/optimizers/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/optimizers
  copying apex/contrib/test/optimizers/test_dist_adam.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/optimizers
  copying apex/contrib/test/optimizers/test_distributed_fused_lamb.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/optimizers
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/index_mul_2d
  copying apex/contrib/test/index_mul_2d/test_index_mul_2d.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/index_mul_2d
  copying apex/contrib/test/index_mul_2d/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/index_mul_2d
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/layer_norm
  copying apex/contrib/test/layer_norm/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/layer_norm
  copying apex/contrib/test/layer_norm/test_fast_layer_norm.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/layer_norm
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/cudnn_gbn
  copying apex/contrib/test/cudnn_gbn/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/cudnn_gbn
  copying apex/contrib/test/cudnn_gbn/test_cudnn_gbn_with_two_gpus.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/cudnn_gbn
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/fmha
  copying apex/contrib/test/fmha/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/fmha
  copying apex/contrib/test/fmha/test_fmha.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/fmha
  creating build/lib.linux-x86_64-3.8/apex/contrib/test/peer_memory
  copying apex/contrib/test/peer_memory/__init__.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/peer_memory
  copying apex/contrib/test/peer_memory/test_peer_halo_exchange_module.py -> build/lib.linux-x86_64-3.8/apex/contrib/test/peer_memory
  creating build/lib.linux-x86_64-3.8/apex/amp/lists
  copying apex/amp/lists/functional_overrides.py -> build/lib.linux-x86_64-3.8/apex/amp/lists
  copying apex/amp/lists/__init__.py -> build/lib.linux-x86_64-3.8/apex/amp/lists
  copying apex/amp/lists/tensor_overrides.py -> build/lib.linux-x86_64-3.8/apex/amp/lists
  copying apex/amp/lists/torch_overrides.py -> build/lib.linux-x86_64-3.8/apex/amp/lists
  creating build/lib.linux-x86_64-3.8/apex/transformer/layers
  copying apex/transformer/layers/__init__.py -> build/lib.linux-x86_64-3.8/apex/transformer/layers
  copying apex/transformer/layers/layer_norm.py -> build/lib.linux-x86_64-3.8/apex/transformer/layers
  creating build/lib.linux-x86_64-3.8/apex/transformer/amp
  copying apex/transformer/amp/__init__.py -> build/lib.linux-x86_64-3.8/apex/transformer/amp
  copying apex/transformer/amp/grad_scaler.py -> build/lib.linux-x86_64-3.8/apex/transformer/amp
  creating build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel
  copying apex/transformer/pipeline_parallel/p2p_communication.py -> build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel
  copying apex/transformer/pipeline_parallel/__init__.py -> build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel
  copying apex/transformer/pipeline_parallel/utils.py -> build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel
  copying apex/transformer/pipeline_parallel/_timers.py -> build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel
  creating build/lib.linux-x86_64-3.8/apex/transformer/functional
  copying apex/transformer/functional/__init__.py -> build/lib.linux-x86_64-3.8/apex/transformer/functional
  copying apex/transformer/functional/fused_softmax.py -> build/lib.linux-x86_64-3.8/apex/transformer/functional
  creating build/lib.linux-x86_64-3.8/apex/transformer/_data
  copying apex/transformer/_data/__init__.py -> build/lib.linux-x86_64-3.8/apex/transformer/_data
  copying apex/transformer/_data/_batchsampler.py -> build/lib.linux-x86_64-3.8/apex/transformer/_data
  creating build/lib.linux-x86_64-3.8/apex/transformer/testing
  copying apex/transformer/testing/standalone_gpt.py -> build/lib.linux-x86_64-3.8/apex/transformer/testing
  copying apex/transformer/testing/__init__.py -> build/lib.linux-x86_64-3.8/apex/transformer/testing
  copying apex/transformer/testing/commons.py -> build/lib.linux-x86_64-3.8/apex/transformer/testing
  copying apex/transformer/testing/standalone_transformer_lm.py -> build/lib.linux-x86_64-3.8/apex/transformer/testing
  copying apex/transformer/testing/standalone_bert.py -> build/lib.linux-x86_64-3.8/apex/transformer/testing
  copying apex/transformer/testing/distributed_test_base.py -> build/lib.linux-x86_64-3.8/apex/transformer/testing
  copying apex/transformer/testing/global_vars.py -> build/lib.linux-x86_64-3.8/apex/transformer/testing
  copying apex/transformer/testing/arguments.py -> build/lib.linux-x86_64-3.8/apex/transformer/testing
  creating build/lib.linux-x86_64-3.8/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/memory.py -> build/lib.linux-x86_64-3.8/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/__init__.py -> build/lib.linux-x86_64-3.8/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/data.py -> build/lib.linux-x86_64-3.8/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/layers.py -> build/lib.linux-x86_64-3.8/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/utils.py -> build/lib.linux-x86_64-3.8/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/mappings.py -> build/lib.linux-x86_64-3.8/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/cross_entropy.py -> build/lib.linux-x86_64-3.8/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/random.py -> build/lib.linux-x86_64-3.8/apex/transformer/tensor_parallel
  creating build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel/schedules
  copying apex/transformer/pipeline_parallel/schedules/__init__.py -> build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel/schedules
  copying apex/transformer/pipeline_parallel/schedules/fwd_bwd_pipelining_without_interleaving.py -> build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel/schedules
  copying apex/transformer/pipeline_parallel/schedules/fwd_bwd_no_pipelining.py -> build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel/schedules
  copying apex/transformer/pipeline_parallel/schedules/common.py -> build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel/schedules
  copying apex/transformer/pipeline_parallel/schedules/fwd_bwd_pipelining_with_interleaving.py -> build/lib.linux-x86_64-3.8/apex/transformer/pipeline_parallel/schedules
  running build_ext
  /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/utils/cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
    warnings.warn(msg.format('we could not find ninja.'))
  building 'apex_C' extension
  creating build/temp.linux-x86_64-3.8
  creating build/temp.linux-x86_64-3.8/csrc
  x86_64-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/include -I/home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/include/TH -I/home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/include/THC -I/home/celso/projects/venvs/LightXML/include -I/usr/include/python3.8 -c csrc/flatten_unflatten.cpp -o build/temp.linux-x86_64-3.8/csrc/flatten_unflatten.o -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=apex_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
  In file included from /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/include/torch/csrc/Device.h:4,
                   from /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
                   from /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/include/torch/extension.h:6,
                   from csrc/flatten_unflatten.cpp:1:
  /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/include/torch/csrc/python_headers.h:12:10: fatal error: Python.h: No such file or directory
     12 | #include <Python.h>
        |          ^~~~~~~~~~
  compilation terminated.
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
  error: subprocess-exited-with-error

  × Running setup.py install for apex did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /home/celso/projects/venvs/LightXML/bin/python -u -c '
  exec(compile('"'"''"'"''"'"'
  # This is <pip-setuptools-caller> -- a caller that pip uses to run setup.py
  #
  # - It imports setuptools before invoking setup.py, to enable projects that directly
  #   import from `distutils.core` to work with newer packaging standards.
  # - It provides a clear error message when setuptools is not installed.
  # - It sets `sys.argv[0]` to the underlying `setup.py`, when invoking `setup.py` so
  #   setuptools doesn'"'"'t think the script is `-c`. This avoids the following warning:
  #     manifest_maker: standard file '"'"'-c'"'"' not found".
  # - It generates a shim setup.py, for handling setup.cfg-only projects.
  import os, sys, tokenize

  try:
      import setuptools
  except ImportError as error:
      print(
          "ERROR: Can not execute `setup.py` since setuptools is not available in "
          "the build environment.",
          file=sys.stderr,
      )
      sys.exit(1)

  __file__ = %r
  sys.argv[0] = __file__

  if os.path.exists(__file__):
      filename = __file__
      with tokenize.open(__file__) as f:
          setup_py_code = f.read()
  else:
      filename = "<auto-generated setuptools caller>"
      setup_py_code = "from setuptools import setup; setup()"

  exec(compile(setup_py_code, filename, "exec"))
  '"'"''"'"''"'"' % ('"'"'/home/celso/projects/LightXML/apex/setup.py'"'"',), "<pip-setuptools-caller>", "exec"))' --cpp_ext --cuda_ext install --record /tmp/pip-record-dr0rye0l/install-record.txt --single-version-externally-managed --compile --install-headers /home/celso/projects/venvs/LightXML/include/site/python3.8/apex
  cwd: /home/celso/projects/LightXML/apex/
  Running setup.py install for apex ... error
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> apex

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
crcrpar commented 1 year ago
  /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/include/torch/csrc/python_headers.h:12:10: fatal error: Python.h: No such file or directory
     12 | #include <Python.h>
        |          ^~~~~~~~~~
  compilation terminated.
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
  error: subprocess-exited-with-error

Could you confirm whether this header file is available in your environment, and whether its directory is visible to the compiler?
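
One way to check this from Python is sketched below. The function name find_python_header is hypothetical; the sketch only asks the standard sysconfig module where the running interpreter expects its C headers and looks for Python.h there. It does not inspect the compiler's full search path, so it is a quick sanity check rather than a complete diagnosis.

```python
import os
import sysconfig


def find_python_header(include_dirs=None):
    """Return the first existing path to Python.h, or None if it is missing.

    Hypothetical helper: by default it checks the header directories that
    sysconfig reports for the running interpreter.
    """
    if include_dirs is None:
        paths = sysconfig.get_paths()
        include_dirs = [paths["include"], paths["platinclude"]]
    for directory in include_dirs:
        candidate = os.path.join(directory, "Python.h")
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    header = find_python_header()
    print(header or "Python.h not found in the interpreter's header directories")
```

Run it with the same interpreter that pip uses (here, the one inside the LightXML virtual environment) so that the reported directories match the build.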

celsofranssa commented 1 year ago
  /home/celso/projects/venvs/LightXML/lib/python3.8/site-packages/torch/include/torch/csrc/python_headers.h:12:10: fatal error: Python.h: No such file or directory
     12 | #include <Python.h>
        |          ^~~~~~~~~~
  compilation terminated.
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
  error: subprocess-exited-with-error

Could you confirm whether this header file is available in your environment, and whether its directory is visible to the compiler?

I was able to install apex after reinstalling CUDA through the package manager installation (instead of the runfile) and reinstalling Python with the dev package (sudo apt-get install python3.x-dev).

hubutui commented 1 year ago

Can you find Python.h in /home/celso/projects/venvs/LightXML? Note that sudo apt-get install python3.x-dev installs system-wide, not into your Python virtual env.
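
The gcc command in the log above passes both -I/home/celso/projects/venvs/LightXML/include and -I/usr/include/python3.8, so it can help to see which prefixes and header directory the interpreter itself reports. The sketch below is a hypothetical helper (describe_environment is not part of any library) that uses only sys and sysconfig. Whether the reported header directory sits under the virtual environment or under the base installation is something to verify on the machine in question rather than assume.

```python
import sys
import sysconfig


def describe_environment():
    """Summarize the interpreter's prefixes and its reported C header directory.

    Hypothetical helper for illustration only.
    """
    return {
        # True when running inside a venv-style virtual environment.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "env_prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        # Directory where this interpreter expects Python.h to live.
        "header_dir": sysconfig.get_paths()["include"],
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Comparing header_dir with env_prefix and base_prefix shows which installation the build would take Python.h from.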