NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License

Installation instructions don't build/install the C modules #1763

Open zxti opened 11 months ago

zxti commented 11 months ago

Describe the Bug

Minimal Steps/Code to Reproduce the Bug

git clone https://github.com/NVIDIA/apex
cd apex
# if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key... 
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
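The comment above ties the command to pip 23.1 or newer, where repeated `--config-settings` entries with the same key are accepted. As a rough sketch of that version check (the helper name `supports_repeated_config_settings` is hypothetical and not part of pip or apex), the condition could be expressed as:

```python
import re


def supports_repeated_config_settings(pip_version: str) -> bool:
    """Return True if the given pip version string is 23.1 or newer.

    Hypothetical helper: it only compares the leading "major.minor" part,
    based on the pip 23.1 note referenced in the install instructions.
    """
    match = re.match(r"(\d+)\.(\d+)", pip_version)
    if match is None:
        raise ValueError(f"unrecognized pip version: {pip_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (23, 1)


if __name__ == "__main__":
    # Version string taken from the log in this report ("Using pip 23.3.2").
    print(supports_repeated_config_settings("23.3.2"))
```

The pip version used in this report (23.3.2) passes the check, so an outdated pip is unlikely to explain the behavior described here.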

Expected Behavior: The native C++/CUDA extension modules specified in setup.py (amp_C, flash_attn_2_cuda, fused_layer_norm_cuda, etc.) should all be built and installed.
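One way to confirm whether the native modules actually ended up in the environment is to ask Python whether it can locate them. The following is a minimal sketch, not part of apex: the helper name `missing_extensions` is made up for illustration, and the module names are the ones mentioned in this report. It uses `importlib.util.find_spec`, which only locates a module and does not import it, so it does not need a GPU or a prior `import torch`.

```python
import importlib.util


def missing_extensions(names):
    """Return the names from `names` that cannot be located as importable modules."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat lookup failures the same as "not found".
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names taken from this report; adjust to the extensions you expect.
    expected = ["amp_C", "fused_layer_norm_cuda"]
    print("missing:", missing_extensions(expected))
```

If the list printed after the install is non-empty, the corresponding extensions were not installed, which matches the behavior described in this issue.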

Actual Behavior: The C modules are not built or installed. The build output only shows Python files being copied:

root@ebd430064a0b:/tmp/pip-req-build-brdy43s6# pip install --no-clean -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings --build-option=--cpp_ext --config-settings --build-option=--cuda_ext .
Using pip 23.3.2 from /usr/local/lib/python3.8/dist-packages/pip (python 3.8)
Processing /tmp/pip-req-build-brdy43s6
  Running command Preparing metadata (pyproject.toml)

  torch.__version__  = 1.13.0+cu117

  ! ['/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py', 'dist_info', '--egg-base', '/tmp/pip-modern-metadata-j1gg4op9']
  running dist_info
  creating /tmp/pip-modern-metadata-j1gg4op9/apex.egg-info
  writing /tmp/pip-modern-metadata-j1gg4op9/apex.egg-info/PKG-INFO
  writing dependency_links to /tmp/pip-modern-metadata-j1gg4op9/apex.egg-info/dependency_links.txt
  writing requirements to /tmp/pip-modern-metadata-j1gg4op9/apex.egg-info/requires.txt
  writing top-level names to /tmp/pip-modern-metadata-j1gg4op9/apex.egg-info/top_level.txt
  writing manifest file '/tmp/pip-modern-metadata-j1gg4op9/apex.egg-info/SOURCES.txt'
  reading manifest file '/tmp/pip-modern-metadata-j1gg4op9/apex.egg-info/SOURCES.txt'
  writing manifest file '/tmp/pip-modern-metadata-j1gg4op9/apex.egg-info/SOURCES.txt'
  creating '/tmp/pip-modern-metadata-j1gg4op9/apex.dist-info'
  adding license file "LICENSE" (matched pattern "LICEN[CS]E*")
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: packaging>20.6 in /usr/local/lib/python3.8/dist-packages (from apex==0.1) (23.2)
Building wheels for collected packages: apex
  Running command Building wheel for apex (pyproject.toml)

  torch.__version__  = 1.13.0+cu117

  ! ['/usr/local/lib/python3.8/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py', 'bdist_wheel', '--dist-dir', '/tmp/pip-wheel-z6ommft2/tmpu15x04qw']
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib
  creating build/lib/apex
  copying apex/__init__.py -> build/lib/apex
  copying apex/_autocast_utils.py -> build/lib/apex
  creating build/lib/apex/normalization
  copying apex/normalization/__init__.py -> build/lib/apex/normalization
  copying apex/normalization/fused_layer_norm.py -> build/lib/apex/normalization
  creating build/lib/apex/fp16_utils
  copying apex/fp16_utils/loss_scaler.py -> build/lib/apex/fp16_utils
  copying apex/fp16_utils/fp16util.py -> build/lib/apex/fp16_utils
  copying apex/fp16_utils/__init__.py -> build/lib/apex/fp16_utils
  copying apex/fp16_utils/fp16_optimizer.py -> build/lib/apex/fp16_utils
  creating build/lib/apex/amp
  copying apex/amp/wrap.py -> build/lib/apex/amp
  copying apex/amp/frontend.py -> build/lib/apex/amp
  copying apex/amp/handle.py -> build/lib/apex/amp
  copying apex/amp/__init__.py -> build/lib/apex/amp
  copying apex/amp/_amp_state.py -> build/lib/apex/amp
  copying apex/amp/opt.py -> build/lib/apex/amp
  copying apex/amp/compat.py -> build/lib/apex/amp
  copying apex/amp/__version__.py -> build/lib/apex/amp
  copying apex/amp/utils.py -> build/lib/apex/amp
  copying apex/amp/_initialize.py -> build/lib/apex/amp
  copying apex/amp/_process_optimizer.py -> build/lib/apex/amp
  copying apex/amp/amp.py -> build/lib/apex/amp
  copying apex/amp/rnn_compat.py -> build/lib/apex/amp
  copying apex/amp/scaler.py -> build/lib/apex/amp
  creating build/lib/apex/optimizers
  copying apex/optimizers/fused_adagrad.py -> build/lib/apex/optimizers
  copying apex/optimizers/fused_lamb.py -> build/lib/apex/optimizers
  copying apex/optimizers/fused_sgd.py -> build/lib/apex/optimizers
  copying apex/optimizers/__init__.py -> build/lib/apex/optimizers
  copying apex/optimizers/fused_adam.py -> build/lib/apex/optimizers
  copying apex/optimizers/fused_mixed_precision_lamb.py -> build/lib/apex/optimizers
  copying apex/optimizers/fused_novograd.py -> build/lib/apex/optimizers
  creating build/lib/apex/parallel
  copying apex/parallel/optimized_sync_batchnorm.py -> build/lib/apex/parallel
  copying apex/parallel/sync_batchnorm.py -> build/lib/apex/parallel
  copying apex/parallel/distributed.py -> build/lib/apex/parallel
  copying apex/parallel/__init__.py -> build/lib/apex/parallel
  copying apex/parallel/sync_batchnorm_kernel.py -> build/lib/apex/parallel
  copying apex/parallel/optimized_sync_batchnorm_kernel.py -> build/lib/apex/parallel
  copying apex/parallel/multiproc.py -> build/lib/apex/parallel
  copying apex/parallel/LARC.py -> build/lib/apex/parallel
  creating build/lib/apex/multi_tensor_apply
  copying apex/multi_tensor_apply/__init__.py -> build/lib/apex/multi_tensor_apply
  copying apex/multi_tensor_apply/multi_tensor_apply.py -> build/lib/apex/multi_tensor_apply
  creating build/lib/apex/fused_dense
  copying apex/fused_dense/__init__.py -> build/lib/apex/fused_dense
  copying apex/fused_dense/fused_dense.py -> build/lib/apex/fused_dense
  creating build/lib/apex/mlp
  copying apex/mlp/mlp.py -> build/lib/apex/mlp
  copying apex/mlp/__init__.py -> build/lib/apex/mlp
  creating build/lib/apex/contrib
  copying apex/contrib/__init__.py -> build/lib/apex/contrib
  creating build/lib/apex/RNN
  copying apex/RNN/models.py -> build/lib/apex/RNN
  copying apex/RNN/RNNBackend.py -> build/lib/apex/RNN
  copying apex/RNN/__init__.py -> build/lib/apex/RNN
  copying apex/RNN/cells.py -> build/lib/apex/RNN
  creating build/lib/apex/transformer
  copying apex/transformer/microbatches.py -> build/lib/apex/transformer
  copying apex/transformer/log_util.py -> build/lib/apex/transformer
  copying apex/transformer/__init__.py -> build/lib/apex/transformer
  copying apex/transformer/parallel_state.py -> build/lib/apex/transformer
  copying apex/transformer/utils.py -> build/lib/apex/transformer
  copying apex/transformer/_ucc_util.py -> build/lib/apex/transformer
  copying apex/transformer/enums.py -> build/lib/apex/transformer
  creating build/lib/apex/amp/lists
  copying apex/amp/lists/tensor_overrides.py -> build/lib/apex/amp/lists
  copying apex/amp/lists/torch_overrides.py -> build/lib/apex/amp/lists
  copying apex/amp/lists/functional_overrides.py -> build/lib/apex/amp/lists
  copying apex/amp/lists/__init__.py -> build/lib/apex/amp/lists
  creating build/lib/apex/contrib/xentropy
  copying apex/contrib/xentropy/__init__.py -> build/lib/apex/contrib/xentropy
  copying apex/contrib/xentropy/softmax_xentropy.py -> build/lib/apex/contrib/xentropy
  creating build/lib/apex/contrib/clip_grad
  copying apex/contrib/clip_grad/clip_grad.py -> build/lib/apex/contrib/clip_grad
  copying apex/contrib/clip_grad/__init__.py -> build/lib/apex/contrib/clip_grad
  creating build/lib/apex/contrib/transducer
  copying apex/contrib/transducer/_transducer_ref.py -> build/lib/apex/contrib/transducer
  copying apex/contrib/transducer/__init__.py -> build/lib/apex/contrib/transducer
  copying apex/contrib/transducer/transducer.py -> build/lib/apex/contrib/transducer
  creating build/lib/apex/contrib/test
  copying apex/contrib/test/__init__.py -> build/lib/apex/contrib/test
  creating build/lib/apex/contrib/cudnn_gbn
  copying apex/contrib/cudnn_gbn/batch_norm.py -> build/lib/apex/contrib/cudnn_gbn
  copying apex/contrib/cudnn_gbn/__init__.py -> build/lib/apex/contrib/cudnn_gbn
  creating build/lib/apex/contrib/conv_bias_relu
  copying apex/contrib/conv_bias_relu/conv_bias_relu.py -> build/lib/apex/contrib/conv_bias_relu
  copying apex/contrib/conv_bias_relu/__init__.py -> build/lib/apex/contrib/conv_bias_relu
  creating build/lib/apex/contrib/fmha
  copying apex/contrib/fmha/__init__.py -> build/lib/apex/contrib/fmha
  copying apex/contrib/fmha/fmha.py -> build/lib/apex/contrib/fmha
  creating build/lib/apex/contrib/optimizers
  copying apex/contrib/optimizers/fused_lamb.py -> build/lib/apex/contrib/optimizers
  copying apex/contrib/optimizers/fused_sgd.py -> build/lib/apex/contrib/optimizers
  copying apex/contrib/optimizers/__init__.py -> build/lib/apex/contrib/optimizers
  copying apex/contrib/optimizers/fused_adam.py -> build/lib/apex/contrib/optimizers
  copying apex/contrib/optimizers/fp16_optimizer.py -> build/lib/apex/contrib/optimizers
  copying apex/contrib/optimizers/distributed_fused_lamb.py -> build/lib/apex/contrib/optimizers
  copying apex/contrib/optimizers/distributed_fused_adam.py -> build/lib/apex/contrib/optimizers
  creating build/lib/apex/contrib/bottleneck
  copying apex/contrib/bottleneck/bottleneck.py -> build/lib/apex/contrib/bottleneck
  copying apex/contrib/bottleneck/test.py -> build/lib/apex/contrib/bottleneck
  copying apex/contrib/bottleneck/__init__.py -> build/lib/apex/contrib/bottleneck
  copying apex/contrib/bottleneck/halo_exchangers.py -> build/lib/apex/contrib/bottleneck
  creating build/lib/apex/contrib/index_mul_2d
  copying apex/contrib/index_mul_2d/__init__.py -> build/lib/apex/contrib/index_mul_2d
  copying apex/contrib/index_mul_2d/index_mul_2d.py -> build/lib/apex/contrib/index_mul_2d
  creating build/lib/apex/contrib/sparsity
  copying apex/contrib/sparsity/sparse_masklib.py -> build/lib/apex/contrib/sparsity
  copying apex/contrib/sparsity/permutation_lib.py -> build/lib/apex/contrib/sparsity
  copying apex/contrib/sparsity/__init__.py -> build/lib/apex/contrib/sparsity
  copying apex/contrib/sparsity/asp.py -> build/lib/apex/contrib/sparsity
  creating build/lib/apex/contrib/groupbn
  copying apex/contrib/groupbn/batch_norm.py -> build/lib/apex/contrib/groupbn
  copying apex/contrib/groupbn/__init__.py -> build/lib/apex/contrib/groupbn
  creating build/lib/apex/contrib/layer_norm
  copying apex/contrib/layer_norm/__init__.py -> build/lib/apex/contrib/layer_norm
  copying apex/contrib/layer_norm/layer_norm.py -> build/lib/apex/contrib/layer_norm
  creating build/lib/apex/contrib/peer_memory
  copying apex/contrib/peer_memory/peer_halo_exchanger_1d.py -> build/lib/apex/contrib/peer_memory
  copying apex/contrib/peer_memory/peer_memory.py -> build/lib/apex/contrib/peer_memory
  copying apex/contrib/peer_memory/__init__.py -> build/lib/apex/contrib/peer_memory
  creating build/lib/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/self_multihead_attn_func.py -> build/lib/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/fast_self_multihead_attn_norm_add_func.py -> build/lib/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/fast_encdec_multihead_attn_norm_add_func.py -> build/lib/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/encdec_multihead_attn_func.py -> build/lib/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/fast_self_multihead_attn_func.py -> build/lib/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/__init__.py -> build/lib/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/mask_softmax_dropout_func.py -> build/lib/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/fast_encdec_multihead_attn_func.py -> build/lib/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/encdec_multihead_attn.py -> build/lib/apex/contrib/multihead_attn
  copying apex/contrib/multihead_attn/self_multihead_attn.py -> build/lib/apex/contrib/multihead_attn
  creating build/lib/apex/contrib/focal_loss
  copying apex/contrib/focal_loss/__init__.py -> build/lib/apex/contrib/focal_loss
  copying apex/contrib/focal_loss/focal_loss.py -> build/lib/apex/contrib/focal_loss
  creating build/lib/apex/contrib/group_norm
  copying apex/contrib/group_norm/group_norm.py -> build/lib/apex/contrib/group_norm
  copying apex/contrib/group_norm/__init__.py -> build/lib/apex/contrib/group_norm
  creating build/lib/apex/contrib/test/xentropy
  copying apex/contrib/test/xentropy/test_label_smoothing.py -> build/lib/apex/contrib/test/xentropy
  copying apex/contrib/test/xentropy/__init__.py -> build/lib/apex/contrib/test/xentropy
  creating build/lib/apex/contrib/test/clip_grad
  copying apex/contrib/test/clip_grad/test_clip_grad.py -> build/lib/apex/contrib/test/clip_grad
  copying apex/contrib/test/clip_grad/__init__.py -> build/lib/apex/contrib/test/clip_grad
  creating build/lib/apex/contrib/test/transducer
  copying apex/contrib/test/transducer/test_transducer_loss.py -> build/lib/apex/contrib/test/transducer
  copying apex/contrib/test/transducer/__init__.py -> build/lib/apex/contrib/test/transducer
  copying apex/contrib/test/transducer/test_transducer_joint.py -> build/lib/apex/contrib/test/transducer
  creating build/lib/apex/contrib/test/cudnn_gbn
  copying apex/contrib/test/cudnn_gbn/__init__.py -> build/lib/apex/contrib/test/cudnn_gbn
  copying apex/contrib/test/cudnn_gbn/test_cudnn_gbn_with_two_gpus.py -> build/lib/apex/contrib/test/cudnn_gbn
  creating build/lib/apex/contrib/test/conv_bias_relu
  copying apex/contrib/test/conv_bias_relu/test_conv_bias_relu.py -> build/lib/apex/contrib/test/conv_bias_relu
  copying apex/contrib/test/conv_bias_relu/__init__.py -> build/lib/apex/contrib/test/conv_bias_relu
  creating build/lib/apex/contrib/test/fmha
  copying apex/contrib/test/fmha/test_fmha.py -> build/lib/apex/contrib/test/fmha
  copying apex/contrib/test/fmha/__init__.py -> build/lib/apex/contrib/test/fmha
  creating build/lib/apex/contrib/test/optimizers
  copying apex/contrib/test/optimizers/__init__.py -> build/lib/apex/contrib/test/optimizers
  copying apex/contrib/test/optimizers/test_dist_adam.py -> build/lib/apex/contrib/test/optimizers
  copying apex/contrib/test/optimizers/test_distributed_fused_lamb.py -> build/lib/apex/contrib/test/optimizers
  creating build/lib/apex/contrib/test/bottleneck
  copying apex/contrib/test/bottleneck/__init__.py -> build/lib/apex/contrib/test/bottleneck
  copying apex/contrib/test/bottleneck/test_bottleneck_module.py -> build/lib/apex/contrib/test/bottleneck
  creating build/lib/apex/contrib/test/index_mul_2d
  copying apex/contrib/test/index_mul_2d/__init__.py -> build/lib/apex/contrib/test/index_mul_2d
  copying apex/contrib/test/index_mul_2d/test_index_mul_2d.py -> build/lib/apex/contrib/test/index_mul_2d
  creating build/lib/apex/contrib/test/layer_norm
  copying apex/contrib/test/layer_norm/__init__.py -> build/lib/apex/contrib/test/layer_norm
  copying apex/contrib/test/layer_norm/test_fast_layer_norm.py -> build/lib/apex/contrib/test/layer_norm
  creating build/lib/apex/contrib/test/peer_memory
  copying apex/contrib/test/peer_memory/test_peer_halo_exchange_module.py -> build/lib/apex/contrib/test/peer_memory
  copying apex/contrib/test/peer_memory/__init__.py -> build/lib/apex/contrib/test/peer_memory
  creating build/lib/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_mha_fused_softmax.py -> build/lib/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_encdec_multihead_attn_norm_add.py -> build/lib/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_self_multihead_attn.py -> build/lib/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_self_multihead_attn_norm_add.py -> build/lib/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_encdec_multihead_attn.py -> build/lib/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/test_fast_self_multihead_attn_bias.py -> build/lib/apex/contrib/test/multihead_attn
  copying apex/contrib/test/multihead_attn/__init__.py -> build/lib/apex/contrib/test/multihead_attn
  creating build/lib/apex/contrib/test/focal_loss
  copying apex/contrib/test/focal_loss/test_focal_loss.py -> build/lib/apex/contrib/test/focal_loss
  copying apex/contrib/test/focal_loss/__init__.py -> build/lib/apex/contrib/test/focal_loss
  creating build/lib/apex/contrib/test/group_norm
  copying apex/contrib/test/group_norm/test_group_norm.py -> build/lib/apex/contrib/test/group_norm
  copying apex/contrib/test/group_norm/__init__.py -> build/lib/apex/contrib/test/group_norm
  creating build/lib/apex/contrib/sparsity/permutation_search_kernels
  copying apex/contrib/sparsity/permutation_search_kernels/call_permutation_search_kernels.py -> build/lib/apex/contrib/sparsity/permutation_search_kernels
  copying apex/contrib/sparsity/permutation_search_kernels/__init__.py -> build/lib/apex/contrib/sparsity/permutation_search_kernels
  copying apex/contrib/sparsity/permutation_search_kernels/permutation_utilities.py -> build/lib/apex/contrib/sparsity/permutation_search_kernels
  copying apex/contrib/sparsity/permutation_search_kernels/channel_swap.py -> build/lib/apex/contrib/sparsity/permutation_search_kernels
  copying apex/contrib/sparsity/permutation_search_kernels/exhaustive_search.py -> build/lib/apex/contrib/sparsity/permutation_search_kernels
  creating build/lib/apex/transformer/functional
  copying apex/transformer/functional/fused_softmax.py -> build/lib/apex/transformer/functional
  copying apex/transformer/functional/__init__.py -> build/lib/apex/transformer/functional
  creating build/lib/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/random.py -> build/lib/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/data.py -> build/lib/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/__init__.py -> build/lib/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/layers.py -> build/lib/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/mappings.py -> build/lib/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/utils.py -> build/lib/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/cross_entropy.py -> build/lib/apex/transformer/tensor_parallel
  copying apex/transformer/tensor_parallel/memory.py -> build/lib/apex/transformer/tensor_parallel
  creating build/lib/apex/transformer/amp
  copying apex/transformer/amp/__init__.py -> build/lib/apex/transformer/amp
  copying apex/transformer/amp/grad_scaler.py -> build/lib/apex/transformer/amp
  creating build/lib/apex/transformer/_data
  copying apex/transformer/_data/__init__.py -> build/lib/apex/transformer/_data
  copying apex/transformer/_data/_batchsampler.py -> build/lib/apex/transformer/_data
  creating build/lib/apex/transformer/pipeline_parallel
  copying apex/transformer/pipeline_parallel/_timers.py -> build/lib/apex/transformer/pipeline_parallel
  copying apex/transformer/pipeline_parallel/p2p_communication.py -> build/lib/apex/transformer/pipeline_parallel
  copying apex/transformer/pipeline_parallel/__init__.py -> build/lib/apex/transformer/pipeline_parallel
  copying apex/transformer/pipeline_parallel/utils.py -> build/lib/apex/transformer/pipeline_parallel
  creating build/lib/apex/transformer/testing
  copying apex/transformer/testing/commons.py -> build/lib/apex/transformer/testing
  copying apex/transformer/testing/standalone_gpt.py -> build/lib/apex/transformer/testing
  copying apex/transformer/testing/standalone_bert.py -> build/lib/apex/transformer/testing
  copying apex/transformer/testing/__init__.py -> build/lib/apex/transformer/testing
  copying apex/transformer/testing/standalone_transformer_lm.py -> build/lib/apex/transformer/testing
  copying apex/transformer/testing/distributed_test_base.py -> build/lib/apex/transformer/testing
  copying apex/transformer/testing/global_vars.py -> build/lib/apex/transformer/testing
  copying apex/transformer/testing/arguments.py -> build/lib/apex/transformer/testing
  creating build/lib/apex/transformer/layers
  copying apex/transformer/layers/__init__.py -> build/lib/apex/transformer/layers
  copying apex/transformer/layers/layer_norm.py -> build/lib/apex/transformer/layers
  creating build/lib/apex/transformer/pipeline_parallel/schedules
  copying apex/transformer/pipeline_parallel/schedules/__init__.py -> build/lib/apex/transformer/pipeline_parallel/schedules
  copying apex/transformer/pipeline_parallel/schedules/fwd_bwd_pipelining_with_interleaving.py -> build/lib/apex/transformer/pipeline_parallel/schedules
  copying apex/transformer/pipeline_parallel/schedules/fwd_bwd_pipelining_without_interleaving.py -> build/lib/apex/transformer/pipeline_parallel/schedules
  copying apex/transformer/pipeline_parallel/schedules/common.py -> build/lib/apex/transformer/pipeline_parallel/schedules
  copying apex/transformer/pipeline_parallel/schedules/fwd_bwd_no_pipelining.py -> build/lib/apex/transformer/pipeline_parallel/schedules
  installing to build/bdist.linux-x86_64/wheel
  running install
  running install_lib
  creating build/bdist.linux-x86_64
  creating build/bdist.linux-x86_64/wheel
  creating build/bdist.linux-x86_64/wheel/apex
  creating build/bdist.linux-x86_64/wheel/apex/normalization
  copying build/lib/apex/normalization/__init__.py -> build/bdist.linux-x86_64/wheel/apex/normalization
  copying build/lib/apex/normalization/fused_layer_norm.py -> build/bdist.linux-x86_64/wheel/apex/normalization
  creating build/bdist.linux-x86_64/wheel/apex/fp16_utils
  copying build/lib/apex/fp16_utils/loss_scaler.py -> build/bdist.linux-x86_64/wheel/apex/fp16_utils
  copying build/lib/apex/fp16_utils/fp16util.py -> build/bdist.linux-x86_64/wheel/apex/fp16_utils
  copying build/lib/apex/fp16_utils/__init__.py -> build/bdist.linux-x86_64/wheel/apex/fp16_utils
  copying build/lib/apex/fp16_utils/fp16_optimizer.py -> build/bdist.linux-x86_64/wheel/apex/fp16_utils
  creating build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/wrap.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/frontend.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/handle.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/__init__.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/_amp_state.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/opt.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/compat.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/__version__.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/utils.py -> build/bdist.linux-x86_64/wheel/apex/amp
  creating build/bdist.linux-x86_64/wheel/apex/amp/lists
  copying build/lib/apex/amp/lists/tensor_overrides.py -> build/bdist.linux-x86_64/wheel/apex/amp/lists
  copying build/lib/apex/amp/lists/torch_overrides.py -> build/bdist.linux-x86_64/wheel/apex/amp/lists
  copying build/lib/apex/amp/lists/functional_overrides.py -> build/bdist.linux-x86_64/wheel/apex/amp/lists
  copying build/lib/apex/amp/lists/__init__.py -> build/bdist.linux-x86_64/wheel/apex/amp/lists
  copying build/lib/apex/amp/_initialize.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/_process_optimizer.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/amp.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/rnn_compat.py -> build/bdist.linux-x86_64/wheel/apex/amp
  copying build/lib/apex/amp/scaler.py -> build/bdist.linux-x86_64/wheel/apex/amp
  creating build/bdist.linux-x86_64/wheel/apex/optimizers
  copying build/lib/apex/optimizers/fused_adagrad.py -> build/bdist.linux-x86_64/wheel/apex/optimizers
  copying build/lib/apex/optimizers/fused_lamb.py -> build/bdist.linux-x86_64/wheel/apex/optimizers
  copying build/lib/apex/optimizers/fused_sgd.py -> build/bdist.linux-x86_64/wheel/apex/optimizers
  copying build/lib/apex/optimizers/__init__.py -> build/bdist.linux-x86_64/wheel/apex/optimizers
  copying build/lib/apex/optimizers/fused_adam.py -> build/bdist.linux-x86_64/wheel/apex/optimizers
  copying build/lib/apex/optimizers/fused_mixed_precision_lamb.py -> build/bdist.linux-x86_64/wheel/apex/optimizers
  copying build/lib/apex/optimizers/fused_novograd.py -> build/bdist.linux-x86_64/wheel/apex/optimizers
  copying build/lib/apex/__init__.py -> build/bdist.linux-x86_64/wheel/apex
  creating build/bdist.linux-x86_64/wheel/apex/parallel
  copying build/lib/apex/parallel/optimized_sync_batchnorm.py -> build/bdist.linux-x86_64/wheel/apex/parallel
  copying build/lib/apex/parallel/sync_batchnorm.py -> build/bdist.linux-x86_64/wheel/apex/parallel
  copying build/lib/apex/parallel/distributed.py -> build/bdist.linux-x86_64/wheel/apex/parallel
  copying build/lib/apex/parallel/__init__.py -> build/bdist.linux-x86_64/wheel/apex/parallel
  copying build/lib/apex/parallel/sync_batchnorm_kernel.py -> build/bdist.linux-x86_64/wheel/apex/parallel
  copying build/lib/apex/parallel/optimized_sync_batchnorm_kernel.py -> build/bdist.linux-x86_64/wheel/apex/parallel
  copying build/lib/apex/parallel/multiproc.py -> build/bdist.linux-x86_64/wheel/apex/parallel
  copying build/lib/apex/parallel/LARC.py -> build/bdist.linux-x86_64/wheel/apex/parallel
  creating build/bdist.linux-x86_64/wheel/apex/multi_tensor_apply
  copying build/lib/apex/multi_tensor_apply/__init__.py -> build/bdist.linux-x86_64/wheel/apex/multi_tensor_apply
  copying build/lib/apex/multi_tensor_apply/multi_tensor_apply.py -> build/bdist.linux-x86_64/wheel/apex/multi_tensor_apply
  creating build/bdist.linux-x86_64/wheel/apex/fused_dense
  copying build/lib/apex/fused_dense/__init__.py -> build/bdist.linux-x86_64/wheel/apex/fused_dense
  copying build/lib/apex/fused_dense/fused_dense.py -> build/bdist.linux-x86_64/wheel/apex/fused_dense
  creating build/bdist.linux-x86_64/wheel/apex/mlp
  copying build/lib/apex/mlp/mlp.py -> build/bdist.linux-x86_64/wheel/apex/mlp
  copying build/lib/apex/mlp/__init__.py -> build/bdist.linux-x86_64/wheel/apex/mlp
  creating build/bdist.linux-x86_64/wheel/apex/contrib
  creating build/bdist.linux-x86_64/wheel/apex/contrib/xentropy
  copying build/lib/apex/contrib/xentropy/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/xentropy
  copying build/lib/apex/contrib/xentropy/softmax_xentropy.py -> build/bdist.linux-x86_64/wheel/apex/contrib/xentropy
  creating build/bdist.linux-x86_64/wheel/apex/contrib/clip_grad
  copying build/lib/apex/contrib/clip_grad/clip_grad.py -> build/bdist.linux-x86_64/wheel/apex/contrib/clip_grad
  copying build/lib/apex/contrib/clip_grad/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/clip_grad
  creating build/bdist.linux-x86_64/wheel/apex/contrib/transducer
  copying build/lib/apex/contrib/transducer/_transducer_ref.py -> build/bdist.linux-x86_64/wheel/apex/contrib/transducer
  copying build/lib/apex/contrib/transducer/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/transducer
  copying build/lib/apex/contrib/transducer/transducer.py -> build/bdist.linux-x86_64/wheel/apex/contrib/transducer
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/xentropy
  copying build/lib/apex/contrib/test/xentropy/test_label_smoothing.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/xentropy
  copying build/lib/apex/contrib/test/xentropy/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/xentropy
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/clip_grad
  copying build/lib/apex/contrib/test/clip_grad/test_clip_grad.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/clip_grad
  copying build/lib/apex/contrib/test/clip_grad/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/clip_grad
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/transducer
  copying build/lib/apex/contrib/test/transducer/test_transducer_loss.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/transducer
  copying build/lib/apex/contrib/test/transducer/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/transducer
  copying build/lib/apex/contrib/test/transducer/test_transducer_joint.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/transducer
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/cudnn_gbn
  copying build/lib/apex/contrib/test/cudnn_gbn/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/cudnn_gbn
  copying build/lib/apex/contrib/test/cudnn_gbn/test_cudnn_gbn_with_two_gpus.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/cudnn_gbn
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/conv_bias_relu
  copying build/lib/apex/contrib/test/conv_bias_relu/test_conv_bias_relu.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/conv_bias_relu
  copying build/lib/apex/contrib/test/conv_bias_relu/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/conv_bias_relu
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/fmha
  copying build/lib/apex/contrib/test/fmha/test_fmha.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/fmha
  copying build/lib/apex/contrib/test/fmha/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/fmha
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/optimizers
  copying build/lib/apex/contrib/test/optimizers/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/optimizers
  copying build/lib/apex/contrib/test/optimizers/test_dist_adam.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/optimizers
  copying build/lib/apex/contrib/test/optimizers/test_distributed_fused_lamb.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/optimizers
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/bottleneck
  copying build/lib/apex/contrib/test/bottleneck/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/bottleneck
  copying build/lib/apex/contrib/test/bottleneck/test_bottleneck_module.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/bottleneck
  copying build/lib/apex/contrib/test/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/index_mul_2d
  copying build/lib/apex/contrib/test/index_mul_2d/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/index_mul_2d
  copying build/lib/apex/contrib/test/index_mul_2d/test_index_mul_2d.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/index_mul_2d
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/layer_norm
  copying build/lib/apex/contrib/test/layer_norm/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/layer_norm
  copying build/lib/apex/contrib/test/layer_norm/test_fast_layer_norm.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/layer_norm
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/peer_memory
  copying build/lib/apex/contrib/test/peer_memory/test_peer_halo_exchange_module.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/peer_memory
  copying build/lib/apex/contrib/test/peer_memory/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/peer_memory
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/multihead_attn
  copying build/lib/apex/contrib/test/multihead_attn/test_mha_fused_softmax.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/multihead_attn
  copying build/lib/apex/contrib/test/multihead_attn/test_encdec_multihead_attn_norm_add.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/multihead_attn
  copying build/lib/apex/contrib/test/multihead_attn/test_self_multihead_attn.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/multihead_attn
  copying build/lib/apex/contrib/test/multihead_attn/test_self_multihead_attn_norm_add.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/multihead_attn
  copying build/lib/apex/contrib/test/multihead_attn/test_encdec_multihead_attn.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/multihead_attn
  copying build/lib/apex/contrib/test/multihead_attn/test_fast_self_multihead_attn_bias.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/multihead_attn
  copying build/lib/apex/contrib/test/multihead_attn/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/multihead_attn
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/focal_loss
  copying build/lib/apex/contrib/test/focal_loss/test_focal_loss.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/focal_loss
  copying build/lib/apex/contrib/test/focal_loss/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/focal_loss
  creating build/bdist.linux-x86_64/wheel/apex/contrib/test/group_norm
  copying build/lib/apex/contrib/test/group_norm/test_group_norm.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/group_norm
  copying build/lib/apex/contrib/test/group_norm/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/test/group_norm
  creating build/bdist.linux-x86_64/wheel/apex/contrib/cudnn_gbn
  copying build/lib/apex/contrib/cudnn_gbn/batch_norm.py -> build/bdist.linux-x86_64/wheel/apex/contrib/cudnn_gbn
  copying build/lib/apex/contrib/cudnn_gbn/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/cudnn_gbn
  creating build/bdist.linux-x86_64/wheel/apex/contrib/conv_bias_relu
  copying build/lib/apex/contrib/conv_bias_relu/conv_bias_relu.py -> build/bdist.linux-x86_64/wheel/apex/contrib/conv_bias_relu
  copying build/lib/apex/contrib/conv_bias_relu/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/conv_bias_relu
  creating build/bdist.linux-x86_64/wheel/apex/contrib/fmha
  copying build/lib/apex/contrib/fmha/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/fmha
  copying build/lib/apex/contrib/fmha/fmha.py -> build/bdist.linux-x86_64/wheel/apex/contrib/fmha
  creating build/bdist.linux-x86_64/wheel/apex/contrib/optimizers
  copying build/lib/apex/contrib/optimizers/fused_lamb.py -> build/bdist.linux-x86_64/wheel/apex/contrib/optimizers
  copying build/lib/apex/contrib/optimizers/fused_sgd.py -> build/bdist.linux-x86_64/wheel/apex/contrib/optimizers
  copying build/lib/apex/contrib/optimizers/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/optimizers
  copying build/lib/apex/contrib/optimizers/fused_adam.py -> build/bdist.linux-x86_64/wheel/apex/contrib/optimizers
  copying build/lib/apex/contrib/optimizers/fp16_optimizer.py -> build/bdist.linux-x86_64/wheel/apex/contrib/optimizers
  copying build/lib/apex/contrib/optimizers/distributed_fused_lamb.py -> build/bdist.linux-x86_64/wheel/apex/contrib/optimizers
  copying build/lib/apex/contrib/optimizers/distributed_fused_adam.py -> build/bdist.linux-x86_64/wheel/apex/contrib/optimizers
  creating build/bdist.linux-x86_64/wheel/apex/contrib/bottleneck
  copying build/lib/apex/contrib/bottleneck/bottleneck.py -> build/bdist.linux-x86_64/wheel/apex/contrib/bottleneck
  copying build/lib/apex/contrib/bottleneck/test.py -> build/bdist.linux-x86_64/wheel/apex/contrib/bottleneck
  copying build/lib/apex/contrib/bottleneck/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/bottleneck
  copying build/lib/apex/contrib/bottleneck/halo_exchangers.py -> build/bdist.linux-x86_64/wheel/apex/contrib/bottleneck
  copying build/lib/apex/contrib/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib
  creating build/bdist.linux-x86_64/wheel/apex/contrib/index_mul_2d
  copying build/lib/apex/contrib/index_mul_2d/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/index_mul_2d
  copying build/lib/apex/contrib/index_mul_2d/index_mul_2d.py -> build/bdist.linux-x86_64/wheel/apex/contrib/index_mul_2d
  creating build/bdist.linux-x86_64/wheel/apex/contrib/sparsity
  copying build/lib/apex/contrib/sparsity/sparse_masklib.py -> build/bdist.linux-x86_64/wheel/apex/contrib/sparsity
  copying build/lib/apex/contrib/sparsity/permutation_lib.py -> build/bdist.linux-x86_64/wheel/apex/contrib/sparsity
  creating build/bdist.linux-x86_64/wheel/apex/contrib/sparsity/permutation_search_kernels
  copying build/lib/apex/contrib/sparsity/permutation_search_kernels/call_permutation_search_kernels.py -> build/bdist.linux-x86_64/wheel/apex/contrib/sparsity/permutation_search_kernels
  copying build/lib/apex/contrib/sparsity/permutation_search_kernels/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/sparsity/permutation_search_kernels
  copying build/lib/apex/contrib/sparsity/permutation_search_kernels/permutation_utilities.py -> build/bdist.linux-x86_64/wheel/apex/contrib/sparsity/permutation_search_kernels
  copying build/lib/apex/contrib/sparsity/permutation_search_kernels/channel_swap.py -> build/bdist.linux-x86_64/wheel/apex/contrib/sparsity/permutation_search_kernels
  copying build/lib/apex/contrib/sparsity/permutation_search_kernels/exhaustive_search.py -> build/bdist.linux-x86_64/wheel/apex/contrib/sparsity/permutation_search_kernels
  copying build/lib/apex/contrib/sparsity/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/sparsity
  copying build/lib/apex/contrib/sparsity/asp.py -> build/bdist.linux-x86_64/wheel/apex/contrib/sparsity
  creating build/bdist.linux-x86_64/wheel/apex/contrib/groupbn
  copying build/lib/apex/contrib/groupbn/batch_norm.py -> build/bdist.linux-x86_64/wheel/apex/contrib/groupbn
  copying build/lib/apex/contrib/groupbn/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/groupbn
  creating build/bdist.linux-x86_64/wheel/apex/contrib/layer_norm
  copying build/lib/apex/contrib/layer_norm/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/layer_norm
  copying build/lib/apex/contrib/layer_norm/layer_norm.py -> build/bdist.linux-x86_64/wheel/apex/contrib/layer_norm
  creating build/bdist.linux-x86_64/wheel/apex/contrib/peer_memory
  copying build/lib/apex/contrib/peer_memory/peer_halo_exchanger_1d.py -> build/bdist.linux-x86_64/wheel/apex/contrib/peer_memory
  copying build/lib/apex/contrib/peer_memory/peer_memory.py -> build/bdist.linux-x86_64/wheel/apex/contrib/peer_memory
  copying build/lib/apex/contrib/peer_memory/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/peer_memory
  creating build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  copying build/lib/apex/contrib/multihead_attn/self_multihead_attn_func.py -> build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  copying build/lib/apex/contrib/multihead_attn/fast_self_multihead_attn_norm_add_func.py -> build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  copying build/lib/apex/contrib/multihead_attn/fast_encdec_multihead_attn_norm_add_func.py -> build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  copying build/lib/apex/contrib/multihead_attn/encdec_multihead_attn_func.py -> build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  copying build/lib/apex/contrib/multihead_attn/fast_self_multihead_attn_func.py -> build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  copying build/lib/apex/contrib/multihead_attn/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  copying build/lib/apex/contrib/multihead_attn/mask_softmax_dropout_func.py -> build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  copying build/lib/apex/contrib/multihead_attn/fast_encdec_multihead_attn_func.py -> build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  copying build/lib/apex/contrib/multihead_attn/encdec_multihead_attn.py -> build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  copying build/lib/apex/contrib/multihead_attn/self_multihead_attn.py -> build/bdist.linux-x86_64/wheel/apex/contrib/multihead_attn
  creating build/bdist.linux-x86_64/wheel/apex/contrib/focal_loss
  copying build/lib/apex/contrib/focal_loss/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/focal_loss
  copying build/lib/apex/contrib/focal_loss/focal_loss.py -> build/bdist.linux-x86_64/wheel/apex/contrib/focal_loss
  creating build/bdist.linux-x86_64/wheel/apex/contrib/group_norm
  copying build/lib/apex/contrib/group_norm/group_norm.py -> build/bdist.linux-x86_64/wheel/apex/contrib/group_norm
  copying build/lib/apex/contrib/group_norm/__init__.py -> build/bdist.linux-x86_64/wheel/apex/contrib/group_norm
  creating build/bdist.linux-x86_64/wheel/apex/RNN
  copying build/lib/apex/RNN/models.py -> build/bdist.linux-x86_64/wheel/apex/RNN
  copying build/lib/apex/RNN/RNNBackend.py -> build/bdist.linux-x86_64/wheel/apex/RNN
  copying build/lib/apex/RNN/__init__.py -> build/bdist.linux-x86_64/wheel/apex/RNN
  copying build/lib/apex/RNN/cells.py -> build/bdist.linux-x86_64/wheel/apex/RNN
  creating build/bdist.linux-x86_64/wheel/apex/transformer
  creating build/bdist.linux-x86_64/wheel/apex/transformer/functional
  copying build/lib/apex/transformer/functional/fused_softmax.py -> build/bdist.linux-x86_64/wheel/apex/transformer/functional
  copying build/lib/apex/transformer/functional/__init__.py -> build/bdist.linux-x86_64/wheel/apex/transformer/functional
  copying build/lib/apex/transformer/microbatches.py -> build/bdist.linux-x86_64/wheel/apex/transformer
  creating build/bdist.linux-x86_64/wheel/apex/transformer/tensor_parallel
  copying build/lib/apex/transformer/tensor_parallel/random.py -> build/bdist.linux-x86_64/wheel/apex/transformer/tensor_parallel
  copying build/lib/apex/transformer/tensor_parallel/data.py -> build/bdist.linux-x86_64/wheel/apex/transformer/tensor_parallel
  copying build/lib/apex/transformer/tensor_parallel/__init__.py -> build/bdist.linux-x86_64/wheel/apex/transformer/tensor_parallel
  copying build/lib/apex/transformer/tensor_parallel/layers.py -> build/bdist.linux-x86_64/wheel/apex/transformer/tensor_parallel
  copying build/lib/apex/transformer/tensor_parallel/mappings.py -> build/bdist.linux-x86_64/wheel/apex/transformer/tensor_parallel
  copying build/lib/apex/transformer/tensor_parallel/utils.py -> build/bdist.linux-x86_64/wheel/apex/transformer/tensor_parallel
  copying build/lib/apex/transformer/tensor_parallel/cross_entropy.py -> build/bdist.linux-x86_64/wheel/apex/transformer/tensor_parallel
  copying build/lib/apex/transformer/tensor_parallel/memory.py -> build/bdist.linux-x86_64/wheel/apex/transformer/tensor_parallel
  creating build/bdist.linux-x86_64/wheel/apex/transformer/amp
  copying build/lib/apex/transformer/amp/__init__.py -> build/bdist.linux-x86_64/wheel/apex/transformer/amp
  copying build/lib/apex/transformer/amp/grad_scaler.py -> build/bdist.linux-x86_64/wheel/apex/transformer/amp
  copying build/lib/apex/transformer/log_util.py -> build/bdist.linux-x86_64/wheel/apex/transformer
  copying build/lib/apex/transformer/__init__.py -> build/bdist.linux-x86_64/wheel/apex/transformer
  copying build/lib/apex/transformer/parallel_state.py -> build/bdist.linux-x86_64/wheel/apex/transformer
  copying build/lib/apex/transformer/utils.py -> build/bdist.linux-x86_64/wheel/apex/transformer
  creating build/bdist.linux-x86_64/wheel/apex/transformer/_data
  copying build/lib/apex/transformer/_data/__init__.py -> build/bdist.linux-x86_64/wheel/apex/transformer/_data
  copying build/lib/apex/transformer/_data/_batchsampler.py -> build/bdist.linux-x86_64/wheel/apex/transformer/_data
  creating build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel
  copying build/lib/apex/transformer/pipeline_parallel/_timers.py -> build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel
  copying build/lib/apex/transformer/pipeline_parallel/p2p_communication.py -> build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel
  copying build/lib/apex/transformer/pipeline_parallel/__init__.py -> build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel
  copying build/lib/apex/transformer/pipeline_parallel/utils.py -> build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel
  creating build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel/schedules
  copying build/lib/apex/transformer/pipeline_parallel/schedules/__init__.py -> build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel/schedules
  copying build/lib/apex/transformer/pipeline_parallel/schedules/fwd_bwd_pipelining_with_interleaving.py -> build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel/schedules
  copying build/lib/apex/transformer/pipeline_parallel/schedules/fwd_bwd_pipelining_without_interleaving.py -> build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel/schedules
  copying build/lib/apex/transformer/pipeline_parallel/schedules/common.py -> build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel/schedules
  copying build/lib/apex/transformer/pipeline_parallel/schedules/fwd_bwd_no_pipelining.py -> build/bdist.linux-x86_64/wheel/apex/transformer/pipeline_parallel/schedules
  copying build/lib/apex/transformer/_ucc_util.py -> build/bdist.linux-x86_64/wheel/apex/transformer
  creating build/bdist.linux-x86_64/wheel/apex/transformer/testing
  copying build/lib/apex/transformer/testing/commons.py -> build/bdist.linux-x86_64/wheel/apex/transformer/testing
  copying build/lib/apex/transformer/testing/standalone_gpt.py -> build/bdist.linux-x86_64/wheel/apex/transformer/testing
  copying build/lib/apex/transformer/testing/standalone_bert.py -> build/bdist.linux-x86_64/wheel/apex/transformer/testing
  copying build/lib/apex/transformer/testing/__init__.py -> build/bdist.linux-x86_64/wheel/apex/transformer/testing
  copying build/lib/apex/transformer/testing/standalone_transformer_lm.py -> build/bdist.linux-x86_64/wheel/apex/transformer/testing
  copying build/lib/apex/transformer/testing/distributed_test_base.py -> build/bdist.linux-x86_64/wheel/apex/transformer/testing
  copying build/lib/apex/transformer/testing/global_vars.py -> build/bdist.linux-x86_64/wheel/apex/transformer/testing
  copying build/lib/apex/transformer/testing/arguments.py -> build/bdist.linux-x86_64/wheel/apex/transformer/testing
  creating build/bdist.linux-x86_64/wheel/apex/transformer/layers
  copying build/lib/apex/transformer/layers/__init__.py -> build/bdist.linux-x86_64/wheel/apex/transformer/layers
  copying build/lib/apex/transformer/layers/layer_norm.py -> build/bdist.linux-x86_64/wheel/apex/transformer/layers
  copying build/lib/apex/transformer/enums.py -> build/bdist.linux-x86_64/wheel/apex/transformer
  copying build/lib/apex/_autocast_utils.py -> build/bdist.linux-x86_64/wheel/apex
  running install_egg_info
  running egg_info
  writing apex.egg-info/PKG-INFO
  writing dependency_links to apex.egg-info/dependency_links.txt
  writing requirements to apex.egg-info/requires.txt
  writing top-level names to apex.egg-info/top_level.txt
  reading manifest file 'apex.egg-info/SOURCES.txt'
  writing manifest file 'apex.egg-info/SOURCES.txt'
  Copying apex.egg-info to build/bdist.linux-x86_64/wheel/apex-0.1.egg-info
  running install_scripts
  adding license file "LICENSE" (matched pattern "LICEN[CS]E*")
  creating build/bdist.linux-x86_64/wheel/apex-0.1.dist-info/WHEEL
  creating '/tmp/pip-wheel-z6ommft2/tmpu15x04qw/apex-0.1-py3-none-any.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
  adding 'apex/__init__.py'
  adding 'apex/_autocast_utils.py'
  adding 'apex/RNN/RNNBackend.py'
  adding 'apex/RNN/__init__.py'
  adding 'apex/RNN/cells.py'
  adding 'apex/RNN/models.py'
  adding 'apex/amp/__init__.py'
  adding 'apex/amp/__version__.py'
  adding 'apex/amp/_amp_state.py'
  adding 'apex/amp/_initialize.py'
  adding 'apex/amp/_process_optimizer.py'
  adding 'apex/amp/amp.py'
  adding 'apex/amp/compat.py'
  adding 'apex/amp/frontend.py'
  adding 'apex/amp/handle.py'
  adding 'apex/amp/opt.py'
  adding 'apex/amp/rnn_compat.py'
  adding 'apex/amp/scaler.py'
  adding 'apex/amp/utils.py'
  adding 'apex/amp/wrap.py'
  adding 'apex/amp/lists/__init__.py'
  adding 'apex/amp/lists/functional_overrides.py'
  adding 'apex/amp/lists/tensor_overrides.py'
  adding 'apex/amp/lists/torch_overrides.py'
  adding 'apex/contrib/__init__.py'
  adding 'apex/contrib/bottleneck/__init__.py'
  adding 'apex/contrib/bottleneck/bottleneck.py'
  adding 'apex/contrib/bottleneck/halo_exchangers.py'
  adding 'apex/contrib/bottleneck/test.py'
  adding 'apex/contrib/clip_grad/__init__.py'
  adding 'apex/contrib/clip_grad/clip_grad.py'
  adding 'apex/contrib/conv_bias_relu/__init__.py'
  adding 'apex/contrib/conv_bias_relu/conv_bias_relu.py'
  adding 'apex/contrib/cudnn_gbn/__init__.py'
  adding 'apex/contrib/cudnn_gbn/batch_norm.py'
  adding 'apex/contrib/fmha/__init__.py'
  adding 'apex/contrib/fmha/fmha.py'
  adding 'apex/contrib/focal_loss/__init__.py'
  adding 'apex/contrib/focal_loss/focal_loss.py'
  adding 'apex/contrib/group_norm/__init__.py'
  adding 'apex/contrib/group_norm/group_norm.py'
  adding 'apex/contrib/groupbn/__init__.py'
  adding 'apex/contrib/groupbn/batch_norm.py'
  adding 'apex/contrib/index_mul_2d/__init__.py'
  adding 'apex/contrib/index_mul_2d/index_mul_2d.py'
  adding 'apex/contrib/layer_norm/__init__.py'
  adding 'apex/contrib/layer_norm/layer_norm.py'
  adding 'apex/contrib/multihead_attn/__init__.py'
  adding 'apex/contrib/multihead_attn/encdec_multihead_attn.py'
  adding 'apex/contrib/multihead_attn/encdec_multihead_attn_func.py'
  adding 'apex/contrib/multihead_attn/fast_encdec_multihead_attn_func.py'
  adding 'apex/contrib/multihead_attn/fast_encdec_multihead_attn_norm_add_func.py'
  adding 'apex/contrib/multihead_attn/fast_self_multihead_attn_func.py'
  adding 'apex/contrib/multihead_attn/fast_self_multihead_attn_norm_add_func.py'
  adding 'apex/contrib/multihead_attn/mask_softmax_dropout_func.py'
  adding 'apex/contrib/multihead_attn/self_multihead_attn.py'
  adding 'apex/contrib/multihead_attn/self_multihead_attn_func.py'
  adding 'apex/contrib/optimizers/__init__.py'
  adding 'apex/contrib/optimizers/distributed_fused_adam.py'
  adding 'apex/contrib/optimizers/distributed_fused_lamb.py'
  adding 'apex/contrib/optimizers/fp16_optimizer.py'
  adding 'apex/contrib/optimizers/fused_adam.py'
  adding 'apex/contrib/optimizers/fused_lamb.py'
  adding 'apex/contrib/optimizers/fused_sgd.py'
  adding 'apex/contrib/peer_memory/__init__.py'
  adding 'apex/contrib/peer_memory/peer_halo_exchanger_1d.py'
  adding 'apex/contrib/peer_memory/peer_memory.py'
  adding 'apex/contrib/sparsity/__init__.py'
  adding 'apex/contrib/sparsity/asp.py'
  adding 'apex/contrib/sparsity/permutation_lib.py'
  adding 'apex/contrib/sparsity/sparse_masklib.py'
  adding 'apex/contrib/sparsity/permutation_search_kernels/__init__.py'
  adding 'apex/contrib/sparsity/permutation_search_kernels/call_permutation_search_kernels.py'
  adding 'apex/contrib/sparsity/permutation_search_kernels/channel_swap.py'
  adding 'apex/contrib/sparsity/permutation_search_kernels/exhaustive_search.py'
  adding 'apex/contrib/sparsity/permutation_search_kernels/permutation_utilities.py'
  adding 'apex/contrib/test/__init__.py'
  adding 'apex/contrib/test/bottleneck/__init__.py'
  adding 'apex/contrib/test/bottleneck/test_bottleneck_module.py'
  adding 'apex/contrib/test/clip_grad/__init__.py'
  adding 'apex/contrib/test/clip_grad/test_clip_grad.py'
  adding 'apex/contrib/test/conv_bias_relu/__init__.py'
  adding 'apex/contrib/test/conv_bias_relu/test_conv_bias_relu.py'
  adding 'apex/contrib/test/cudnn_gbn/__init__.py'
  adding 'apex/contrib/test/cudnn_gbn/test_cudnn_gbn_with_two_gpus.py'
  adding 'apex/contrib/test/fmha/__init__.py'
  adding 'apex/contrib/test/fmha/test_fmha.py'
  adding 'apex/contrib/test/focal_loss/__init__.py'
  adding 'apex/contrib/test/focal_loss/test_focal_loss.py'
  adding 'apex/contrib/test/group_norm/__init__.py'
  adding 'apex/contrib/test/group_norm/test_group_norm.py'
  adding 'apex/contrib/test/index_mul_2d/__init__.py'
  adding 'apex/contrib/test/index_mul_2d/test_index_mul_2d.py'
  adding 'apex/contrib/test/layer_norm/__init__.py'
  adding 'apex/contrib/test/layer_norm/test_fast_layer_norm.py'
  adding 'apex/contrib/test/multihead_attn/__init__.py'
  adding 'apex/contrib/test/multihead_attn/test_encdec_multihead_attn.py'
  adding 'apex/contrib/test/multihead_attn/test_encdec_multihead_attn_norm_add.py'
  adding 'apex/contrib/test/multihead_attn/test_fast_self_multihead_attn_bias.py'
  adding 'apex/contrib/test/multihead_attn/test_mha_fused_softmax.py'
  adding 'apex/contrib/test/multihead_attn/test_self_multihead_attn.py'
  adding 'apex/contrib/test/multihead_attn/test_self_multihead_attn_norm_add.py'
  adding 'apex/contrib/test/optimizers/__init__.py'
  adding 'apex/contrib/test/optimizers/test_dist_adam.py'
  adding 'apex/contrib/test/optimizers/test_distributed_fused_lamb.py'
  adding 'apex/contrib/test/peer_memory/__init__.py'
  adding 'apex/contrib/test/peer_memory/test_peer_halo_exchange_module.py'
  adding 'apex/contrib/test/transducer/__init__.py'
  adding 'apex/contrib/test/transducer/test_transducer_joint.py'
  adding 'apex/contrib/test/transducer/test_transducer_loss.py'
  adding 'apex/contrib/test/xentropy/__init__.py'
  adding 'apex/contrib/test/xentropy/test_label_smoothing.py'
  adding 'apex/contrib/transducer/__init__.py'
  adding 'apex/contrib/transducer/_transducer_ref.py'
  adding 'apex/contrib/transducer/transducer.py'
  adding 'apex/contrib/xentropy/__init__.py'
  adding 'apex/contrib/xentropy/softmax_xentropy.py'
  adding 'apex/fp16_utils/__init__.py'
  adding 'apex/fp16_utils/fp16_optimizer.py'
  adding 'apex/fp16_utils/fp16util.py'
  adding 'apex/fp16_utils/loss_scaler.py'
  adding 'apex/fused_dense/__init__.py'
  adding 'apex/fused_dense/fused_dense.py'
  adding 'apex/mlp/__init__.py'
  adding 'apex/mlp/mlp.py'
  adding 'apex/multi_tensor_apply/__init__.py'
  adding 'apex/multi_tensor_apply/multi_tensor_apply.py'
  adding 'apex/normalization/__init__.py'
  adding 'apex/normalization/fused_layer_norm.py'
  adding 'apex/optimizers/__init__.py'
  adding 'apex/optimizers/fused_adagrad.py'
  adding 'apex/optimizers/fused_adam.py'
  adding 'apex/optimizers/fused_lamb.py'
  adding 'apex/optimizers/fused_mixed_precision_lamb.py'
  adding 'apex/optimizers/fused_novograd.py'
  adding 'apex/optimizers/fused_sgd.py'
  adding 'apex/parallel/LARC.py'
  adding 'apex/parallel/__init__.py'
  adding 'apex/parallel/distributed.py'
  adding 'apex/parallel/multiproc.py'
  adding 'apex/parallel/optimized_sync_batchnorm.py'
  adding 'apex/parallel/optimized_sync_batchnorm_kernel.py'
  adding 'apex/parallel/sync_batchnorm.py'
  adding 'apex/parallel/sync_batchnorm_kernel.py'
  adding 'apex/transformer/__init__.py'
  adding 'apex/transformer/_ucc_util.py'
  adding 'apex/transformer/enums.py'
  adding 'apex/transformer/log_util.py'
  adding 'apex/transformer/microbatches.py'
  adding 'apex/transformer/parallel_state.py'
  adding 'apex/transformer/utils.py'
  adding 'apex/transformer/_data/__init__.py'
  adding 'apex/transformer/_data/_batchsampler.py'
  adding 'apex/transformer/amp/__init__.py'
  adding 'apex/transformer/amp/grad_scaler.py'
  adding 'apex/transformer/functional/__init__.py'
  adding 'apex/transformer/functional/fused_softmax.py'
  adding 'apex/transformer/layers/__init__.py'
  adding 'apex/transformer/layers/layer_norm.py'
  adding 'apex/transformer/pipeline_parallel/__init__.py'
  adding 'apex/transformer/pipeline_parallel/_timers.py'
  adding 'apex/transformer/pipeline_parallel/p2p_communication.py'
  adding 'apex/transformer/pipeline_parallel/utils.py'
  adding 'apex/transformer/pipeline_parallel/schedules/__init__.py'
  adding 'apex/transformer/pipeline_parallel/schedules/common.py'
  adding 'apex/transformer/pipeline_parallel/schedules/fwd_bwd_no_pipelining.py'
  adding 'apex/transformer/pipeline_parallel/schedules/fwd_bwd_pipelining_with_interleaving.py'
  adding 'apex/transformer/pipeline_parallel/schedules/fwd_bwd_pipelining_without_interleaving.py'
  adding 'apex/transformer/tensor_parallel/__init__.py'
  adding 'apex/transformer/tensor_parallel/cross_entropy.py'
  adding 'apex/transformer/tensor_parallel/data.py'
  adding 'apex/transformer/tensor_parallel/layers.py'
  adding 'apex/transformer/tensor_parallel/mappings.py'
  adding 'apex/transformer/tensor_parallel/memory.py'
  adding 'apex/transformer/tensor_parallel/random.py'
  adding 'apex/transformer/tensor_parallel/utils.py'
  adding 'apex/transformer/testing/__init__.py'
  adding 'apex/transformer/testing/arguments.py'
  adding 'apex/transformer/testing/commons.py'
  adding 'apex/transformer/testing/distributed_test_base.py'
  adding 'apex/transformer/testing/global_vars.py'
  adding 'apex/transformer/testing/standalone_bert.py'
  adding 'apex/transformer/testing/standalone_gpt.py'
  adding 'apex/transformer/testing/standalone_transformer_lm.py'
  adding 'apex-0.1.dist-info/LICENSE'
  adding 'apex-0.1.dist-info/METADATA'
  adding 'apex-0.1.dist-info/WHEEL'
  adding 'apex-0.1.dist-info/top_level.txt'
  adding 'apex-0.1.dist-info/RECORD'
  removing build/bdist.linux-x86_64/wheel
  Building wheel for apex (pyproject.toml) ... done
  Created wheel for apex: filename=apex-0.1-py3-none-any.whl size=374658 sha256=eba06564d62b42c38a139eba7b4024dd8f60d3adad937a177c55edf16e2992e0
  Stored in directory: /tmp/pip-ephem-wheel-cache-46vqsl6e/wheels/10/e6/55/d4c7b107f5340367a167d8c9d527b96aa577be102624338257
Successfully built apex
Installing collected packages: apex
  Attempting uninstall: apex
    Found existing installation: apex 0.1
    Can't uninstall 'apex'. No files were found to uninstall.
Successfully installed apex-0.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
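
A note on reading the log above: the wheel is named apex-0.1-py3-none-any.whl. The `none` ABI tag and `any` platform tag are what a pure-Python wheel carries, whereas a wheel that contains compiled extension modules normally has an interpreter-specific ABI tag and a platform tag. A minimal sketch of that check follows. The helper names and the second filename are hypothetical illustrations and do not come from the log.

```python
def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    # Layout is name-version[-build]-python-abi-platform, so the
    # compatibility tags are always the last three dash-separated fields.
    return tuple(stem.split("-")[-3:])


def is_pure_python_wheel(filename):
    """True when the ABI tag is 'none' and the platform tag is 'any'."""
    _python_tag, abi_tag, platform_tag = wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


# Filename taken from the log above.
print(is_pure_python_wheel("apex-0.1-py3-none-any.whl"))  # True
# Hypothetical filename for a wheel that does contain compiled modules.
print(is_pure_python_wheel("apex-0.1-cp38-cp38-linux_x86_64.whl"))  # False
```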

Environment

Inside the NGC Docker container nvidia/cuda:11.7.1-devel-ubuntu20.04.

root@ebd430064a0b:/tmp/pip-req-build-brdy43s6# python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 1.13.0+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: Could not collect
CMake version: version 3.28.1
Libc version: glibc-2.31

Python version: 3.8.10 (default, Nov 22 2023, 10:22:35)  [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-5.15.0-1050-aws-x86_64-with-glibc2.29
Is CUDA available: True
CUDA runtime version: 11.7.99
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: Tesla T4
Nvidia driver version: 535.104.12
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.24.4
[pip3] torch==1.13.0+cu117
[pip3] torchaudio==0.13.0+cu117
[pip3] torchvision==0.14.0+cu117
[conda] Could not collect

ce107 commented 10 months ago

I am having the same issue, and others are seeing it here as well: https://github.com/NVIDIA/Megatron-LM/issues/147

OsaydAbdu commented 10 months ago

This solved the problem for me:

pip install --upgrade setuptools

Apparently, you need setuptools>=49.4.0 for this to work.
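
To see whether an upgrade is needed at all, the installed setuptools version can be compared against that minimum first. This is only a rough sketch: the 49.4.0 threshold is the one quoted above, the helper names are made up for illustration, and the version parsing is deliberately simplified (it looks only at the first three numeric fields).

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text, width=3):
    """Simplified parse: first `width` numeric fields, padded with zeros."""
    nums = [int(n) for n in re.findall(r"\d+", text.split("+")[0])[:width]]
    return tuple(nums + [0] * (width - len(nums)))


def meets_minimum(installed, minimum="49.4.0"):
    """True when `installed` is at least `minimum` under the simplified parse."""
    return version_tuple(installed) >= version_tuple(minimum)


try:
    installed = version("setuptools")
    print("setuptools", installed, "ok" if meets_minimum(installed) else "too old")
except PackageNotFoundError:
    print("setuptools is not installed in this environment")
```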

aamirshafi commented 1 month ago

Please make sure that CUDA_HOME is set before installing the package with the pip command. setup.py skips building the native part of the library if CUDA_HOME is not set. I was seeing "No module named 'amp_C'" because of this.
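
A quick way to check both things, before and after installing, is sketched below. It assumes the extension modules are importable as top-level names such as amp_C and fused_layer_norm_cuda (the names mentioned in this issue). The helper names are hypothetical. The script reads only the environment variable and does not import torch.

```python
import importlib.util
import os


def missing_extensions(names):
    """Return the module names that cannot be located on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report(names=("amp_C", "fused_layer_norm_cuda")):
    """Print CUDA_HOME and which of the given native modules are installed."""
    cuda_home = os.environ.get("CUDA_HOME")
    print("CUDA_HOME:", cuda_home if cuda_home else "<not set>")
    missing = missing_extensions(names)
    for name in names:
        print(f"{name}: {'MISSING' if name in missing else 'found'}")
    return missing


report()
```

If CUDA_HOME prints as not set, export it (pointing at the CUDA toolkit directory for your system) before rerunning the pip command.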