zixiliuUSC closed this issue 1 year ago
Bot detected that the issue body's language is not English and translated it automatically.
Title: [BUG]: The installation cannot be completed according to the official tutorial
copying colossalai/fx/profiler/experimental/profiler_module/pooling.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/profiler/experimental/profiler_module
creating build/lib.linux-x86_64-cpython-39/colossalai/fx/passes/algorithms
copying colossalai/fx/passes/algorithms/ckpt_solver_chen.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/passes/algorithms
copying colossalai/fx/passes/algorithms/linearize.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/passes/algorithms
copying colossalai/fx/passes/algorithms/build_c_ext.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/passes/algorithms
copying colossalai/fx/passes/algorithms/ckpt_solver_pofo.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/passes/algorithms
copying colossalai/fx/passes/algorithms/operation.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/passes/algorithms
copying colossalai/fx/passes/algorithms/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/passes/algorithms
copying colossalai/fx/passes/algorithms/ckpt_solver_rotor.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/passes/algorithms
creating build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch
copying colossalai/fx/tracer/bias_addition_patch/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch
creating build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch
copying colossalai/fx/tracer/meta_patch/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch
creating build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_module
copying colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_module/linear.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_module
copying colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_module/bias_addition_module.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_module
copying colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_module/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_module
copying colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_module/conv.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_module
creating build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function
copying colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function/addmm.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function
copying colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function/bias_addition_function.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function
copying colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function/addbmm.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function
copying colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function/linear.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function
copying colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function
creating build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_module
copying colossalai/fx/tracer/meta_patch/patched_module/rnn.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_module
copying colossalai/fx/tracer/meta_patch/patched_module/convolution.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_module
copying colossalai/fx/tracer/meta_patch/patched_module/normalization.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_module
copying colossalai/fx/tracer/meta_patch/patched_module/activation_function.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_module
copying colossalai/fx/tracer/meta_patch/patched_module/embedding.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_module
copying colossalai/fx/tracer/meta_patch/patched_module/linear.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_module
copying colossalai/fx/tracer/meta_patch/patched_module/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_module
copying colossalai/fx/tracer/meta_patch/patched_module/pooling.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_module
creating build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_function
copying colossalai/fx/tracer/meta_patch/patched_function/python_ops.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_function
copying colossalai/fx/tracer/meta_patch/patched_function/torch_ops.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_function
copying colossalai/fx/tracer/meta_patch/patched_function/convolution.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_function
copying colossalai/fx/tracer/meta_patch/patched_function/normalization.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_function
copying colossalai/fx/tracer/meta_patch/patched_function/activation_function.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_function
copying colossalai/fx/tracer/meta_patch/patched_function/embedding.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_function
copying colossalai/fx/tracer/meta_patch/patched_function/arithmetic.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_function
copying colossalai/fx/tracer/meta_patch/patched_function/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/fx/tracer/meta_patch/patched_function
creating build/lib.linux-x86_64-cpython-39/colossalai/cli/launcher
copying colossalai/cli/launcher/multinode_runner.py -> build/lib.linux-x86_64-cpython-39/colossalai/cli/launcher
copying colossalai/cli/launcher/hostinfo.py -> build/lib.linux-x86_64-cpython-39/colossalai/cli/launcher
copying colossalai/cli/launcher/run.py -> build/lib.linux-x86_64-cpython-39/colossalai/cli/launcher
copying colossalai/cli/launcher/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/cli/launcher
creating build/lib.linux-x86_64-cpython-39/colossalai/cli/check
copying colossalai/cli/check/check_installation.py -> build/lib.linux-x86_64-cpython-39/colossalai/cli/check
copying colossalai/cli/check/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/cli/check
creating build/lib.linux-x86_64-cpython-39/colossalai/cli/benchmark
copying colossalai/cli/benchmark/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/cli/benchmark
copying colossalai/cli/benchmark/models.py -> build/lib.linux-x86_64-cpython-39/colossalai/cli/benchmark
copying colossalai/cli/benchmark/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/cli/benchmark
copying colossalai/cli/benchmark/benchmark.py -> build/lib.linux-x86_64-cpython-39/colossalai/cli/benchmark
creating build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard
copying colossalai/auto_parallel/tensor_shard/options.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard
copying colossalai/auto_parallel/tensor_shard/sharding_strategy.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard
copying colossalai/auto_parallel/tensor_shard/initialize.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard
copying colossalai/auto_parallel/tensor_shard/constants.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard
copying colossalai/auto_parallel/tensor_shard/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard
creating build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/passes
copying colossalai/auto_parallel/passes/runtime_preparation_pass.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/passes
copying colossalai/auto_parallel/passes/runtime_apply_pass.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/passes
copying colossalai/auto_parallel/passes/comm_metainfo_pass.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/passes
copying colossalai/auto_parallel/passes/constants.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/passes
copying colossalai/auto_parallel/passes/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/passes
copying colossalai/auto_parallel/passes/meta_info_prop.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/passes
creating build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/checkpoint
copying colossalai/auto_parallel/checkpoint/ckpt_solver_chen.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/checkpoint
copying colossalai/auto_parallel/checkpoint/build_c_ext.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/checkpoint
copying colossalai/auto_parallel/checkpoint/ckpt_solver_base.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/checkpoint
copying colossalai/auto_parallel/checkpoint/operation.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/checkpoint
copying colossalai/auto_parallel/checkpoint/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/checkpoint
copying colossalai/auto_parallel/checkpoint/ckpt_solver_rotor.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/checkpoint
creating build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler
copying colossalai/auto_parallel/meta_profiler/metainfo.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler
copying colossalai/auto_parallel/meta_profiler/registry.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler
copying colossalai/auto_parallel/meta_profiler/constants.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler
copying colossalai/auto_parallel/meta_profiler/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler
creating build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/pipeline_shard
copying colossalai/auto_parallel/pipeline_shard/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/pipeline_shard
creating build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/utils
copying colossalai/auto_parallel/tensor_shard/utils/factory.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/utils
copying colossalai/auto_parallel/tensor_shard/utils/misc.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/utils
copying colossalai/auto_parallel/tensor_shard/utils/reshape.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/utils
copying colossalai/auto_parallel/tensor_shard/utils/broadcast.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/utils
copying colossalai/auto_parallel/tensor_shard/utils/sharding.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/utils
copying colossalai/auto_parallel/tensor_shard/utils/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/utils
creating build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/permute_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/addmm_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/default_reshape_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/unary_elementwise_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/layer_norm_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/where_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/binary_elementwise_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/normal_pooling_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/getitem_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/output_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/registry.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/tensor_constructor_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/softmax_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/linear_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/view_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/transpose_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/batch_norm_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/sum_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/placeholder_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/node_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/split_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/matmul_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/bmm_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/embedding_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/conv_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
copying colossalai/auto_parallel/tensor_shard/node_handler/getattr_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler
creating build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/solver
copying colossalai/auto_parallel/tensor_shard/solver/cost_graph.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/solver
copying colossalai/auto_parallel/tensor_shard/solver/solver.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/solver
copying colossalai/auto_parallel/tensor_shard/solver/strategies_constructor.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/solver
copying colossalai/auto_parallel/tensor_shard/solver/graph_analysis.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/solver
copying colossalai/auto_parallel/tensor_shard/solver/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/solver
creating build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/softmax_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/unary_elementwise_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/layer_norm_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/sum_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/strategy_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/output_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/getitem_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/embedding_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/tensor_constructor_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/reshape_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/conv_strategy_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/placeholder_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/normal_pooling_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/batch_norm_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/matmul_strategy_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/binary_elementwise_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/where_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
copying colossalai/auto_parallel/tensor_shard/node_handler/strategy/getattr_generator.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/tensor_shard/node_handler/strategy
creating build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/non_spmd.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/where.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/activation.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/binary_elementwise_ops.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/tensor.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/norm.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/embedding.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/linear.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/conv.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
copying colossalai/auto_parallel/meta_profiler/meta_registry/pooling.py -> build/lib.linux-x86_64-cpython-39/colossalai/auto_parallel/meta_profiler/meta_registry
creating build/lib.linux-x86_64-cpython-39/colossalai/utils/profiler
copying colossalai/utils/profiler/extention.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/profiler
copying colossalai/utils/profiler/stateful_tensor_mem_extention.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/profiler
copying colossalai/utils/profiler/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/profiler
copying colossalai/utils/profiler/profiler.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/profiler
creating build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint
copying colossalai/utils/checkpoint/module_checkpoint.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint
copying colossalai/utils/checkpoint/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint
copying colossalai/utils/checkpoint/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint
creating build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
copying colossalai/utils/checkpoint_io/io.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
copying colossalai/utils/checkpoint_io/meta.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
copying colossalai/utils/checkpoint_io/convertor.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
copying colossalai/utils/checkpoint_io/backend.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
copying colossalai/utils/checkpoint_io/writer.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
copying colossalai/utils/checkpoint_io/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
copying colossalai/utils/checkpoint_io/reader.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
copying colossalai/utils/checkpoint_io/constant.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
copying colossalai/utils/checkpoint_io/distributed.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
copying colossalai/utils/checkpoint_io/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/checkpoint_io
creating build/lib.linux-x86_64-cpython-39/colossalai/utils/multi_tensor_apply
copying colossalai/utils/multi_tensor_apply/multi_tensor_apply.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/multi_tensor_apply
copying colossalai/utils/multi_tensor_apply/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/multi_tensor_apply
creating build/lib.linux-x86_64-cpython-39/colossalai/utils/rank_recorder
copying colossalai/utils/rank_recorder/rank_recorder.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/rank_recorder
copying colossalai/utils/rank_recorder/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/rank_recorder
creating build/lib.linux-x86_64-cpython-39/colossalai/utils/model
copying colossalai/utils/model/lazy_init_context.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/model
copying colossalai/utils/model/experimental.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/model
copying colossalai/utils/model/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/model
copying colossalai/utils/model/colo_init_context.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/model
copying colossalai/utils/model/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/model
creating build/lib.linux-x86_64-cpython-39/colossalai/utils/data_sampler
copying colossalai/utils/data_sampler/data_parallel_sampler.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/data_sampler
copying colossalai/utils/data_sampler/base_sampler.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/data_sampler
copying colossalai/utils/data_sampler/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/data_sampler
creating build/lib.linux-x86_64-cpython-39/colossalai/utils/tensor_detector
copying colossalai/utils/tensor_detector/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/tensor_detector
copying colossalai/utils/tensor_detector/tensor_detector.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/tensor_detector
creating build/lib.linux-x86_64-cpython-39/colossalai/utils/profiler/legacy
copying colossalai/utils/profiler/legacy/comm_profiler.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/profiler/legacy
copying colossalai/utils/profiler/legacy/pcie_profiler.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/profiler/legacy
copying colossalai/utils/profiler/legacy/prof_utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/profiler/legacy
copying colossalai/utils/profiler/legacy/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/utils/profiler/legacy
creating build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim
copying colossalai/zero/sharded_optim/sharded_optim_v2.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim
copying colossalai/zero/sharded_optim/low_level_optim.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim
copying colossalai/zero/sharded_optim/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim
copying colossalai/zero/sharded_optim/_utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim
creating build/lib.linux-x86_64-cpython-39/colossalai/zero/utils
copying colossalai/zero/utils/gemini_hook.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/utils
copying colossalai/zero/utils/zero_hook.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/utils
copying colossalai/zero/utils/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/utils
creating build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_model
copying colossalai/zero/sharded_model/reduce_scatter.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_model
copying colossalai/zero/sharded_model/sharded_model_v2.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_model
copying colossalai/zero/sharded_model/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_model
copying colossalai/zero/sharded_model/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_model
copying colossalai/zero/sharded_model/_utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_model
creating build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_param
copying colossalai/zero/sharded_param/sharded_param.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_param
copying colossalai/zero/sharded_param/sharded_tensor.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_param
copying colossalai/zero/sharded_param/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_param
creating build/lib.linux-x86_64-cpython-39/colossalai/zero/init_ctx
copying colossalai/zero/init_ctx/init_context.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/init_ctx
copying colossalai/zero/init_ctx/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/init_ctx
creating build/lib.linux-x86_64-cpython-39/colossalai/zero/shard_utils
copying colossalai/zero/shard_utils/base_shard_strategy.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/shard_utils
copying colossalai/zero/shard_utils/commons.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/shard_utils
copying colossalai/zero/shard_utils/bucket_tensor_shard_strategy.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/shard_utils
copying colossalai/zero/shard_utils/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/shard_utils
copying colossalai/zero/shard_utils/tensor_shard_strategy.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/shard_utils
creating build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim/bookkeeping
copying colossalai/zero/sharded_optim/bookkeeping/parameter_store.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim/bookkeeping
copying colossalai/zero/sharded_optim/bookkeeping/gradient_store.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim/bookkeeping
copying colossalai/zero/sharded_optim/bookkeeping/base_store.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim/bookkeeping
copying colossalai/zero/sharded_optim/bookkeeping/bucket_store.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim/bookkeeping
copying colossalai/zero/sharded_optim/bookkeeping/tensor_bucket.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim/bookkeeping
copying colossalai/zero/sharded_optim/bookkeeping/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/zero/sharded_optim/bookkeeping
creating build/lib.linux-x86_64-cpython-39/colossalai/amp/torch_amp
copying colossalai/amp/torch_amp/_grad_scaler.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/torch_amp
copying colossalai/amp/torch_amp/torch_amp.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/torch_amp
copying colossalai/amp/torch_amp/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/torch_amp
creating build/lib.linux-x86_64-cpython-39/colossalai/amp/naive_amp
copying colossalai/amp/naive_amp/naive_amp.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/naive_amp
copying colossalai/amp/naive_amp/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/naive_amp
copying colossalai/amp/naive_amp/_fp16_optimizer.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/naive_amp
copying colossalai/amp/naive_amp/_utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/naive_amp
creating build/lib.linux-x86_64-cpython-39/colossalai/amp/apex_amp
copying colossalai/amp/apex_amp/apex_amp.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/apex_amp
copying colossalai/amp/apex_amp/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/apex_amp
creating build/lib.linux-x86_64-cpython-39/colossalai/amp/naive_amp/grad_scaler
copying colossalai/amp/naive_amp/grad_scaler/constant_grad_scaler.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/naive_amp/grad_scaler
copying colossalai/amp/naive_amp/grad_scaler/base_grad_scaler.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/naive_amp/grad_scaler
copying colossalai/amp/naive_amp/grad_scaler/dynamic_grad_scaler.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/naive_amp/grad_scaler
copying colossalai/amp/naive_amp/grad_scaler/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/amp/naive_amp/grad_scaler
creating build/lib.linux-x86_64-cpython-39/colossalai/kernel/jit
copying colossalai/kernel/jit/option.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/jit
copying colossalai/kernel/jit/bias_gelu.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/jit
copying colossalai/kernel/jit/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/jit
copying colossalai/kernel/jit/bias_dropout_add.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/jit
creating build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
copying colossalai/kernel/op_builder/scaled_masked_softmax.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
copying colossalai/kernel/op_builder/multi_head_attn.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
copying colossalai/kernel/op_builder/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
copying colossalai/kernel/op_builder/builder.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
copying colossalai/kernel/op_builder/fused_optim.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
copying colossalai/kernel/op_builder/layernorm.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
copying colossalai/kernel/op_builder/moe.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
copying colossalai/kernel/op_builder/scaled_upper_triangle_masked_softmax.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
copying colossalai/kernel/op_builder/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
copying colossalai/kernel/op_builder/cpu_adam.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/op_builder
creating build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native
copying colossalai/kernel/cuda_native/multihead_attention.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native
copying colossalai/kernel/cuda_native/flash_attention.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native
copying colossalai/kernel/cuda_native/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native
copying colossalai/kernel/cuda_native/layer_norm.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native
copying colossalai/kernel/cuda_native/scaled_softmax.py -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native
creating build/lib.linux-x86_64-cpython-39/colossalai/pipeline/middleware
copying colossalai/pipeline/middleware/topo.py -> build/lib.linux-x86_64-cpython-39/colossalai/pipeline/middleware
copying colossalai/pipeline/middleware/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/pipeline/middleware
creating build/lib.linux-x86_64-cpython-39/colossalai/pipeline/rpc
copying colossalai/pipeline/rpc/_pipeline_base.py -> build/lib.linux-x86_64-cpython-39/colossalai/pipeline/rpc
copying colossalai/pipeline/rpc/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/pipeline/rpc
copying colossalai/pipeline/rpc/_pipeline_schedule.py -> build/lib.linux-x86_64-cpython-39/colossalai/pipeline/rpc
copying colossalai/pipeline/rpc/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/pipeline/rpc
creating build/lib.linux-x86_64-cpython-39/colossalai/pipeline/middleware/adaptor
copying colossalai/pipeline/middleware/adaptor/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/pipeline/middleware/adaptor
copying colossalai/pipeline/middleware/adaptor/fx.py -> build/lib.linux-x86_64-cpython-39/colossalai/pipeline/middleware/adaptor
creating build/lib.linux-x86_64-cpython-39/colossalai/gemini/paramhooks
copying colossalai/gemini/paramhooks/_param_hookmgr.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/paramhooks
copying colossalai/gemini/paramhooks/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/paramhooks
creating build/lib.linux-x86_64-cpython-39/colossalai/gemini/chunk
copying colossalai/gemini/chunk/chunk.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/chunk
copying colossalai/gemini/chunk/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/chunk
copying colossalai/gemini/chunk/manager.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/chunk
copying colossalai/gemini/chunk/search_utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/chunk
copying colossalai/gemini/chunk/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/chunk
creating build/lib.linux-x86_64-cpython-39/colossalai/gemini/ophooks
copying colossalai/gemini/ophooks/_shard_grad_ophook.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/ophooks
copying colossalai/gemini/ophooks/_shard_param_ophook.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/ophooks
copying colossalai/gemini/ophooks/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/ophooks
copying colossalai/gemini/ophooks/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/ophooks
copying colossalai/gemini/ophooks/runtime_mem_tracer_hook.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/ophooks
creating build/lib.linux-x86_64-cpython-39/colossalai/gemini/memory_tracer
copying colossalai/gemini/memory_tracer/param_runtime_order.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/memory_tracer
copying colossalai/gemini/memory_tracer/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/memory_tracer
copying colossalai/gemini/memory_tracer/static_memstats_collector.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/memory_tracer
copying colossalai/gemini/memory_tracer/memstats_collector.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/memory_tracer
copying colossalai/gemini/memory_tracer/runtime_mem_tracer.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/memory_tracer
copying colossalai/gemini/memory_tracer/chunk_memstats_collector.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/memory_tracer
copying colossalai/gemini/memory_tracer/memory_monitor.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/memory_tracer
copying colossalai/gemini/memory_tracer/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/memory_tracer
copying colossalai/gemini/memory_tracer/memory_stats.py -> build/lib.linux-x86_64-cpython-39/colossalai/gemini/memory_tracer
creating build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/initializer_1d.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/initializer_3d.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/initializer_data.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/initializer_model.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/initializer_sequence.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/process_group_initializer.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/initializer_2p5d.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/initializer_2d.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/initializer_tensor.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/initializer_pipeline.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
copying colossalai/context/process_group_initializer/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/process_group_initializer
creating build/lib.linux-x86_64-cpython-39/colossalai/context/random
copying colossalai/context/random/seed_manager.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/random
copying colossalai/context/random/_helper.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/random
copying colossalai/context/random/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/context/random
creating build/lib.linux-x86_64-cpython-39/colossalai/engine/schedule
copying colossalai/engine/schedule/_non_pipeline_schedule.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/schedule
copying colossalai/engine/schedule/_pipeline_schedule_v2.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/schedule
copying colossalai/engine/schedule/_pipeline_schedule.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/schedule
copying colossalai/engine/schedule/_base_schedule.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/schedule
copying colossalai/engine/schedule/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/schedule
creating build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_accumulation
copying colossalai/engine/gradient_accumulation/_gradient_accumulation.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_accumulation
copying colossalai/engine/gradient_accumulation/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_accumulation
creating build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_handler
copying colossalai/engine/gradient_handler/_moe_gradient_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_handler
copying colossalai/engine/gradient_handler/utils.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_handler
copying colossalai/engine/gradient_handler/_pipeline_parallel_gradient_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_handler
copying colossalai/engine/gradient_handler/_data_parallel_gradient_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_handler
copying colossalai/engine/gradient_handler/_zero_gradient_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_handler
copying colossalai/engine/gradient_handler/_sequence_parallel_gradient_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_handler
copying colossalai/engine/gradient_handler/_base_gradient_handler.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_handler
copying colossalai/engine/gradient_handler/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/engine/gradient_handler
creating build/lib.linux-x86_64-cpython-39/colossalai/trainer/hooks
copying colossalai/trainer/hooks/_checkpoint_hook.py -> build/lib.linux-x86_64-cpython-39/colossalai/trainer/hooks
copying colossalai/trainer/hooks/_commons_.py -> build/lib.linux-x86_64-cpython-39/colossalai/trainer/hooks
copying colossalai/trainer/hooks/_lr_scheduler_hook.py -> build/lib.linux-x86_64-cpython-39/colossalai/trainer/hooks
copying colossalai/trainer/hooks/_base_hook.py -> build/lib.linux-x86_64-cpython-39/colossalai/trainer/hooks
copying colossalai/trainer/hooks/_metric_hook.py -> build/lib.linux-x86_64-cpython-39/colossalai/trainer/hooks
copying colossalai/trainer/hooks/__init__.py -> build/lib.linux-x86_64-cpython-39/colossalai/trainer/hooks
copying colossalai/trainer/hooks/_log_hook.py -> build/lib.linux-x86_64-cpython-39/colossalai/trainer/hooks
creating build/lib.linux-x86_64-cpython-39/tests
creating build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/beit.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/resnet.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/bert.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/registry.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/simple_net.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/repeated_computed_layers.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/hanging_param_model.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/albert.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/inline_op_model.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/nested_model.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/__init__.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
copying tests/components_to_test/gpt2.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test
creating build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel
copying tests/test_auto_parallel/__init__.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel
creating build/lib.linux-x86_64-cpython-39/tests/components_to_test/utils
copying tests/components_to_test/utils/dummy_data_generator.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test/utils
copying tests/components_to_test/utils/__init__.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test/utils
copying tests/components_to_test/utils/executor.py -> build/lib.linux-x86_64-cpython-39/tests/components_to_test/utils
creating build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/test_param_resharding_cost.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/test_shape_consistency_pass.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/test_liveness_analysis.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/test_broadcast.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/test_find_repeat_block.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/test_compatibility_with_ddp.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/test_checkpoint.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/test_compatibility_with_gemini.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/__init__.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/test_bias_addition_forward.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
copying tests/test_auto_parallel/test_tensor_shard/test_solver_with_resnet_v2.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard
creating build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_pass
copying tests/test_auto_parallel/test_pass/test_size_value_converting_pass.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_pass
copying tests/test_auto_parallel/test_pass/test_node_converting_pass.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_pass
copying tests/test_auto_parallel/test_pass/__init__.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_pass
creating build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_gpt
copying tests/test_auto_parallel/test_tensor_shard/test_gpt/gpt_modules.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_gpt
copying tests/test_auto_parallel/test_tensor_shard/test_gpt/test_runtime_with_gpt_modules.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_gpt
copying tests/test_auto_parallel/test_tensor_shard/test_gpt/test_solver_with_gpt_module.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_gpt
copying tests/test_auto_parallel/test_tensor_shard/test_gpt/__init__.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_gpt
creating build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_addmm_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_where_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_shard_option.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_bias_linear_function_node.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_permute_and_transpose_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_bias_linear_module_node.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_getattr_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_binary_elementwise_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/utils.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_sum_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_layer_norm_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_embedding_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_getitem_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_norm_pooling_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_view_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_default_reshape_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_conv_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_addbmm_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_split_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_tensor_constructor.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_output_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_softmax_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_linear_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/__init__.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_placeholder_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_unary_element_wise_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_bmm_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_matmul_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
copying tests/test_auto_parallel/test_tensor_shard/test_node_handler/test_batch_norm_handler.py -> build/lib.linux-x86_64-cpython-39/tests/test_auto_parallel/test_tensor_shard/test_node_handler
creating build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_apply.cuh -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_scale_kernel.cu -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_l2norm_kernel.cu -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/layer_norm_cuda_kernel.cu -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_sgd_kernel.cu -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax.cpp -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_lamb.cu -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multi_tensor_adam.cu -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_masked_softmax.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/cpu_adam.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/layer_norm_cuda.cpp -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multihead_attention_1d.cpp -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/cpu_adam.cpp -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/type_shim.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_upper_triang_masked_softmax_cuda.cu -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/moe_cuda_kernel.cu -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_masked_softmax.cpp -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/scaled_masked_softmax_cuda.cu -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/moe_cuda.cpp -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/compat.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/multihead_attention_1d.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
copying colossalai/kernel/cuda_native/csrc/colossal_C_frontend.cpp -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc
creating build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels
creating build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/block_reduce.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/strided_batch_gemm.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/dropout.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/feed_forward.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/normalize_layer.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/cuda_util.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/kernels.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/ls_cub.cuh -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/context.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/softmax.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/cross_entropy_layer.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
copying colossalai/kernel/cuda_native/csrc/kernels/include/cublas_wrappers.h -> build/lib.linux-x86_64-cpython-39/colossalai/kernel/cuda_native/csrc/kernels/include
running build_ext
/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/utils/cpp_extension.py:387: UserWarning: The detected CUDA version (11.0) has a minor version mismatch with the version that was used to compile PyTorch (11.6). Most likely this shouldn't be a problem.
warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/utils/cpp_extension.py:397: UserWarning: There are no g++ version bounds defined for CUDA version 11.0
warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
building 'colossalai._C.cpu_adam' extension
creating /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39
creating /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home
creating /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01
creating /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI
creating /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai
creating /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel
creating /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native
creating /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc
Emitting ninja build file /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ -MMD -MF /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/cpu_adam.o.d -pthread -B /home/liuzixi01/.conda/envs/torch-cuda116/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/liuzixi01/.conda/envs/torch-cuda116/include -fPIC -O2 -isystem /home/liuzixi01/.conda/envs/torch-cuda116/include -fPIC -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/includes -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/cpu_adam.cpp -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/cpu_adam.o -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -std=c++14 -lcudart -lcublas -g -Wno-reorder -fopenmp -march=native -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=cpu_adam -D_GLIBCXX_USE_CXX11_ABI=0
/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/cpu_adam.cpp:244:0: warning: ignoring #pragma unroll [-Wunknown-pragmas]
#pragma unroll 4
/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/cpu_adam.cpp:364:0: warning: ignoring #pragma unroll [-Wunknown-pragmas]
#pragma unroll 8
In file included from /home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/Exceptions.h:13:0,
from /home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include/torch/python.h:11,
from /home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/extension.h:6,
from /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/cpu_adam.h:29,
from /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/cpu_adam.cpp:22:
/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<Adam_Optimizer>’:
/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/cpu_adam.cpp:456:51: required from here
/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/pybind11/pybind11.h:1479:7: warning: ‘pybind11::class_<Adam_Optimizer>’ declared with greater visibility than the type of its field ‘pybind11::class_<Adam_Optimizer>::<anonymous>’ [-Wattributes]
class class_ : public detail::generic_type {
^~~~~~
/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/pybind11/pybind11.h:1479:7: warning: ‘pybind11::class_<Adam_Optimizer>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
g++ -pthread -B /home/liuzixi01/.conda/envs/torch-cuda116/compiler_compat -shared -Wl,--allow-shlib-undefined -Wl,-rpath,/home/liuzixi01/.conda/envs/torch-cuda116/lib -Wl,-rpath-link,/home/liuzixi01/.conda/envs/torch-cuda116/lib -L/home/liuzixi01/.conda/envs/torch-cuda116/lib -Wl,--allow-shlib-undefined -Wl,-rpath,/home/liuzixi01/.conda/envs/torch-cuda116/lib -Wl,-rpath-link,/home/liuzixi01/.conda/envs/torch-cuda116/lib -L/home/liuzixi01/.conda/envs/torch-cuda116/lib /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/cpu_adam.o -L/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/lib -L/usr/local/cuda/lib64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda_cu -ltorch_cuda_cpp -o build/lib.linux-x86_64-cpython-39/colossalai/_C/cpu_adam.cpython-39-x86_64-linux-gnu.so
building 'colossalai._C.fused_optim' extension
Emitting ninja build file /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/6] /usr/local/cuda/bin/nvcc -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_adam.cu -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_adam.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
FAILED: /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_adam.o
/usr/local/cuda/bin/nvcc -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_adam.cu -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_adam.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
nvcc fatal : Unsupported gpu architecture 'compute_86'
[2/6] /usr/local/cuda/bin/nvcc -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_l2norm_kernel.cu -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_l2norm_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
FAILED: /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_l2norm_kernel.o
/usr/local/cuda/bin/nvcc -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_l2norm_kernel.cu -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_l2norm_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
nvcc fatal : Unsupported gpu architecture 'compute_86'
[3/6] /usr/local/cuda/bin/nvcc -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_lamb.cu -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_lamb.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
FAILED: /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_lamb.o
/usr/local/cuda/bin/nvcc -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_lamb.cu -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_lamb.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
nvcc fatal : Unsupported gpu architecture 'compute_86'
[4/6] /usr/local/cuda/bin/nvcc -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_scale_kernel.cu -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_scale_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
FAILED: /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_scale_kernel.o
/usr/local/cuda/bin/nvcc -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_scale_kernel.cu -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_scale_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
nvcc fatal : Unsupported gpu architecture 'compute_86'
[5/6] /usr/local/cuda/bin/nvcc -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_sgd_kernel.cu -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_sgd_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
FAILED: /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_sgd_kernel.o
/usr/local/cuda/bin/nvcc -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_sgd_kernel.cu -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/multi_tensor_sgd_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -lineinfo -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
nvcc fatal : Unsupported gpu architecture 'compute_86'
[6/6] c++ -MMD -MF /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/colossal_C_frontend.o.d -pthread -B /home/liuzixi01/.conda/envs/torch-cuda116/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/liuzixi01/.conda/envs/torch-cuda116/include -fPIC -O2 -isystem /home/liuzixi01/.conda/envs/torch-cuda116/include -fPIC -I/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/kernels/include -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/TH -I/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/liuzixi01/.conda/envs/torch-cuda116/include/python3.9 -c -c /home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/colossal_C_frontend.cpp -o /home/liuzixi01/ColossalAI/build/temp.linux-x86_64-cpython-39/home/liuzixi01/ColossalAI/colossalai/kernel/cuda_native/csrc/colossal_C_frontend.o -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_optim -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
subprocess.run(
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/home/liuzixi01/ColossalAI/setup.py", line 170, in <module>
setup(name=package_name,
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/__init__.py", line 87, in setup
return distutils.core.setup(**attrs)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 968, in run_commands
self.run_command(cmd)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/dist.py", line 1217, in run_command
super().run_command(command)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
cmd_obj.run()
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/command/install.py", line 68, in run
return orig.install.run(self)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/command/install.py", line 698, in run
self.run_command('build')
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
self.distribution.run_command(command)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/dist.py", line 1217, in run_command
super().run_command(command)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
cmd_obj.run()
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/command/build.py", line 132, in run
self.run_command(cmd_name)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
self.distribution.run_command(command)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/dist.py", line 1217, in run_command
super().run_command(command)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
cmd_obj.run()
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 84, in run
_build_ext.run(self)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
self.build_extensions()
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
build_ext.build_extensions(self)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 466, in build_extensions
self._build_extensions_serial()
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 492, in _build_extensions_serial
self.build_extension(ext)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
_build_ext.build_extension(self, ext)
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 547, in build_extension
objects = self.compiler.compile(
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 658, in unix_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1573, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "/home/liuzixi01/.conda/envs/torch-cuda116/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> colossalai
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
How did you solve this?
I'm running into a similar issue when installing apex. How was it solved?
I hope you can share your solution.
Any ideas?
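One possible reading of the log, offered as a guess rather than a confirmed fix: every failed step ends with `nvcc fatal : Unsupported gpu architecture 'compute_86'`, and the compiler being invoked is `/usr/local/cuda/bin/nvcc`. nvcc only accepts `compute_86` from CUDA 11.1 onward, so the system toolkit under `/usr/local/cuda` may be older than the CUDA 11.6 build of PyTorch in the conda environment. The sketch below checks that from the text printed by `nvcc --version`. The function names are hypothetical, and the default path is simply the one that appears in the log.

```python
import re
import shutil
import subprocess

# nvcc accepts compute_86 / sm_86 starting with CUDA 11.1.
MIN_CUDA_FOR_SM86 = (11, 1)


def parse_nvcc_release(version_text):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("no 'release X.Y' string found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def nvcc_supports_sm86(version_text):
    """Return True if this nvcc release is new enough for compute_86."""
    return parse_nvcc_release(version_text) >= MIN_CUDA_FOR_SM86


def check_local_nvcc(nvcc_path="/usr/local/cuda/bin/nvcc"):
    """Run the given nvcc and report whether it supports compute_86.

    Returns None when no executable exists at nvcc_path.
    """
    if shutil.which(nvcc_path) is None:
        return None
    result = subprocess.run(
        [nvcc_path, "--version"], capture_output=True, text=True, check=True
    )
    return nvcc_supports_sm86(result.stdout)
```

Calling `check_local_nvcc()` on the affected machine and getting `False` would be consistent with the error above. In that case, pointing the build at a CUDA toolkit that matches the PyTorch build (11.6 here) is the usual thing to try, though this thread does not confirm that it resolved the issue.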
🐛 Describe the bug
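Following the official tutorial, I ran: CUDA_EXT=1 pip install colossalai. Environment: conda virtual environment, python=3.9.13, pytorch=1.13+cuda11.6, GPUs: 3x Nvidia A30. The error is as follows:
The error log is too long to fit here, so it is posted in a comment.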
Environment
conda virtual environment, python=3.9.13, pytorch=1.13+cuda11.6, GPUs: 3x Nvidia A30
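For context on the architecture flags in the log: the A30 is a compute capability 8.0 device, so `sm_80` code is enough for the GPUs listed here, and the `compute_86` flag that nvcc rejects is not needed on this machine. The sketch below shows one way a build script could drop architectures that the local nvcc is too old to accept. It is a hypothetical helper, not ColossalAI's actual `setup.py` logic. The architecture list is the one visible in the log's `-gencode` flags, and the version thresholds are the CUDA releases in which nvcc first accepted each architecture.

```python
# Minimum CUDA toolkit release in which nvcc accepts each architecture
# that appears in the log's -gencode flags.
MIN_CUDA_FOR_ARCH = {
    "60": (8, 0),
    "70": (9, 0),
    "75": (10, 0),
    "80": (11, 0),
    "86": (11, 1),
}


def gencode_flags(nvcc_release, archs=("60", "70", "75", "80", "86")):
    """Build -gencode flags, skipping architectures this nvcc cannot target.

    nvcc_release is a (major, minor) tuple such as (11, 0).
    """
    flags = []
    for arch in archs:
        if nvcc_release >= MIN_CUDA_FOR_ARCH[arch]:
            flags += ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]
    return flags
```

With a CUDA 11.0 toolkit, `gencode_flags((11, 0))` stops at `sm_80`, which would still cover an A30. Passing `archs=("80",)` restricts the build to that single architecture.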