hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible
https://www.colossalai.org
Apache License 2.0

[BUG]: Error when running ColossalAI/blob/main/examples/tutorial/opt/inference/opt_fastapi.py #2875

Closed 0-1CxH closed 1 year ago

0-1CxH commented 1 year ago

🐛 Describe the bug

The script https://github.com/hpcaitech/ColossalAI/blob/main/examples/tutorial/opt/inference/opt_fastapi.py requires the energonai library. I installed it the way https://github.com/hpcaitech/EnergonAI describes (git clone, then pip install .), but compiling its CUDA extensions fails. Here is the complete log:
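(For reference, a sketch of the install steps I ran, following the EnergonAI README; the exact clone URL and directory name are assumptions inferred from the paths in the log, which reference an EnergonAI-main source tree.)

```shell
# Reproduction steps, per the EnergonAI README (clone, then pip install .).
# Directory name inferred from the build paths in the log below.
git clone https://github.com/hpcaitech/EnergonAI.git
cd EnergonAI
pip install .
```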

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Processing /root/gpt_exp/opt_colossal/EnergonAI-main
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: energonai
  Building wheel for energonai (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [106 lines of output]

      torch.__version__  = 1.10.1+cu102

      Compiling cuda extensions with
      nvcc: NVIDIA (R) Cuda compiler driver
      Copyright (c) 2005-2019 NVIDIA Corporation
      Built on Wed_Oct_23_19:24:38_PDT_2019
      Cuda compilation tools, release 10.2, V10.2.89
      from /usr/local/cuda/bin

      running bdist_wheel
      running build
      running build_py
      running build_ext
      building 'energonai_scale_mask' extension
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py:298: UserWarning:

                                     !! WARNING !!

      !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
      Your compiler (c++) is not compatible with the compiler Pytorch was
      built with for this platform, which is g++ on linux. Please
      use g++ to to compile your extension. Alternatively, you may
      compile PyTorch from source using c++, and then you can also use
      c++ to compile your extension.

      See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
      with compiling PyTorch from source.
      !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                                    !! WARNING !!

        platform=sys.platform))
      Emitting ninja build file /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/build.ninja...
      Compiling objects...
      Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
      [1/1] /usr/local/cuda/bin/nvcc  -I/root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/kg/anaconda3/include/python3.7m -c -c /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.cu -o /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DUSE_C10D_NCCL -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTHRUST_IGNORE_CUB_VERSION_CHECK -gencode arch=compute_70,code=sm_70 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=energonai_scale_mask -D_GLIBCXX_USE_CXX11_ABI=0
      FAILED: /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.o
      /usr/local/cuda/bin/nvcc  -I/root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/kg/anaconda3/include/python3.7m -c -c /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.cu -o /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DUSE_C10D_NCCL -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTHRUST_IGNORE_CUB_VERSION_CHECK -gencode arch=compute_70,code=sm_70 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=energonai_scale_mask -D_GLIBCXX_USE_CXX11_ABI=0
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.cu:5:10: fatal error: cub/cub.cuh: No such file or directory
       #include <cub/cub.cuh>
                ^~~~~~~~~~~~~
      compilation terminated.
      ninja: build stopped: subcommand failed.
      Traceback (most recent call last):
        File "/home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1723, in _run_ninja_build
          env=env)
        File "/home/kg/anaconda3/lib/python3.7/subprocess.py", line 487, in run
          output=stdout, stderr=stderr)
      subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

      The above exception was the direct cause of the following exception:

      Traceback (most recent call last):
        File "<string>", line 36, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/root/gpt_exp/opt_colossal/EnergonAI-main/setup.py", line 187, in <module>
          'console_scripts': ['energonai=energonai.cli:typer_click_object', ],
        File "/home/kg/anaconda3/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
          return distutils.core.setup(**attrs)
        File "/home/kg/anaconda3/lib/python3.7/distutils/core.py", line 148, in setup
          dist.run_commands()
        File "/home/kg/anaconda3/lib/python3.7/distutils/dist.py", line 966, in run_commands
          self.run_command(cmd)
        File "/home/kg/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
          cmd_obj.run()
        File "/home/kg/anaconda3/lib/python3.7/site-packages/wheel/bdist_wheel.py", line 299, in run
          self.run_command('build')
        File "/home/kg/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
          self.distribution.run_command(command)
        File "/home/kg/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
          cmd_obj.run()
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/build.py", line 135, in run
          self.run_command(cmd_name)
        File "/home/kg/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
          self.distribution.run_command(command)
        File "/home/kg/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
          cmd_obj.run()
        File "/home/kg/anaconda3/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
          _build_ext.run(self)
        File "/home/kg/anaconda3/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
          _build_ext.build_ext.run(self)
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 340, in run
          self.build_extensions()
        File "/home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 735, in build_extensions
          build_ext.build_extensions(self)
        File "/home/kg/anaconda3/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
          _build_ext.build_ext.build_extensions(self)
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions
          self._build_extensions_serial()
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
          self.build_extension(ext)
        File "/home/kg/anaconda3/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
          _build_ext.build_extension(self, ext)
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
          depends=ext.depends)
        File "/home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 565, in unix_wrap_ninja_compile
          with_cuda=with_cuda)
        File "/home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1404, in _write_ninja_file_and_compile_objects
          error_prefix='Error compiling objects for extension')
        File "/home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
          raise RuntimeError(message) from e
      RuntimeError: Error compiling objects for extension
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for energonai
  Running setup.py clean for energonai
Failed to build energonai
Installing collected packages: energonai
  Running setup.py install for energonai ... error
  error: subprocess-exited-with-error

  × Running setup.py install for energonai did not run successfully.
  │ exit code: 1
  ╰─> [272 lines of output]

      torch.__version__  = 1.10.1+cu102

      Compiling cuda extensions with
      nvcc: NVIDIA (R) Cuda compiler driver
      Copyright (c) 2005-2019 NVIDIA Corporation
      Built on Wed_Oct_23_19:24:38_PDT_2019
      Cuda compilation tools, release 10.2, V10.2.89
      from /usr/local/cuda/bin

      running install
      /home/kg/anaconda3/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
        setuptools.SetuptoolsDeprecationWarning,
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-3.7
      creating build/lib.linux-x86_64-3.7/energonai
      copying energonai/worker.py -> build/lib.linux-x86_64-3.7/energonai
      copying energonai/task.py -> build/lib.linux-x86_64-3.7/energonai
      copying energonai/__init__.py -> build/lib.linux-x86_64-3.7/energonai
      copying energonai/engine.py -> build/lib.linux-x86_64-3.7/energonai
      copying energonai/batch_mgr.py -> build/lib.linux-x86_64-3.7/energonai
      copying energonai/pipe.py -> build/lib.linux-x86_64-3.7/energonai
      creating build/lib.linux-x86_64-3.7/energonai/utils
      copying energonai/utils/files.py -> build/lib.linux-x86_64-3.7/energonai/utils
      copying energonai/utils/checkpointing_hf_gpt2.py -> build/lib.linux-x86_64-3.7/energonai/utils
      copying energonai/utils/timer.py -> build/lib.linux-x86_64-3.7/energonai/utils
      copying energonai/utils/__init__.py -> build/lib.linux-x86_64-3.7/energonai/utils
      copying energonai/utils/common.py -> build/lib.linux-x86_64-3.7/energonai/utils
      copying energonai/utils/checkpointing_opt.py -> build/lib.linux-x86_64-3.7/energonai/utils
      copying energonai/utils/checkpointing.py -> build/lib.linux-x86_64-3.7/energonai/utils
      creating build/lib.linux-x86_64-3.7/energonai/testing
      copying energonai/testing/models.py -> build/lib.linux-x86_64-3.7/energonai/testing
      copying energonai/testing/__init__.py -> build/lib.linux-x86_64-3.7/energonai/testing
      creating build/lib.linux-x86_64-3.7/energonai/model
      copying energonai/model/attention.py -> build/lib.linux-x86_64-3.7/energonai/model
      copying energonai/model/embedding.py -> build/lib.linux-x86_64-3.7/energonai/model
      copying energonai/model/__init__.py -> build/lib.linux-x86_64-3.7/energonai/model
      copying energonai/model/mlp.py -> build/lib.linux-x86_64-3.7/energonai/model
      copying energonai/model/endecoder.py -> build/lib.linux-x86_64-3.7/energonai/model
      copying energonai/model/model_factory.py -> build/lib.linux-x86_64-3.7/energonai/model
      copying energonai/model/downstream.py -> build/lib.linux-x86_64-3.7/energonai/model
      creating build/lib.linux-x86_64-3.7/energonai/communication
      copying energonai/communication/collective.py -> build/lib.linux-x86_64-3.7/energonai/communication
      copying energonai/communication/p2p.py -> build/lib.linux-x86_64-3.7/energonai/communication
      copying energonai/communication/__init__.py -> build/lib.linux-x86_64-3.7/energonai/communication
      copying energonai/communication/utils.py -> build/lib.linux-x86_64-3.7/energonai/communication
      copying energonai/communication/ring.py -> build/lib.linux-x86_64-3.7/energonai/communication
      creating build/lib.linux-x86_64-3.7/energonai/legacy_batch_mgr
      copying energonai/legacy_batch_mgr/dynamic_batch_manager.py -> build/lib.linux-x86_64-3.7/energonai/legacy_batch_mgr
      copying energonai/legacy_batch_mgr/__init__.py -> build/lib.linux-x86_64-3.7/energonai/legacy_batch_mgr
      copying energonai/legacy_batch_mgr/naive_batch_manager.py -> build/lib.linux-x86_64-3.7/energonai/legacy_batch_mgr
      creating build/lib.linux-x86_64-3.7/energonai/pipelinable
      copying energonai/pipelinable/split_method.py -> build/lib.linux-x86_64-3.7/energonai/pipelinable
      copying energonai/pipelinable/__init__.py -> build/lib.linux-x86_64-3.7/energonai/pipelinable
      copying energonai/pipelinable/energon_tracer.py -> build/lib.linux-x86_64-3.7/energonai/pipelinable
      copying energonai/pipelinable/split_policy.py -> build/lib.linux-x86_64-3.7/energonai/pipelinable
      creating build/lib.linux-x86_64-3.7/energonai/kernel
      copying energonai/kernel/__init__.py -> build/lib.linux-x86_64-3.7/energonai/kernel
      creating build/lib.linux-x86_64-3.7/energonai/kernel/cuda_native
      copying energonai/kernel/cuda_native/__init__.py -> build/lib.linux-x86_64-3.7/energonai/kernel/cuda_native
      copying energonai/kernel/cuda_native/scale_mask_softmax.py -> build/lib.linux-x86_64-3.7/energonai/kernel/cuda_native
      copying energonai/kernel/cuda_native/transpose_pad.py -> build/lib.linux-x86_64-3.7/energonai/kernel/cuda_native
      copying energonai/kernel/cuda_native/linear_func.py -> build/lib.linux-x86_64-3.7/energonai/kernel/cuda_native
      copying energonai/kernel/cuda_native/layer_norm.py -> build/lib.linux-x86_64-3.7/energonai/kernel/cuda_native
      copying energonai/kernel/cuda_native/one_layer_norm.py -> build/lib.linux-x86_64-3.7/energonai/kernel/cuda_native
      running build_ext
      building 'energonai_scale_mask' extension
      creating /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7
      creating /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai
      creating /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel
      creating /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel/cuda_native
      creating /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel/cuda_native/csrc
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py:298: UserWarning:

                                     !! WARNING !!

      !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
      Your compiler (c++) is not compatible with the compiler Pytorch was
      built with for this platform, which is g++ on linux. Please
      use g++ to to compile your extension. Alternatively, you may
      compile PyTorch from source using c++, and then you can also use
      c++ to compile your extension.

      See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
      with compiling PyTorch from source.
      !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                                    !! WARNING !!

        platform=sys.platform))
      Emitting ninja build file /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/build.ninja...
      Compiling objects...
      Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
      [1/2] /usr/local/cuda/bin/nvcc  -I/root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/kg/anaconda3/include/python3.7m -c -c /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.cu -o /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DUSE_C10D_NCCL -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTHRUST_IGNORE_CUB_VERSION_CHECK -gencode arch=compute_70,code=sm_70 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=energonai_scale_mask -D_GLIBCXX_USE_CXX11_ABI=0
      FAILED: /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.o
      /usr/local/cuda/bin/nvcc  -I/root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/kg/anaconda3/include/python3.7m -c -c /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.cu -o /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DUSE_C10D_NCCL -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -DTHRUST_IGNORE_CUB_VERSION_CHECK -gencode arch=compute_70,code=sm_70 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=energonai_scale_mask -D_GLIBCXX_USE_CXX11_ABI=0
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_kernel.cu:5:10: fatal error: cub/cub.cuh: No such file or directory
       #include <cub/cub.cuh>
                ^~~~~~~~~~~~~
      compilation terminated.
      [2/2] c++ -MMD -MF /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.o.d -pthread -B /home/kg/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/home/kg/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/kg/anaconda3/include/python3.7m -c -c /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp -o /root/gpt_exp/opt_colossal/EnergonAI-main/build/temp.linux-x86_64-3.7/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.o -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DUSE_C10D_NCCL -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=energonai_scale_mask -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
      cc1plus: warning: command line option '-Wstrict-prototypes' is valid for C/ObjC but not for C++
      In file included from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/core/Device.h:5,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/core/Allocator.h:6,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/ATen/ATen.h:7,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/extension.h:4,
                       from /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:2:
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp: In function 'at::Tensor scale_mask_softmax_wrapper(int, int, int, at::Tensor, at::Tensor)':
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:5:21: warning: 'at::DeprecatedTypeProperties& at::Tensor::type() const' is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
         AT_ASSERTM(x.type().is_cuda(), #x " must be a CUDA tensor")
                           ^
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/util/Exception.h:241:39: note: in definition of macro 'C10_EXPAND_MSVC_WORKAROUND'
       #define C10_EXPAND_MSVC_WORKAROUND(x) x
                                             ^
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/util/Exception.h:261:34: note: in expansion of macro 'C10_UNLIKELY'
       #define C10_UNLIKELY_OR_CONST(e) C10_UNLIKELY(e)
                                        ^~~~~~~~~~~~
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/util/Exception.h:313:7: note: in expansion of macro 'C10_UNLIKELY_OR_CONST'
         if (C10_UNLIKELY_OR_CONST(!(cond))) {                                         \
             ^~~~~~~~~~~~~~~~~~~~~
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/util/Exception.h:599:32: note: in expansion of macro 'TORCH_INTERNAL_ASSERT'
           C10_EXPAND_MSVC_WORKAROUND(TORCH_INTERNAL_ASSERT(cond, __VA_ARGS__)); \
                                      ^~~~~~~~~~~~~~~~~~~~~
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:5:3: note: in expansion of macro 'AT_ASSERTM'
         AT_ASSERTM(x.type().is_cuda(), #x " must be a CUDA tensor")
         ^~~~~~~~~~
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:13:3: note: in expansion of macro 'CHECK_CUDA'
         CHECK_CUDA(x);                                                               \
         ^~~~~~~~~~
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:29:3: note: in expansion of macro 'CHECK_FP16_32_INPUT'
         CHECK_FP16_32_INPUT(correlation);
         ^~~~~~~~~~~~~~~~~~~
      In file included from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/ATen/Tensor.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/ATen/Context.h:4,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/ATen/ATen.h:9,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/extension.h:4,
                       from /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:2:
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/ATen/core/TensorBody.h:194:30: note: declared here
         DeprecatedTypeProperties & type() const {
                                    ^~~~
      In file included from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/core/Device.h:5,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/core/Allocator.h:6,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/ATen/ATen.h:7,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/extension.h:4,
                       from /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:2:
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:5:21: warning: 'at::DeprecatedTypeProperties& at::Tensor::type() const' is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
         AT_ASSERTM(x.type().is_cuda(), #x " must be a CUDA tensor")
                           ^
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/util/Exception.h:241:39: note: in definition of macro 'C10_EXPAND_MSVC_WORKAROUND'
       #define C10_EXPAND_MSVC_WORKAROUND(x) x
                                             ^
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/util/Exception.h:261:34: note: in expansion of macro 'C10_UNLIKELY'
       #define C10_UNLIKELY_OR_CONST(e) C10_UNLIKELY(e)
                                        ^~~~~~~~~~~~
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/util/Exception.h:313:7: note: in expansion of macro 'C10_UNLIKELY_OR_CONST'
         if (C10_UNLIKELY_OR_CONST(!(cond))) {                                         \
             ^~~~~~~~~~~~~~~~~~~~~
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/c10/util/Exception.h:599:32: note: in expansion of macro 'TORCH_INTERNAL_ASSERT'
           C10_EXPAND_MSVC_WORKAROUND(TORCH_INTERNAL_ASSERT(cond, __VA_ARGS__)); \
                                      ^~~~~~~~~~~~~~~~~~~~~
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:5:3: note: in expansion of macro 'AT_ASSERTM'
         AT_ASSERTM(x.type().is_cuda(), #x " must be a CUDA tensor")
         ^~~~~~~~~~
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:17:3: note: in expansion of macro 'CHECK_CUDA'
         CHECK_CUDA(x);                                                               \
         ^~~~~~~~~~
      /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:30:3: note: in expansion of macro 'CHECK_INPUT'
         CHECK_INPUT(real_seq_len);
         ^~~~~~~~~~~
      In file included from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/ATen/Tensor.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/ATen/Context.h:4,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/ATen/ATen.h:9,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
                       from /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/torch/extension.h:4,
                       from /root/gpt_exp/opt_colossal/EnergonAI-main/energonai/kernel/cuda_native/csrc/scale_mask_softmax_wrapper.cpp:2:
      /home/kg/anaconda3/lib/python3.7/site-packages/torch/include/ATen/core/TensorBody.h:194:30: note: declared here
         DeprecatedTypeProperties & type() const {
                                    ^~~~
      ninja: build stopped: subcommand failed.
      Traceback (most recent call last):
        File "/home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1723, in _run_ninja_build
          env=env)
        File "/home/kg/anaconda3/lib/python3.7/subprocess.py", line 487, in run
          output=stdout, stderr=stderr)
      subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

      The above exception was the direct cause of the following exception:

      Traceback (most recent call last):
        File "<string>", line 36, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/root/gpt_exp/opt_colossal/EnergonAI-main/setup.py", line 187, in <module>
          'console_scripts': ['energonai=energonai.cli:typer_click_object', ],
        File "/home/kg/anaconda3/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
          return distutils.core.setup(**attrs)
        File "/home/kg/anaconda3/lib/python3.7/distutils/core.py", line 148, in setup
          dist.run_commands()
        File "/home/kg/anaconda3/lib/python3.7/distutils/dist.py", line 966, in run_commands
          self.run_command(cmd)
        File "/home/kg/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
          cmd_obj.run()
        File "/home/kg/anaconda3/lib/python3.7/site-packages/setuptools/command/install.py", line 68, in run
          return orig.install.run(self)
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/install.py", line 545, in run
          self.run_command('build')
        File "/home/kg/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
          self.distribution.run_command(command)
        File "/home/kg/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
          cmd_obj.run()
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/build.py", line 135, in run
          self.run_command(cmd_name)
        File "/home/kg/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
          self.distribution.run_command(command)
        File "/home/kg/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
          cmd_obj.run()
        File "/home/kg/anaconda3/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
          _build_ext.run(self)
        File "/home/kg/anaconda3/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
          _build_ext.build_ext.run(self)
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 340, in run
          self.build_extensions()
        File "/home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 735, in build_extensions
          build_ext.build_extensions(self)
        File "/home/kg/anaconda3/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
          _build_ext.build_ext.build_extensions(self)
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions
          self._build_extensions_serial()
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
          self.build_extension(ext)
        File "/home/kg/anaconda3/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
          _build_ext.build_extension(self, ext)
        File "/home/kg/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
          depends=ext.depends)
        File "/home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 565, in unix_wrap_ninja_compile
          with_cuda=with_cuda)
        File "/home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1404, in _write_ninja_file_and_compile_objects
          error_prefix='Error compiling objects for extension')
        File "/home/kg/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
          raise RuntimeError(message) from e
      RuntimeError: Error compiling objects for extension
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> energonai

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Environment

PyTorch 1.10.1+cu102, Python 3.7.4
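The `Your compiler (c++) is not compatible` warning earlier in the log points at one common workaround: telling PyTorch's extension builder to use `g++` explicitly via the `CC`/`CXX` environment variables before rebuilding. A minimal sketch, assuming `gcc`/`g++` are installed and on `PATH` (the actual rebuild step is left commented out):

```shell
# Force PyTorch's cpp_extension builder to use g++ instead of the
# system `c++` alias it complained about, then rebuild EnergonAI.
export CC=gcc
export CXX=g++
echo "Building with CXX=$CXX"
# pip install .   # re-run inside the EnergonAI source directory
```

This only helps when the failure is the compiler mismatch itself; it does not fix deprecated-API errors inside the extension sources.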

zixiliuUSC commented 1 year ago

PyTorch 1.13.6+cu116 with Python 3.9.13 also hits a similar build error. I ran `CUDA_EXT=1 pip install colossalai`, as the official documentation suggests, to set up Colossal-AI.
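Another frequent cause of these extension build failures is a mismatch between the system CUDA toolkit (`nvcc`) and the CUDA version the installed torch wheel was built against (the `+cuXXX` tag). A minimal sketch of the check, with both version strings filled in by hand as example values — read yours from `nvcc --version` and `python -c "import torch; print(torch.version.cuda)"`:

```shell
# Example values -- substitute the versions reported on your own machine.
NVCC_CUDA="11.6"    # from: nvcc --version
TORCH_CUDA="11.6"   # from: python -c "import torch; print(torch.version.cuda)"

# Custom CUDA kernels generally need these two to agree.
if [ "$NVCC_CUDA" = "$TORCH_CUDA" ]; then
    echo "CUDA versions match"
else
    echo "Mismatch: building torch CUDA extensions will likely fail"
fi
```

In the original report above, the system toolkit is CUDA 10.2 and the wheel is `+cu102`, so that pairing at least is consistent.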

binmakeswell commented 1 year ago

Hi @zixiliuUSC and @0-1CxH, you can check the environments we have tested at https://github.com/hpcaitech/ColossalAI#Installation. Thanks.

binmakeswell commented 1 year ago

We have updated a lot since then. This issue was closed due to inactivity. Thanks.