Open zmtttt opened 4 days ago
Could you share how you're installing `transformer-engine`?
I used the official method, pip install --upgrade git+https://github.com/NVIDIA/TransformerEngine.git@stable, but it fails to build the wheel for transformer-engine.
Could you post the full build log?
Building CMake extension transformer_engine
Running command /data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/.eggs/cmake-3.30.4-py3.8-linux-x86_64.egg/cmake/data/bin/cmake -S /data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/transformer_engine/common -B /data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/build/cmake -DPython_EXECUTABLE=/opt/conda/bin/python -DPython_INCLUDE_DIR=/opt/conda/include/python3.8 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/build/lib.linux-x86_64-cpython-38 -DCMAKE_CUDA_ARCHITECTURES=70;80;89;90 -Dpybind11_DIR=/opt/conda/lib/python3.8/site-packages/pybind11/share/cmake/pybind11 -GNinja
CMake Error at /data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/.eggs/cmake-3.30.4-py3.8-linux-x86_64.egg/cmake/data/share/cmake-3.30/Modules/Internal/CMakeCUDAFindToolkit.cmake:104 (message):
Failed to find nvcc.
Compiler requires the CUDA toolkit. Please set the CUDAToolkit_ROOT
variable.
Call Stack (most recent call first):
/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/.eggs/cmake-3.30.4-py3.8-linux-x86_64.egg/cmake/data/share/cmake-3.30/Modules/CMakeDetermineCUDACompiler.cmake:85 (cmake_cuda_find_toolkit)
CMakeLists.txt:23 (project)
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
File "/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/build_tools/build_ext.py", line 89, in _build_cmake
subprocess.run(command, cwd=build_dir, check=True)
File "/opt/conda/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/.eggs/cmake-3.30.4-py3.8-linux-x86_64.egg/cmake/data/bin/cmake', '-S', '/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/transformer_engine/common', '-B', '/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/build/cmake', '-DPython_EXECUTABLE=/opt/conda/bin/python', '-DPython_INCLUDE_DIR=/opt/conda/include/python3.8', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/build/lib.linux-x86_64-cpython-38', '-DCMAKE_CUDA_ARCHITECTURES=70;80;89;90', '-Dpybind11_DIR=/opt/conda/lib/python3.8/site-packages/pybind11/share/cmake/pybind11', '-GNinja']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/setup.py", line 174, in <module>
setuptools.setup(
File "/opt/conda/lib/python3.8/site-packages/setuptools/__init__.py", line 87, in setup
return distutils.core.setup(**attrs)
File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/opt/conda/lib/python3.8/site-packages/setuptools/dist.py", line 1208, in run_command
super().run_command(command)
File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/opt/conda/lib/python3.8/site-packages/setuptools/command/install.py", line 68, in run
return orig.install.run(self)
File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/command/install.py", line 698, in run
self.run_command('build')
File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/opt/conda/lib/python3.8/site-packages/setuptools/dist.py", line 1208, in run_command
super().run_command(command)
File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 132, in run
self.run_command(cmd_name)
File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/opt/conda/lib/python3.8/site-packages/setuptools/dist.py", line 1208, in run_command
super().run_command(command)
File "/opt/conda/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/build_tools/build_ext.py", line 119, in run
ext._build_cmake(
File "/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/build_tools/build_ext.py", line 91, in _build_cmake
raise RuntimeError(f"Error when running CMake: {e}")
RuntimeError: Error when running CMake: Command '['/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/.eggs/cmake-3.30.4-py3.8-linux-x86_64.egg/cmake/data/bin/cmake', '-S', '/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/transformer_engine/common', '-B', '/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/build/cmake', '-DPython_EXECUTABLE=/opt/conda/bin/python', '-DPython_INCLUDE_DIR=/opt/conda/include/python3.8', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/data/train_nfs/offload_megatron/megatron_0.8/zmt/Megatron-LM/TransformerEngine/build/lib.linux-x86_64-cpython-38', '-DCMAKE_CUDA_ARCHITECTURES=70;80;89;90', '-Dpybind11_DIR=/opt/conda/lib/python3.8/site-packages/pybind11/share/cmake/pybind11', '-GNinja']' returned non-zero exit status 1.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> transformer-engine
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
thanks!!!
CMake could not find a C++ compiler (it looks in standard locations such as `/usr/bin/c++`). Try setting `CXX` in the environment to the path of your compiler. We usually build with GCC 11.4, but any reasonably modern C++ compiler should work.
For CUDA, try setting `CUDA_PATH` in the environment or adding `nvcc` to your `PATH`. Note that CUDA 12.0 or newer is required.
For cuDNN, try setting `CUDNN_PATH` in your environment. Note that cuDNN 8.1 or newer is required.
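The checks suggested above can be run before retrying the install. The following is a minimal sketch, not part of Transformer Engine: the helper names (`find_cxx`, `find_nvcc`, `parse_cuda_version`, `report`) are hypothetical, the fallback compiler names `c++`, `g++` and `clang++` are an assumption, and the script only inspects `CXX`, `CUDA_PATH`, `CUDNN_PATH` and `PATH`. It does not reproduce the search logic CMake itself uses.

```python
import os
import re
import shutil
import subprocess

MIN_CUDA = (12, 0)  # minimum CUDA version stated in the reply above


def find_cxx(env):
    """Return a C++ compiler path: $CXX if set, else c++/g++/clang++ from PATH."""
    cxx = env.get("CXX")
    if cxx:
        # None if $CXX does not resolve to an executable.
        return shutil.which(cxx)
    for name in ("c++", "g++", "clang++"):  # assumed fallback names
        path = shutil.which(name)
        if path:
            return path
    return None


def find_nvcc(env):
    """Return nvcc from $CUDA_PATH/bin if present, else whatever is on PATH."""
    cuda_path = env.get("CUDA_PATH")
    if cuda_path:
        candidate = os.path.join(cuda_path, "bin", "nvcc")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return shutil.which("nvcc")


def parse_cuda_version(nvcc_output):
    """Extract (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def report(env=None):
    """Print what was found for the compiler, nvcc, and CUDNN_PATH."""
    env = os.environ if env is None else env

    cxx = find_cxx(env)
    print("C++ compiler:", cxx or "not found (set CXX)")

    nvcc = find_nvcc(env)
    if nvcc is None:
        print("nvcc: not found (set CUDA_PATH or add nvcc to PATH)")
    else:
        out = subprocess.run(
            [nvcc, "--version"], capture_output=True, text=True
        ).stdout
        version = parse_cuda_version(out)
        ok = version is not None and version >= MIN_CUDA
        print("nvcc:", nvcc, "version:", version, "OK" if ok else "too old or unknown")

    cudnn = env.get("CUDNN_PATH")
    if cudnn and os.path.isdir(cudnn):
        print("CUDNN_PATH:", cudnn)
    else:
        print("CUDNN_PATH: unset or not a directory")


if __name__ == "__main__":
    report()
```

If the script reports a missing compiler or `nvcc`, export the corresponding variable in the same shell that runs `pip install`, since the build inherits its environment from that shell.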
I get an import error: importlib.metadata.PackageNotFoundError: transformer-engine. Have you run into the same problem? Thanks!
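`importlib.metadata` raises `PackageNotFoundError` when it finds no installed distribution metadata under the requested name. One plausible cause here, though the thread does not confirm it, is that the build failed and the package was never actually installed. A small sketch for checking this from the same Python interpreter (the helper name `installed_version` is hypothetical):

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for dist_name, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("transformer-engine")
    if version is None:
        print("transformer-engine is not installed in this environment")
    else:
        print("transformer-engine", version)
```

If this prints that the package is not installed, the interpreter running the check does not see the package, which points back to the failed build or to a different Python environment than the one `pip` installed into.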