QwenLM / Qwen

The official repo of Qwen (通义千问), the chat & pretrained large language model proposed by Alibaba Cloud.

pip install csrc/layer_norm fails #1208

Closed niykx closed 1 month ago

niykx commented 2 months ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in the FAQ?

Current Behavior

On Windows 11 with CUDA 12.1, Python 3.11, and torch 2.2.2, I can only install flash-attention itself and csrc/rotary; csrc/layer_norm will not install. Switching flash-attention to version 1.0.9 makes no difference: layer_norm still fails. I tried two other machines with the same result. The GPUs are a 2080 Ti 22G, an RTX 3090, and an RTX 4090.

Expected Behavior

The installation should complete successfully.

Steps To Reproduce

  git clone https://github.com/Dao-AILab/flash-attention
  cd flash-attention
  pip install csrc/layer_norm
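
As an aside (based on the flash-attention README, not part of the original report): torch's extension builder honors the MAX_JOBS environment variable to cap parallel compilation, which can help when a build is resource-starved. On Windows the retry would look like:

  set MAX_JOBS=4
  pip install csrc/layer_norm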

Environment

- OS: Windows 11
- Python: 3.11.9
- Transformers: 4.32.0
- PyTorch: 2.2.2
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`): 12.1
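
For completeness, versions like the above can be confirmed with a short check script (a hypothetical helper, not part of the issue template; it only uses the public torch API):

  import torch

  # Print the PyTorch build, the CUDA version it was compiled against,
  # and the compute capability of each visible GPU (e.g. sm_89 for an RTX 4090).
  print("torch:", torch.__version__)
  print("cuda:", torch.version.cuda)
  for i in range(torch.cuda.device_count()):
      major, minor = torch.cuda.get_device_capability(i)
      print(f"GPU {i}: {torch.cuda.get_device_name(i)} (sm_{major}{minor})")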

Anything else?

[23/57] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc --generate-dependencies-with-compile --dependency-output D:\AIGC\flash-attention\csrc\layer_norm\build\temp.win-amd64-cpython-311\Release\ln_fwd_2560.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -ID:\AIGC\flash-attention\csrc\layer_norm -ID:\AIGC\Qwen\venv\Lib\site-packages\torch\include -ID:\AIGC\Qwen\venv\Lib\site-packages\torch\include\torch\csrc\api\include -ID:\AIGC\Qwen\venv\Lib\site-packages\torch\include\TH -ID:\AIGC\Qwen\venv\Lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -ID:\AIGC\Qwen\venv\include -IC:\Users\likex\AppData\Local\Programs\Python\Python311\include -IC:\Users\likex\AppData\Local\Programs\Python\Python311\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\Include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\include" -c D:\AIGC\flash-attention\csrc\layer_norm\ln_fwd_2560.cu -o D:\AIGC\flash-attention\csrc\layer_norm\build\temp.win-amd64-cpython-311\Release\ln_fwd_2560.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -O3 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math -arch=sm_89 -gencode=arch=compute_89,code=sm_89 -gencode=arch=compute_89,code=compute_89 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=dropout_layer_norm -D_GLIBCXX_USE_CXX11_ABI=0

  ln_fwd_2560.cu
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_BFLOAT16_CONVERSIONS__' with '/U__CUDA_NO_BFLOAT16_CONVERSIONS__'
  ln_fwd_2560.cu
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_OPERATORS__' with '/U__CUDA_NO_HALF_OPERATORS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_HALF_CONVERSIONS__' with '/U__CUDA_NO_HALF_CONVERSIONS__'
  cl : Command line warning D9025 : overriding '/D__CUDA_NO_BFLOAT16_CONVERSIONS__' with '/U__CUDA_NO_BFLOAT16_CONVERSIONS__'

  ln_fwd_2560.cu

  tmpxft_00002ab4_00000000-7_ln_fwd_2560.cudafe1.cpp

  ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "D:\AIGC\Qwen\venv\Lib\site-packages\torch\utils\cpp_extension.py", line 2096, in _run_ninja_build
      subprocess.run(
    File "C:\Users\likex\AppData\Local\Programs\Python\Python311\Lib\subprocess.py", line 571, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "D:\AIGC\flash-attention\csrc\layer_norm\setup.py", line 195, in <module>
      setup(
    File "D:\AIGC\Qwen\venv\Lib\site-packages\setuptools\__init__.py", line 104, in setup
      return distutils.core.setup(**attrs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "D:\AIGC\Qwen\venv\Lib\site-packages\setuptools\_distutils\core.py", line 184, in setup
jklj077 commented 2 months ago

First, you don't need flash-attention to run the model. Second, flash-attention does not officially support Windows. If you wish to compile it on Windows, you need a newer version of flash-attention and you have to modify its source code. See https://github.com/Dao-AILab/flash-attention/issues/595 for reference, and please report further issues there.
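
For reference, a minimal sketch of running the model without flash-attention (assuming the public Hugging Face checkpoint Qwen/Qwen-7B-Chat; Qwen's remote modeling code falls back to the plain PyTorch attention path when the flash-attn kernels cannot be imported):

  from transformers import AutoModelForCausalLM, AutoTokenizer

  # Load Qwen with trust_remote_code; without flash-attn installed,
  # the model's code uses the standard attention implementation instead.
  tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
  model = AutoModelForCausalLM.from_pretrained(
      "Qwen/Qwen-7B-Chat",
      device_map="auto",
      trust_remote_code=True,
  ).eval()

  # chat() is a helper defined by Qwen's remote code.
  response, history = model.chat(tokenizer, "Hello", history=None)
  print(response)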

github-actions[bot] commented 1 month ago

This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.