Dao-AILab / flash-attention

Fast and memory-efficient exact attention
BSD 3-Clause "New" or "Revised" License

I can't execute "pip setup.py install" on Windows 2022, VS 2019, PyTorch 2.1.0.dev20230721+cu121. Error message is "python setup.py bdist_wheel did not run successfully...". #435

Open. justwangweimin opened this issue 1 year ago
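
Editor's note: the command in the title, "pip setup.py install", mixes two different tools and is not itself a valid command. For reference, the standard source-install invocations, run from the repository root, are either of the following (a general setuptools/pip note, not specific to this thread):

  python setup.py install
  pip install .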

justwangweimin commented 1 year ago

Requirement already satisfied: torch in c:\programdata\anaconda3\lib\site-packages (from flash-attn==2.0.4) (2.1.0.dev20230721+cu121)
Requirement already satisfied: einops in c:\programdata\anaconda3\lib\site-packages (from flash-attn==2.0.4) (0.6.1)
Requirement already satisfied: packaging in c:\programdata\anaconda3\lib\site-packages (from flash-attn==2.0.4) (23.0)
Requirement already satisfied: ninja in c:\programdata\anaconda3\lib\site-packages (from flash-attn==2.0.4) (1.11.1)
Requirement already satisfied: filelock in c:\programdata\anaconda3\lib\site-packages (from torch->flash-attn==2.0.4) (3.12.2)
Requirement already satisfied: typing-extensions in c:\programdata\anaconda3\lib\site-packages (from torch->flash-attn==2.0.4) (4.7.1)
Requirement already satisfied: sympy in c:\programdata\anaconda3\lib\site-packages (from torch->flash-attn==2.0.4) (1.11.1)
Requirement already satisfied: networkx in c:\programdata\anaconda3\lib\site-packages (from torch->flash-attn==2.0.4) (3.1)
Requirement already satisfied: jinja2 in c:\programdata\anaconda3\lib\site-packages (from torch->flash-attn==2.0.4) (3.1.2)
Requirement already satisfied: fsspec in c:\programdata\anaconda3\lib\site-packages (from torch->flash-attn==2.0.4) (2023.3.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\programdata\anaconda3\lib\site-packages (from jinja2->torch->flash-attn==2.0.4) (2.1.1)
Requirement already satisfied: mpmath>=0.19 in c:\programdata\anaconda3\lib\site-packages (from sympy->torch->flash-attn==2.0.4) (1.3.0)
Building wheels for collected packages: flash-attn
  Building wheel for flash-attn (setup.py) ... error
  error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [68 lines of output]

  torch.__version__  = 2.1.0.dev20230721+cu121

  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-cpython-311
  creating build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\bert_padding.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attn_interface.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attn_triton.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_attn_triton_og.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_blocksparse_attention.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\flash_blocksparse_attn_interface.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\fused_softmax.py -> build\lib.win-amd64-cpython-311\flash_attn
  copying flash_attn\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn
  creating build\lib.win-amd64-cpython-311\flash_attn\layers
  copying flash_attn\layers\patch_embed.py -> build\lib.win-amd64-cpython-311\flash_attn\layers
  copying flash_attn\layers\rotary.py -> build\lib.win-amd64-cpython-311\flash_attn\layers
  copying flash_attn\layers\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\layers
  creating build\lib.win-amd64-cpython-311\flash_attn\losses
  copying flash_attn\losses\cross_entropy.py -> build\lib.win-amd64-cpython-311\flash_attn\losses
  copying flash_attn\losses\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\losses
  creating build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\bert.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\falcon.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\gpt.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\gptj.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\gpt_neox.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\llama.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\opt.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\vit.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  copying flash_attn\models\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\models
  creating build\lib.win-amd64-cpython-311\flash_attn\modules
  copying flash_attn\modules\block.py -> build\lib.win-amd64-cpython-311\flash_attn\modules
  copying flash_attn\modules\embedding.py -> build\lib.win-amd64-cpython-311\flash_attn\modules
  copying flash_attn\modules\mha.py -> build\lib.win-amd64-cpython-311\flash_attn\modules
  copying flash_attn\modules\mlp.py -> build\lib.win-amd64-cpython-311\flash_attn\modules
  copying flash_attn\modules\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\modules
  creating build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\activations.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\fused_dense.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\layer_norm.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\rms_norm.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  copying flash_attn\ops\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\ops
  creating build\lib.win-amd64-cpython-311\flash_attn\utils
  copying flash_attn\utils\benchmark.py -> build\lib.win-amd64-cpython-311\flash_attn\utils
  copying flash_attn\utils\distributed.py -> build\lib.win-amd64-cpython-311\flash_attn\utils
  copying flash_attn\utils\generation.py -> build\lib.win-amd64-cpython-311\flash_attn\utils
  copying flash_attn\utils\pretrained.py -> build\lib.win-amd64-cpython-311\flash_attn\utils
  copying flash_attn\utils\__init__.py -> build\lib.win-amd64-cpython-311\flash_attn\utils
  running build_ext
  building 'flash_attn_2_cuda' extension
  creating D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311
  creating D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release
  creating D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc
  creating D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc\flash_attn
  creating D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc\flash_attn\src
  Emitting ninja build file D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\build.ninja...
  Compiling objects...
  Using envvar MAX_JOBS (8) as the number of workers...
  1.11.1.git.kitware.jobserver-1
  "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\link.exe" /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\ProgramData\anaconda3\Lib\site-packages\torch\lib "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\lib\x64" /LIBPATH:C:\ProgramData\anaconda3\libs /LIBPATH:C:\ProgramData\anaconda3 /LIBPATH:C:\ProgramData\anaconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib cudart.lib c10_cuda.lib torch_cuda.lib /EXPORT:PyInit_flash_attn_2_cuda D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/flash_api.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim160_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim192_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim192_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim224_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim224_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim256_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim256_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim32_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim32_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim64_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim64_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim96_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim96_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim128_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim128_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim160_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim160_fp16_sm80.obj 
D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim192_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim192_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim224_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim224_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim256_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim256_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim32_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim32_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim64_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim64_fp16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim96_bf16_sm80.obj D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim96_fp16_sm80.obj /OUT:build\lib.win-amd64-cpython-311\flash_attn_2_cuda.cp311-win_amd64.pyd /IMPLIB:D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn\flash_attn_2_cuda.cp311-win_amd64.lib
  LINK : fatal error LNK1181: cannot open input file 'D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc\flash_attn\flash_api.obj'
  error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.29.30133\\bin\\HostX86\\x64\\link.exe' failed with exit code 1181
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for flash-attn
Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects
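
Editor's note: the log above shows the build already honors the MAX_JOBS environment variable ("Using envvar MAX_JOBS (8)"). For reference, the project README documents a reduced-parallelism install; a sketch of the Windows cmd equivalent is below. This limits compiler memory and CPU pressure but is not guaranteed to address the LNK1181 error above:

  set MAX_JOBS=4
  pip install flash-attn --no-build-isolation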

justwangweimin commented 1 year ago

I found that flash_api.cpp exists in csrc/flash_attn, but there is no flash_api.obj, so link.exe cannot run. How do I generate the object files such as flash_api.obj, flash_bwd_hdim128_bf16_sm80.obj, etc.? Why can't these source files be compiled to .obj files? I have installed CMake, VS 2019, and setuptools. The missing files include:

  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/flash_api.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim160_fp16_sm80.obj
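
Editor's note: since the missing .obj files mean the compile step never produced any output, a useful first check is whether MSVC's cl.exe and CUDA's nvcc can produce an object file at all. The commands below are a hypothetical diagnostic (not from the thread); the filename smoke.cu is arbitrary. Run them from an "x64 Native Tools Command Prompt for VS 2019", since nvcc on Windows requires cl.exe on PATH as its host compiler:

  :: confirm both compilers are visible from the same shell
  where cl
  nvcc --version
  :: compile a trivial empty kernel to an object file
  echo __global__ void k(){} > smoke.cu
  nvcc -c smoke.cu -o smoke.obj && echo nvcc produced an .obj file

If this fails, the toolchain rather than flash-attention is the problem. If it succeeds, the real compiler error is likely being swallowed by the parallel ninja build, and rerunning one of the failing compile commands from the generated build.ninja by hand should surface it.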

justwangweimin commented 1 year ago

Here is the link command from the build log (reflowed, one input per line, for readability):

  "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\link.exe" /EXPORT:PyInit_flash_attn_2_cuda
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/flash_api.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim128_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim128_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim160_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim160_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim192_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim192_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim224_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim224_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim256_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim256_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim32_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim32_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim64_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim64_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim96_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_bwd_hdim96_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim128_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim128_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim160_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim160_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim192_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim192_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim224_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim224_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim256_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim256_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim32_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim32_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim64_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim64_fp16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim96_bf16_sm80.obj
  D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn/src/flash_fwd_hdim96_fp16_sm80.obj
  /OUT:build\lib.win-amd64-cpython-311\flash_attn_2_cuda.cp311-win_amd64.pyd
  /IMPLIB:D:\chatglm2-6b\flash-attention\build\temp.win-amd64-cpython-311\Release\csrc/flash_attn\flash_attn_2_cuda.cp311-win_amd64.lib

justwangweimin commented 1 year ago

@tridao

ssswill commented 1 year ago

I also have this question.

tridao commented 1 year ago

I don't have experience on Windows. Cutlass 3.2 is supposed to work on Windows, but maybe we need to do more work on the FlashAttention side to enable Windows support. I don't have the bandwidth to investigate this right now; let me know if you figure something out.

JimWang151 commented 2 months ago

@tridao

Brother, how did you solve this problem in the end?

JimWang151 commented 2 months ago

Quoting the earlier comment: "I found that flash_api.cpp exists in csrc/flash_attn, but there is no flash_api.obj, so link.exe cannot run."

My problem is exactly the same as yours.

ssswill commented 2 months ago

I can't reply to you directly, so briefly: last year flash_attn didn't support Windows and could only be installed on Linux. I don't know what the current situation is.
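
Editor's note: for reference, the install command documented in the project README (the path that works on Linux) is:

  pip install flash-attn --no-build-isolation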

shirubei commented 1 month ago

Same problem.