[release/2.4] [ROCM] Properly disable Flash Attention/Efficient Attention with environment variables

ROCm / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

http://pytorch.org

Closed: xinyazhang closed this 3 weeks ago

xinyazhang commented 1 month ago

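With this change, a build invoked as USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py now compiles correctly on ROCm with both Flash Attention and Efficient Attention disabled.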
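
For anyone scripting this build, the invocation above can be sketched in Python roughly as follows. This is only an illustration: the helper names `build_env`, `build_command`, and `run_build` are hypothetical and not part of PyTorch, and the `setup.py` subcommand is left to the caller because the comment above does not specify one.

```python
import os
import subprocess
import sys


def build_env(base_env=None):
    """Return a copy of the environment with both attention backends disabled.

    Hypothetical helper: it only sets the two variables named in this PR.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["USE_FLASH_ATTENTION"] = "0"
    env["USE_MEM_EFF_ATTENTION"] = "0"
    return env


def build_command(setup_args):
    """Build the argv list for setup.py; setup_args is supplied by the caller."""
    return [sys.executable, "setup.py", *setup_args]


def run_build(setup_args, cwd="."):
    """Run the build from a PyTorch checkout (assumed to be located at cwd)."""
    subprocess.run(build_command(setup_args), env=build_env(), cwd=cwd, check=True)
```

Calling `run_build` with whatever `setup.py` arguments you normally use would start the build with both variables set to 0; nothing runs until that function is called.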