ROCm / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
http://pytorch.org

[ROCM] Properly disable Flash Attention/Efficient Attention with environment variables #1541

Closed. xinyazhang closed this 1 month ago.

xinyazhang commented 1 month ago

With this change, `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py` compiles correctly.

This is a backport of https://github.com/pytorch/pytorch/pull/133866.

Tested with `USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0 python setup.py develop --user` followed by `python -c 'import torch'`.
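
For a slightly stronger runtime sanity check than a bare import, something along these lines could be used (a sketch, not part of this PR; it assumes a PyTorch recent enough to ship `torch.nn.attention.sdpa_kernel`):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# On a build compiled with USE_FLASH_ATTENTION=0 USE_MEM_EFF_ATTENTION=0,
# scaled_dot_product_attention should still run via the math fallback.
device = "cuda" if torch.cuda.is_available() else "cpu"
q = torch.randn(2, 8, 128, 64, device=device)
k, v = torch.randn_like(q), torch.randn_like(q)

# Restrict SDPA to the always-available math kernel.
with sdpa_kernel([SDPBackend.MATH]):
    out = F.scaled_dot_product_attention(q, k, v)

print(out.shape)  # torch.Size([2, 8, 128, 64])
```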

pruthvistony commented 1 month ago

PR https://github.com/ROCm/pytorch/pull/1536 has been merged, and MEM_EFF_ATTENTION is now always turned off. When will it be enabled?
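
For reference, the runtime state can be inspected with the standard `torch.backends.cuda` queries (a sketch, not from this PR; note these report the user-facing toggles, not whether the kernels were compiled in, so forcing the efficient-attention backend is a more direct probe of the build):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# User-facing SDPA toggles (do not reflect build-time USE_* flags).
print("flash sdp enabled:  ", torch.backends.cuda.flash_sdp_enabled())
print("mem-eff sdp enabled:", torch.backends.cuda.mem_efficient_sdp_enabled())

q = torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# On a build with USE_MEM_EFF_ATTENTION=0, restricting SDPA to the
# efficient-attention backend should raise rather than silently fall back.
try:
    with sdpa_kernel([SDPBackend.EFFICIENT_ATTENTION]):
        F.scaled_dot_product_attention(q, k, v)
    print("efficient attention ran")
except RuntimeError as e:
    print("efficient attention unavailable:", e)
```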

pruthvistony commented 1 month ago

Not required.