ROCm / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
http://pytorch.org

Add support for memory efficient attention for AMD/ROCm #1390

Open Looong01 opened 2 months ago

Looong01 commented 2 months ago

🚀 The feature, motivation and pitch

Enable support for the Flash Attention, memory-efficient attention, and SDPA kernels for AMD GPUs.

At present, using these produces the warning below with the latest nightlies (torch==2.4.0.dev20240413+rocm6.0, pytorch-triton-rocm 3.0.0+0a22a91d04):

/site-packages/diffusers/models/attention_processor.py:1117: UserWarning: 1Torch was not compiled with memory efficient attention. (Triggered internally at ../aten/src/ATen/native/transformers/hip/sdp_utils.cpp:505.)
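For context, a minimal sketch of how the memory-efficient SDPA backend is requested through the native API (the tensor shapes, dtype, and explicit backend selection here are illustrative assumptions, not taken from the report); on a ROCm build that lacks the kernels, the call falls back to the unfused math path and emits a warning like the one above:

```python
import torch
import torch.nn.functional as F

# Arbitrary (batch, heads, seq_len, head_dim) shapes for illustration.
q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Ask SDPA to use the memory-efficient backend (math kept as a fallback).
# On a ROCm build without the kernels this triggers the UserWarning above
# and silently runs the unfused math implementation instead.
with torch.backends.cuda.sdp_kernel(
    enable_flash=False, enable_math=True, enable_mem_efficient=True
):
    out = F.scaled_dot_product_attention(q, k, v)
```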

Alternatives

Without these kernels, users cannot use the native PyTorch APIs with memory-efficient attention.
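The runtime warning is currently the only reliable signal; a quick probe of the running build (a sketch, assuming a ROCm wheel) shows that the backend toggles report "enabled" even when the kernels were never compiled in:

```python
import torch

print(torch.version.hip)                                # ROCm version string; None on CUDA builds
print(torch.backends.cuda.flash_sdp_enabled())          # session toggle only, True by default
print(torch.backends.cuda.mem_efficient_sdp_enabled())  # likewise, does not prove the kernels exist
```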

Additional context

No response

Epliz commented 2 months ago

Hi,

Not sure what the status is, but it looks like AMD has been working on it: https://github.com/pytorch/pytorch/pull/114309