ROCm / flash-attention

Fast and memory-efficient exact attention
BSD 3-Clause "New" or "Revised" License

[Issue]: memory format option is only supported by strided tensors #64

Closed hyattpd closed 1 week ago

hyattpd commented 5 months ago

Problem Description

The benchmarking script exits with an error:

RuntimeError: memory format option is only supported by strided tensors

This is with stable PyTorch 2.3 and ROCm 6.0.

The issue is also described in https://github.com/Dao-AILab/flash-attention/issues/782, which hints that it might be a problem in PyTorch 2.3+, but no one seems to have posted a solution.
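For context, this error is a generic PyTorch check, not something specific to the benchmark: PyTorch only honors a `memory_format=` argument for strided (dense) tensors. The sketch below is a hypothetical minimal trigger under that assumption, not the benchmark's actual call site, followed by the usual workaround of converting to a strided tensor first.

```python
import torch

# Assumption: the same PyTorch check behind the benchmark's failure.
# Passing memory_format= for a non-strided (here, sparse) tensor raises:
#   RuntimeError: memory format option is only supported by strided tensors
dense = torch.randn(4, 4)
sparse = dense.to_sparse()

try:
    torch.empty_like(sparse, memory_format=torch.contiguous_format)
except RuntimeError as err:
    print(err)

# Common workaround: make the tensor strided first (e.g. .to_dense() or
# .contiguous()) before requesting a memory format.
strided = sparse.to_dense()
out = torch.empty_like(strided, memory_format=torch.contiguous_format)
print(out.shape)  # torch.Size([4, 4])
```

This only illustrates how the error can arise; where exactly the benchmark script hits it would need a traceback to confirm.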

Operating System

SLES 15-SP4

CPU

AMD EPYC 7763 64-Core Processor

GPU

AMD Instinct MI210

ROCm Version

ROCm 6.0.0

ROCm Component

No response

Steps to Reproduce

Run the benchmark_flash_attention.py script.

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

ppanchad-amd commented 3 weeks ago

Hi @hyattpd. An internal ticket has been created to investigate your issue. Thanks!

jamesxu2 commented 2 weeks ago

Hi @hyattpd - I attempted to reproduce this with the pytorch:latest Docker image (PyTorch 2.3.0) and ROCm 6.2.3, but I don't see that runtime error; the benchmarking script flash-attention/benchmarks/benchmark_flash_attention.py exits cleanly. Also, someone in the linked issue apparently ran into the same error on NVIDIA A100s, which suggests this is probably not a ROCm issue.

If you don't want to upgrade ROCm locally, you can retry this in our pytorch:latest docker image and see if you still encounter this issue. Let us know the results!
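For reference, a typical way to launch a ROCm PyTorch container follows the pattern below. The image name and device flags are assumptions based on the usual ROCm install instructions; adjust them for your cluster's setup.

```shell
# Assumed image name and flags; adapt to your environment.
docker pull rocm/pytorch:latest
docker run -it \
    --device=/dev/kfd --device=/dev/dri \
    --group-add video \
    --ipc=host \
    rocm/pytorch:latest

# Inside the container, rerun the failing script:
python benchmarks/benchmark_flash_attention.py
```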

jamesxu2 commented 1 week ago

Closing due to inactivity. Please feel free to reopen this issue or submit a new one if you need more help!