Closed by TheJKM 11 months ago
Thank you for your interest in our work. Due to the dependency on Flash Attention 2, the current version does not support the RDNA3 and RDNA2 architectures.
However, we are developing a version that does not rely on Flash Attention 2. If this alternative version interests you, please reach out to me via email.
According to https://github.com/ROCmSoftwarePlatform/flash-attention/blob/flash_attention_for_rocm/setup.py#L215, the supported architectures are CDNA2 (MI200) and CDNA3 (MI300).
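For anyone wanting to check their own hardware against that list, here is a minimal sketch (not from the repo; the gfx-to-product mapping and function name are my own assumptions) that matches a GPU's LLVM target string, as reported by tools like `rocminfo`, against the CDNA targets:

```python
# Illustrative only: map LLVM gfx targets to the CDNA generations the
# ROCm flash-attention fork supports. gfx90a is the MI200 series (CDNA2);
# gfx940/gfx941/gfx942 are MI300 variants (CDNA3).
SUPPORTED_ARCHS = {"gfx90a", "gfx940", "gfx941", "gfx942"}

def is_supported(gfx_arch: str) -> bool:
    # Strip feature suffixes such as "gfx90a:sramecc+:xnack-"
    return gfx_arch.split(":")[0] in SUPPORTED_ARCHS

print(is_supported("gfx90a"))   # MI210 (CDNA2) -> True
print(is_supported("gfx1100"))  # RDNA3 (e.g. RX 7900 XTX) -> False
```

RDNA2/RDNA3 targets (gfx103x, gfx11xx) fall outside this set, which matches the limitation described above.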
Thank you for your quick answer! My primary interest is in benchmarking, so I'm happy to wait until the other version is available.
Hi, awesome work! I have a question about supported GPU architectures, and I couldn't find anything about it in the repo. All your tests seem to have been done on the MI210, which is a CDNA2 card. Does your vLLM ROCm port also work on other architectures, like RDNA3 and RDNA2, which are now supported by ROCm 5.7?