youngwanLEE opened this issue 2 months ago
Thanks. Regarding your question about applying the proposed method to FLUX, it sounds like a challenging yet exciting opportunity. The structural differences introduced by MMDiT could certainly add complexity, but they might also open new avenues for optimization.
We are currently exploring how to improve the DiTFastAttn method, and I believe it will achieve significant acceleration on FLUX and SD3. It will be released with our next paper.
Hello @hahnyuan @youngwanLEE
I am Jiarui Fang from the xDiT project (https://github.com/xdit-project/xDiT), and I have been following the DiTFastAttn work since it was first released. I found that its motivation, redundancy in diffusion models, is similar to that of PipeFusion. While PipeFusion leverages this redundancy to address parallel performance issues, DiTFastAttn uses it to reduce computation on a single GPU.
We have also implemented DiTFastAttn in xDiT and cited your work, and we are exploring use cases for DiTFastAttn in FLUX and SD3. I wonder whether there might be an opportunity for collaboration.
Hello @feifeibear, I've recently been looking into what FLUX and SD3 can do, and I see some potential for enhancements in the cross-attention between text and image tokens in MMDiT. I'm looking forward to the opportunity to collaborate with you on this!
Hi, I'm impressed by your work.
I wonder whether the proposed method can be applied to FLUX, a more recent SoTA T2I model, which has more complex transformer blocks.