thu-nics / DiTFastAttn

MIT License

generality to FLUX model #8

Open youngwanLEE opened 2 months ago

youngwanLEE commented 2 months ago

Hi, I'm impressed by your work.

I wonder whether the proposed method can be applied to FLUX, the more recent SoTA T2I model, which has more complex transformer blocks.

hahnyuan commented 2 months ago

Thanks. Regarding your question about applying the proposed method to FLUX: it sounds like a challenging yet exciting opportunity. The structural differences introduced by the MMDiT blocks could certainly add complexity, but they might also open new avenues for optimization.

We are currently exploring ways to improve DiTFastAttn, and I believe it will achieve significant acceleration on FLUX and SD3. It will be released with our next paper.
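
For context, one of DiTFastAttn's published compression techniques is attention sharing across timesteps: when a calibration pass finds that the attention output at neighboring denoising steps is nearly identical, the cached output from the previous step is reused instead of recomputing attention. The sketch below is a minimal PyTorch illustration of that idea only; the class, argument names, and the `share_this_step` flag are hypothetical and not the repository's actual API.

```python
import torch
import torch.nn.functional as F


class TimestepSharedAttention(torch.nn.Module):
    """Illustrative sketch of attention sharing across timesteps:
    at steps marked by a precomputed plan, skip attention entirely
    and reuse the cached output from the last computed step."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.qkv = torch.nn.Linear(dim, dim * 3)
        self.proj = torch.nn.Linear(dim, dim)
        self.cached_output = None  # attention output from the last computed step

    def forward(self, x: torch.Tensor, share_this_step: bool) -> torch.Tensor:
        # share_this_step would come from an offline calibration plan (hypothetical
        # name); it marks steps whose attention output is redundant with the previous one.
        if share_this_step and self.cached_output is not None:
            return self.proj(self.cached_output)

        b, n, c = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape to (batch, heads, tokens, head_dim) for scaled_dot_product_attention
        q, k, v = (t.view(b, n, self.num_heads, -1).transpose(1, 2) for t in (q, k, v))
        out = F.scaled_dot_product_attention(q, k, v)
        out = out.transpose(1, 2).reshape(b, n, c)

        self.cached_output = out.detach()
        return self.proj(out)
```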

feifeibear commented 1 month ago

Hello @hahnyuan @youngwanLEE

I am Jiarui Fang from the xDiT project (https://github.com/xdit-project/xDiT), and I took notice of the DiTFastAttn work when it was first released. I found that its motivation, redundancy in diffusion models, is similar to that of PipeFusion. While PipeFusion leverages this redundancy to address parallel performance issues, DiTFastAttn uses it to reduce computation on a single GPU.

We have also implemented DiTFastAttn in xDiT and cited your work, and we are exploring its use cases in FLUX and SD3 as well. I wonder whether there is an opportunity for collaboration.

hahnyuan commented 1 month ago

Hello @feifeibear, I've recently been looking into what FLUX and SD3 can do, and I see some potential for enhancements in the cross-attention between text and images in MMDiT. I'm looking forward to the opportunity to collaborate with you on this!
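
For reference, MMDiT-style blocks in SD3 and FLUX project text and image tokens separately, concatenate them, and run a single joint self-attention over the combined sequence, so the text-image interaction mentioned above lives inside that joint attention rather than in a standalone cross-attention layer. The following is a rough, self-contained sketch of that structure under those assumptions; the function and tensor names are illustrative and not taken from either codebase.

```python
import torch
import torch.nn.functional as F


def joint_mmdit_attention(
    img_tokens: torch.Tensor,   # (batch, n_img, dim)
    txt_tokens: torch.Tensor,   # (batch, n_txt, dim)
    num_heads: int,
    qkv_img: torch.nn.Linear,   # per-modality QKV projections (dim -> 3*dim)
    qkv_txt: torch.nn.Linear,
):
    """Sketch of MMDiT joint attention: text and image tokens are projected
    separately, concatenated, and attended over as one sequence. The
    text<->image portion of this attention map is the candidate for the
    kind of compression discussed above."""
    b, n_img, dim = img_tokens.shape
    n_txt = txt_tokens.shape[1]
    head_dim = dim // num_heads

    def split_heads(t):
        # (batch, tokens, dim) -> (batch, heads, tokens, head_dim)
        return t.view(b, -1, num_heads, head_dim).transpose(1, 2)

    q_i, k_i, v_i = (split_heads(t) for t in qkv_img(img_tokens).chunk(3, dim=-1))
    q_t, k_t, v_t = (split_heads(t) for t in qkv_txt(txt_tokens).chunk(3, dim=-1))

    # one joint sequence: [text tokens | image tokens]
    q = torch.cat([q_t, q_i], dim=2)
    k = torch.cat([k_t, k_i], dim=2)
    v = torch.cat([v_t, v_i], dim=2)

    out = F.scaled_dot_product_attention(q, k, v)            # (b, heads, n_txt+n_img, head_dim)
    out = out.transpose(1, 2).reshape(b, n_txt + n_img, dim)
    return out[:, :n_txt], out[:, n_txt:]                    # split back into text / image streams
```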