shiml20 / FlowTurbo

Official PyTorch Implementation of "FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner"
https://flowturbo.ivg-research.xyz/
MIT License

Regarding reproduction for given quantitative results on ImageNet 256x256 #4

Closed dlsrbgg33 closed 3 weeks ago

dlsrbgg33 commented 3 weeks ago

Hi, thank you for your great work and sharing the code and checkpoint.

I have two questions regarding the quantitative scores presented in the paper:

  1. I attempted to reproduce the scores for H1P5R3 and H8P9R5 from Table 2. With the provided checkpoint I was able to match the FID for H1P5R3, but for H8P9R5 my evaluation gave 2.33 versus the 2.12 reported in the paper. To investigate further, I switched from fp16 to fp32 and obtained an FID of 2.27, which still differs from the reported score. I also trained the model myself for 30k iterations but ended up with an FID of 2.29. Could you provide any insight into why my FID for H8P9R5 is higher? Could the 2.12 FID actually correspond to H7P10R4, as indicated in Table 4?

  2. Additionally, could you offer some insight into why FlowTurbo outperforms SiT? My understanding is that the velocity refiner predicts the offset between the SiT output and the true velocity, which leads me to believe that SiT should theoretically be an upper bound on FlowTurbo's performance.

I hope you can share some insights on the above questions. Thank you in advance.

shiml20 commented 3 weeks ago


Thank you very much for your interest in our work. Regarding the first question: The differences between your experimental results and ours can be attributed to the following reasons:

Regarding the second question: Our insights suggest that flow-based models like SiT may exhibit information redundancy, particularly in model weights. This observation motivated us to design a lightweight refiner to regress the offset. Consequently, the trajectories generated by the model may differ when using the refiner compared to the original model. Moreover, the original SiT model requires significantly more time to achieve similar generative quality. Our approach reduces this computational overhead, highlighting the information redundancy inherent in flow-based models.
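The refiner idea described above can be sketched as follows: at each sampling step, a lightweight network adds a learned correction to the heavy base model's predicted velocity before the ODE step. This is a minimal illustrative sketch, not the repository's actual API; `base_velocity` and `refiner` are pure-Python stand-ins for the two networks, and all names are hypothetical.

```python
import numpy as np

def base_velocity(x, t):
    # Stand-in for the heavy flow model's velocity prediction v_theta(x, t).
    return -x * (1.0 - t)

def refiner(x, t):
    # Stand-in for the lightweight refiner, which regresses the offset
    # between the base model's prediction and the true velocity.
    return 0.1 * x

def refined_euler_step(x, t, dt):
    # Refined velocity = base prediction + learned offset; then one Euler step.
    v = base_velocity(x, t) + refiner(x, t)
    return x + dt * v

x = np.ones(4)
x_next = refined_euler_step(x, t=0.0, dt=0.1)
```

Because the refiner is much cheaper to evaluate than the base model, replacing some base-model calls with refiner-corrected steps is what reduces the sampling cost without a matching loss in quality.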