Algolzw / daclip-uir

[ICLR 2024] Controlling Vision-Language Models for Universal Image Restoration. 5th place in the NTIRE 2024 Restore Any Image Model in the Wild Challenge.
https://algolzw.github.io/daclip-uir
MIT License

Efficiency of DACLIP-IR #68

Open Luciennnnnnn opened 1 month ago

Luciennnnnnn commented 1 month ago

Hi, interesting work! I have a few questions:

  1. What is the parameter count of the U-Net used in your model?
  2. Is this a pixel domain model or a latent domain model?
  3. Compared to methods based on Stable Diffusion, is your method more efficient (in terms of training and inference)?

Any help would be appreciated!

Algolzw commented 1 month ago

Hi, thank you!

  1. Our U-Net has 48.9M parameters; you can find the details in Table 8 of our paper.
  2. Our method operates in the pixel domain, but you could certainly implement a latent version.
  3. It's hard to make a fair efficiency comparison with Stable Diffusion-based methods, since they can use various sampling acceleration techniques.
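
For anyone who wants to verify the parameter count on their own checkpoint, here is a minimal PyTorch sketch. The `count_params` helper and the stand-in `dummy` module are illustrative assumptions, not part of this repo; substitute however you instantiate the U-Net from the repo's configs.

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    """Return the number of trainable parameters in a model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Stand-in module for illustration; replace with the repo's U-Net instance.
dummy = nn.Conv2d(3, 64, kernel_size=3)
print(f"{count_params(dummy) / 1e6:.2f}M trainable parameters")
```

Running this on the actual U-Net should report roughly the 48.9M figure cited above.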