VITA-Group / AGD

[ICML2020] "AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks" by Yonggan Fu, Wuyang Chen, Haotao Wang, Haoran Li, Yingyan Lin, Zhangyang Wang
MIT License

A question about the network search. #15

Closed cszy98 closed 3 years ago

cszy98 commented 3 years ago

Hi, great work! The pre-trained ESRGAN_x4 is trained in a supervised way (the dataset has paired HR and LR images), so why use the pre-trained ESRGAN model as the teacher instead of directly using the ground truth as supervision during the network search process?

tilmto commented 3 years ago

Hi, it's just like knowledge distillation in classification: even though you have the one-hot labels, you still have the model learn from the teacher. We find that learning from a well-pretrained teacher to mimic its output distribution achieves better results than directly using the ground truth.
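
For reference, here is a minimal sketch of the distillation idea described in this reply, not the repo's actual training code. It assumes `teacher` and `student` are any image-to-image `nn.Module` pair (e.g. a pre-trained ESRGAN_x4 and a searched compact generator), and the function name, weights `alpha`/`beta`, and L1 choice are illustrative placeholders:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student, teacher, lr, hr=None, alpha=1.0, beta=1.0):
    """Student mimics the teacher's output; ground-truth supervision
    is optional and blended in via `beta` when HR targets are given."""
    with torch.no_grad():        # teacher is frozen during search/training
        t_out = teacher(lr)
    s_out = student(lr)
    loss = alpha * F.l1_loss(s_out, t_out)          # learn from the teacher
    if hr is not None:
        loss = loss + beta * F.l1_loss(s_out, hr)   # optional GT term
    return loss
```

The point of the teacher term is that the teacher's outputs provide a richer, smoother training signal than the HR targets alone, which is the same intuition as soft labels in classification distillation.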

cszy98 commented 3 years ago

Thank you for your reply.