yuanzhi-zhu closed 5 months ago
Thanks for your reply, I missed it. Where do you set the fake score model to train mode, by the way? I could not find that either 👀
The fake score model is a submodule of the guidance model, so I assume setting the guidance model to train mode also sets the fake score model to train mode.
I see, so the teacher model is also set to train mode during training. Thanks a lot!
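A minimal sketch of that propagation, in case it helps. The `Guidance` class and its `nn.Linear` layers are hypothetical stand-ins, not the DMD2 code; only the attribute names `real_unet` and `fake_unet` mirror the discussion. It relies on `nn.Module.train()` setting the mode recursively on all child modules:

```python
import torch.nn as nn


class Guidance(nn.Module):
    """Hypothetical stand-in for a guidance model holding two sub-networks."""

    def __init__(self):
        super().__init__()
        self.real_unet = nn.Linear(4, 4)  # stand-in for the teacher
        self.fake_unet = nn.Linear(4, 4)  # stand-in for the fake score model


guidance = Guidance()
guidance.eval()   # start from eval mode to make the switch visible
guidance.train()  # train() recurses into every registered submodule

# Both submodules now report training mode.
print(guidance.fake_unet.training, guidance.real_unet.training)
```

Note that this also flips the teacher stand-in to train mode, which is the point raised in the next reply.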
@tianweiy Although the pre-trained teacher model is in train() mode, it has p=0 for all dropout layers and no BatchNorm layers, so setting self.real_unet.requires_grad_(False) is enough :p
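A small sketch of why that holds, using a toy network rather than the actual teacher UNet (the `teacher` variable and layer sizes are assumptions for illustration). With dropout at p=0 and no BatchNorm, the forward pass should be the same in train and eval mode, and `requires_grad_(False)` keeps the parameters out of the gradient computation:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a teacher: dropout with p=0 and no BatchNorm layers.
teacher = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.0), nn.Linear(8, 8))
teacher.requires_grad_(False)  # freeze all parameters in place

x = torch.randn(2, 8)

teacher.train()
out_train = teacher(x)

teacher.eval()
out_eval = teacher(x)

# With p=0 dropout the mode does not change the output.
same_output = torch.allclose(out_train, out_eval)
print(same_output, out_train.requires_grad)
```

If a model did contain BatchNorm or dropout with p>0, the two modes would generally differ, so this shortcut applies only under the conditions stated above.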
@tianweiy So all the student models are trained with dropout=0?
Yes, all student models are trained without dropout. This is explicitly specified in the ImageNet case: https://github.com/tianweiy/DMD2/blob/604040ab7b5ed4bd7d191e9476b908474ee7b24b/main/edm/edm_network.py#L11
Therefore my impression is that train or eval mode doesn't matter too much.
Thanks 😃. I think dropout=0 also holds for the SD cases in your experiments. I checked the loaded SD model, and its dropout layers have p=0 after loading.
Thanks for the great work. Do you call model.train() on the model anywhere? The unet loaded from diffusers is set to eval mode by default.
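For anyone who wants to repeat that check, here is a rough sketch of one way to inspect a loaded model. The helper names `dropout_rates` and `has_batchnorm` are hypothetical, and the `toy` network merely stands in for the loaded UNet:

```python
import torch.nn as nn


def dropout_rates(model: nn.Module) -> dict:
    """Map each nn.Dropout submodule name to its drop probability p."""
    return {
        name: module.p
        for name, module in model.named_modules()
        if isinstance(module, nn.Dropout)
    }


def has_batchnorm(model: nn.Module) -> bool:
    """Return True if the model contains any standard BatchNorm layer."""
    bn_types = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)
    return any(isinstance(m, bn_types) for m in model.modules())


# Toy stand-in for a loaded UNet.
toy = nn.Sequential(
    nn.Linear(4, 4), nn.Dropout(p=0.0), nn.Linear(4, 4), nn.Dropout(p=0.0)
)

rates = dropout_rates(toy)
print(rates, has_batchnorm(toy))
```

If every reported p is 0.0 and no BatchNorm layer is found, train and eval mode should behave the same for the forward pass.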