VITA-Group / AGD

[ICML2020] "AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks" by Yonggan Fu, Wuyang Chen, Haotao Wang, Haoran Li, Yingyan Lin, Zhangyang Wang
MIT License

Model files #12

Closed JuanDavidG1997 closed 3 years ago

JuanDavidG1997 commented 4 years ago

Hi! I have a script for training a GAN, but I don't understand what to save in weights.pt and arch.pt. Could you give me the PyTorch lines to save each file correctly? Thanks!

tilmto commented 4 years ago

To save arch.pt: https://github.com/VITA-Group/AGD/blob/616a10a583d1d2e280585d083fd17bc555f0f93a/AGD_ST/search/train_search.py#L225
To save weights.pt: https://github.com/VITA-Group/AGD/blob/616a10a583d1d2e280585d083fd17bc555f0f93a/AGD_ST/search/train_search.py#L182
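As a rough sketch of what saving the two files could look like: weights.pt holds the model's `state_dict()`, and arch.pt holds a small dictionary of architecture parameters. This assumes the searchable model exposes its architecture parameters as an attribute named `alpha` (the name that appears in the `getattr` calls in this thread). `TinySearchNet`, `save_checkpoints`, and the dictionary keys are hypothetical stand-ins for illustration, not the actual AGD code behind the links.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinySearchNet(nn.Module):
    """Hypothetical stand-in for a searchable generator."""

    def __init__(self, num_ops=4):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)
        # Architecture parameters: one logit per candidate operation (assumed layout).
        self.alpha = nn.Parameter(torch.zeros(num_ops))

    def forward(self, x):
        return self.conv(x)


def save_checkpoints(model, out_dir, valid_fid=None, flops=None):
    # Unwrap nn.DataParallel if present; otherwise use the model as-is.
    core = model.module if isinstance(model, nn.DataParallel) else model

    # weights.pt: the full state_dict of the model.
    weights_path = os.path.join(out_dir, "weights.pt")
    torch.save(core.state_dict(), weights_path)

    # arch.pt: the architecture parameters plus optional metadata.
    arch = {"alpha": core.alpha.detach().cpu().clone()}
    if valid_fid is not None:
        arch["fid"] = valid_fid  # hypothetical key name
    if flops is not None:
        arch["flops"] = flops  # hypothetical key name
    arch_path = os.path.join(out_dir, "arch.pt")
    torch.save(arch, arch_path)
    return weights_path, arch_path


out_dir = tempfile.mkdtemp()
weights_path, arch_path = save_checkpoints(TinySearchNet(), out_dir)
```

The metadata entries are optional in this sketch, so leaving out `valid_fid` and `flops` simply produces an arch.pt without them.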

JuanDavidG1997 commented 4 years ago

So if I understand correctly, weights stores almost everything in the model and arch stores some specific attributes and constants. Am I right?

tilmto commented 4 years ago

yes

JuanDavidG1997 commented 4 years ago

Thank you!

JuanDavidG1997 commented 4 years ago

Sorry to bother you once again, but I don't understand these valid_fid and flops items...

tilmto commented 4 years ago

valid_fid is the FID score of the saved model, and flops (floating-point operations) is the computational cost of the saved model. If you do not need these two values, you can simply omit them when saving.

JuanDavidG1997 commented 4 years ago

So if I don't add these fields to arch.pt, there would be no problem?

tilmto commented 4 years ago

Yes.

JuanDavidG1997 commented 4 years ago

I am getting a ModuleAttributeError when calling getattr(model.module, 'alpha'). I'm not sure what the reason might be... The model is defined as usual in PyTorch.

tilmto commented 4 years ago

Did you use nn.DataParallel for searching?

JuanDavidG1997 commented 4 years ago

I don't think so... Tbh I am very new to PyTorch, so I used a repo to train the GAN, but I don't see any DataParallel there. Sorry to bother you.

tilmto commented 4 years ago

How did you integrate our code with the pixel2pixel code to conduct the search? Or did you search with our method first and then train the derived network from scratch with pixel2pixel's training script? You can try getattr(model, 'alpha') if you didn't use nn.DataParallel.
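A minimal sketch of reading the attribute whether or not the model is wrapped: nn.DataParallel keeps the wrapped model in its `.module` attribute, so the helper below falls back to the model itself when `.module` is absent. The helper names `unwrap` and `get_arch_attr` and the stand-in classes are hypothetical. Plain Python objects are used so the sketch runs without PyTorch, and it assumes the model does not have an unrelated submodule of its own called `module`.

```python
def unwrap(model):
    """Return model.module if it exists (as with nn.DataParallel), else model."""
    return getattr(model, "module", model)


def get_arch_attr(model, name="alpha"):
    core = unwrap(model)
    if not hasattr(core, name):
        # Neither the wrapper nor the underlying model defines the attribute.
        raise AttributeError(
            f"{type(core).__name__} has no attribute {name!r}; "
            "the model may not be a searchable network that defines it."
        )
    return getattr(core, name)


# Plain-Python stand-ins for illustration only.
class FakeSearchNet:
    alpha = [0.1, 0.4, 0.3, 0.2]


class FakeDataParallel:
    def __init__(self, module):
        self.module = module


class PlainGenerator:
    """A model that never defines `alpha`."""


bare = FakeSearchNet()
wrapped = FakeDataParallel(bare)
alpha_bare = get_arch_attr(bare)
alpha_wrapped = get_arch_attr(wrapped)
```

If the attribute is missing with and without `.module`, as with `PlainGenerator` here, the likelier cause is that the model class never defines `alpha`, not the wrapping.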

JuanDavidG1997 commented 4 years ago

I understood I could train first with pixel2pixel and then compress with AGD. Am I wrong? I am open to suggestions.

tilmto commented 4 years ago

Yes, you can use a pretrained model as the teacher model when searching, and you need to modify the teacher model setting in train_search.py.

JuanDavidG1997 commented 4 years ago

Yes! So I can use just model without model.module?

JuanDavidG1997 commented 4 years ago

This is the process you explain in the README, right?

tilmto commented 4 years ago

Yes

JuanDavidG1997 commented 4 years ago

I tried using just getattr(model, 'alpha') and it didn't work; I got the same error. Is it possible that using a custom-defined class has something to do with it? Thank you.

JuanDavidG1997 commented 3 years ago

Is there any way to export the adequate .pt files after loading a .pkl file? I mean, like writing a script that loads a .pkl file and exports the .pt file AGD requires to work?

tilmto commented 3 years ago

Yes, you can load a .pkl file, reshape it into a dictionary, and save it as a .pt file.
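A minimal sketch of such a conversion, assuming the .pkl file was written with Python's `pickle` module and contains a flat mapping from parameter names to array-like values. The function name `pkl_to_pt`, the file names, and the demo keys are hypothetical; the keys a real checkpoint needs depend on the model that will load it.

```python
import os
import pickle
import tempfile

import torch


def pkl_to_pt(pkl_path, pt_path):
    # Only unpickle files you trust: pickle can execute arbitrary code on load.
    with open(pkl_path, "rb") as f:
        raw = pickle.load(f)

    # Assumption: a flat mapping of parameter name -> array-like values.
    state = {name: torch.as_tensor(value) for name, value in raw.items()}
    torch.save(state, pt_path)
    return state


# Tiny demo with made-up contents.
tmp_dir = tempfile.mkdtemp()
pkl_path = os.path.join(tmp_dir, "model.pkl")
pt_path = os.path.join(tmp_dir, "weights.pt")
with open(pkl_path, "wb") as f:
    pickle.dump({"conv.weight": [[1.0, 2.0], [3.0, 4.0]], "conv.bias": [0.0, 0.0]}, f)

converted = pkl_to_pt(pkl_path, pt_path)
reloaded = torch.load(pt_path)
```

If the .pkl stores something other than a flat mapping (for example a whole pickled model object), the "reshape" step would need to pull the parameters out of that object first.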

JuanDavidG1997 commented 3 years ago

But what about the arch.pt file? How can I reshape it?

tilmto commented 3 years ago

Just like common .pt files, you can load it and set one of the elements to a large value, e.g. [0.1, 0.4, 100, 0.2] indicates that the 3rd operation will be the final choice.
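The idea can be sketched in plain Python, assuming the final operation is the one with the largest value in its row (which is what the example above implies). `force_choice` and `selected_op` are hypothetical helper names; the same indexing would apply to a tensor row loaded from arch.pt.

```python
def force_choice(logits, index, large_value=100.0):
    """Return a copy of `logits` with the entry at `index` set to a large value."""
    forced = list(logits)
    forced[index] = large_value
    return forced


def selected_op(logits):
    """Zero-based index of the largest entry (an argmax over the row)."""
    return max(range(len(logits)), key=lambda i: logits[i])


alpha_row = [0.1, 0.4, 0.3, 0.2]
forced_row = force_choice(alpha_row, index=2)  # zero-based 2 is the 3rd operation
chosen_before = selected_op(alpha_row)
chosen_after = selected_op(forced_row)
```

Here `forced_row` matches the [0.1, 0.4, 100, 0.2] example from the reply, and the selected index moves from the 2nd operation to the 3rd.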