cc-ai / climategan

Code and pre-trained model for the algorithm generating visualisations of 3 climate change related events: floods, wildfires and smog.
https://thisclimatedoesnotexist.com
GNU General Public License v3.0

saving intermediate models when training #166

Closed tianyu-z closed 3 years ago

vict0rsch commented 3 years ago

@tianyu-z I simplified the logic. If you agree with my changes let's merge :)

(I think that if we can limit parameters and keep a simple logic, we should do so)

tianyu-z commented 3 years ago

@vict0rsch Yeah, I totally agree with your commit. To make it easier to load/resume an intermediate model, I added an option in defaults.yaml to load a model from the exact path to its `.pth` file.

tianyu-z commented 3 years ago

Besides, according to recent training exps that I have done, I changed the saving params to:

save_n_epochs: 2 # Save `latest_ckpt.pth` every epoch, `epoch_{epoch}_ckpt.pth` model every n epochs if epoch >= min_save_epoch
min_save_epoch: 28 # Save extra intermediate checkpoints when epoch > min_save_epoch
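
A minimal sketch of how these two settings might interact, not the repository's actual implementation. The function name `checkpoints_to_save` is hypothetical, and two assumptions are made: the threshold is inclusive (`epoch >= min_save_epoch`, following the first comment, while the second comment says `>`), and "every n epochs" means `epoch % save_n_epochs == 0`.

```python
def checkpoints_to_save(epoch, save_n_epochs, min_save_epoch):
    """Return the checkpoint file names to write at the end of `epoch`.

    Hypothetical helper: it only illustrates the policy described by the
    two config comments, it is not taken from the climategan code base.
    """
    # `latest_ckpt.pth` is overwritten every epoch.
    names = ["latest_ckpt.pth"]

    # Assumption: inclusive threshold and a plain modulo schedule.
    is_late_enough = epoch >= min_save_epoch
    is_on_schedule = save_n_epochs > 0 and epoch % save_n_epochs == 0
    if is_late_enough and is_on_schedule:
        names.append(f"epoch_{epoch}_ckpt.pth")

    return names


if __name__ == "__main__":
    for epoch in range(26, 32):
        print(epoch, checkpoints_to_save(epoch, save_n_epochs=2, min_save_epoch=28))
```

Under these assumptions, the values above would keep an extra `epoch_{epoch}_ckpt.pth` file at epochs 28, 30, 32 and so on, while earlier epochs only refresh `latest_ckpt.pth`.
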

vict0rsch commented 3 years ago

@tianyu-z I re-implemented the logic. I hope you can agree with me that it is more versatile and robust to errors (especially people using the wrong arguments, this WILL happen)

I added doc comments to defaults.yaml:

README on load_path
1/ any path which leads to a dir will be loaded as `path / checkpoints / latest_ckpt.pth`
2/ if you want to specify a specific checkpoint, it MUST be a `.pth` file
3/ resuming a P OR an M model, you may only specify 1 of `load_path.p` OR `load_path.m`.
   You may also leave BOTH at none, in which case `output_path / checkpoints / latest_ckpt.pth`
   will be used
4/ resuming a P+M model, you may specify (`p` AND `m`) OR `pm` OR leave all at none,
   in which case `output_path / checkpoints / latest_ckpt.pth` will be used to load from
   a single checkpoint
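
The four rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: the helper names `resolve_ckpt` and `select_load_paths`, the `tasks` argument (`"p"`, `"m"` or `"pm"`) and the representation of "none" as Python `None` are all hypothetical.

```python
from pathlib import Path

# Relative location of the default checkpoint inside a run directory.
LATEST = Path("checkpoints") / "latest_ckpt.pth"


def resolve_ckpt(path):
    """Rules 1 and 2: a directory maps to its latest checkpoint,
    anything else must be a `.pth` file."""
    path = Path(path)
    if path.is_dir():
        return path / LATEST
    if path.suffix == ".pth":
        return path
    raise ValueError(f"{path} is neither a directory nor a .pth file")


def select_load_paths(tasks, load_path, output_path):
    """Rules 3 and 4: pick the checkpoint(s) to resume from.

    `tasks` is "p", "m" or "pm" (hypothetical labels for P, M and P+M).
    `load_path` is a dict with optional keys "p", "m" and "pm".
    """
    p, m, pm = (load_path.get(key) for key in ("p", "m", "pm"))
    default = Path(output_path) / LATEST

    if tasks in ("p", "m"):
        # Rule 3: at most one of load_path.p / load_path.m.
        # (A `pm` value for a single model is not covered by the rules,
        # so this sketch simply ignores it.)
        if p and m:
            raise ValueError("specify only one of load_path.p or load_path.m")
        chosen = p or m
        return {tasks: resolve_ckpt(chosen) if chosen else default}

    # Rule 4: (p AND m) OR pm OR nothing at all.
    if pm:
        if p or m:
            raise ValueError("use either load_path.pm or load_path.p + load_path.m")
        return {"pm": resolve_ckpt(pm)}
    if p and m:
        return {"p": resolve_ckpt(p), "m": resolve_ckpt(m)}
    if p or m:
        raise ValueError("a P+M model needs both load_path.p and load_path.m")
    return {"pm": default}
```

Raising early on inconsistent combinations is one way to get the robustness to wrong arguments mentioned above. Whether the real code raises or falls back silently is not stated in this thread.
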

vict0rsch commented 3 years ago

@tianyu-z do you think the code now handles everything we want to cover? Can you see any loophole in the logic?

tianyu-z commented 3 years ago

> @tianyu-z do you think the code now handles everything we want to cover? Can you see any loophole in the logic?

Thanks a lot! I am checking now.

tianyu-z commented 3 years ago

@vict0rsch I don't see any holes in your logic. It's pretty strong. :fire: Sorry, I was not aware that there are other things related to the self.output_path.

vict0rsch commented 3 years ago

No worries, but next time just search for `self.output_path` to make sure it's safe for it not to point to a directory. You'll see it's used all over the place.