victoresque / pytorch-template

PyTorch deep learning projects made easy.
MIT License
4.75k stars 1.09k forks

Add combined command line options for fine-tuning #27

Closed SunQpark closed 6 years ago

SunQpark commented 6 years ago

This PR addresses issue #25, adding command line options for fine-tuning. I have rearranged the order in which the config file is set up in train.py and added a warning when sensitive options (architecture, optimizer) are changed.

If the config and resume options are given at the same time, instances such as the model and optimizer are initialized in train.py according to the given config file, and then base_trainer handles loading the checkpoint.

Changing options related to the monitor also seems problematic, but I haven't handled that yet.

borgesa commented 6 years ago

Hi,

Did not get time to review, as you have already merged. Code looks good (have not run it).

I did have one comment:

Individually these checks are good, but the latter only makes sense if one additional option is possible in 'train.py': resuming training with a new configuration file. Currently, I expect both checks* to always be true.

Maybe the option I mention should be added later. Do you agree, @SunQpark?

*:

if self.config['arch'] != checkpoint['arch']: [...]
[...]
if checkpoint['config']['optimizer']['type'] == self.config['optimizer']['type']: [...]
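Taken together, and assuming both checks are meant to warn on a mismatch, they could be factored into something like this sketch. The function name is invented; the dict layout follows the quoted lines.

```python
def resume_warnings(config, checkpoint):
    """Collect warnings for sensitive options that differ between the
    current config and the checkpoint being resumed (sketch only)."""
    warnings = []
    if config['arch'] != checkpoint['arch']:
        warnings.append('architecture differs from checkpoint')
    if config['optimizer']['type'] != checkpoint['config']['optimizer']['type']:
        warnings.append('optimizer type differs from checkpoint')
    return warnings
```

When resuming without a new config, both dicts come from the same checkpoint, so neither warning can fire, which is the redundancy pointed out above.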

SunQpark commented 6 years ago

Can you explain this more? I don't understand which option you are talking about. 'Resuming training with a new configuration file' is already implemented: use -c and -r together.

borgesa commented 6 years ago

As I said, I did not run it earlier. Did it now and the code indeed does what you say it does.

The confusion for me is in the '__main__' part of 'train.py': when reading the code there, config and resume look mutually exclusive, since it checks "if args.config: ... elif args.resume: ...".

In a way, the "continue and resume" functionality is currently a bit hidden. Spelling it out explicitly as a comment after "elif args.resume" would be sufficient (e.g., "# Load configuration file from resumed training instance, if no new configuration file is given").
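In context, the suggested comment would sit in the '__main__' branch roughly like this. A sketch only: the flags mirror the -c/-r options discussed in the thread, but the function names and exact parsing code in train.py may differ.

```python
import argparse

def get_args(argv=None):
    # Mirrors the -c / -r flags discussed in the thread.
    parser = argparse.ArgumentParser(description='PyTorch Template')
    parser.add_argument('-c', '--config', default=None, type=str,
                        help='config file path (default: None)')
    parser.add_argument('-r', '--resume', default=None, type=str,
                        help='path to latest checkpoint (default: None)')
    return parser.parse_args(argv)

def pick_config_source(args):
    """Decide where the config comes from (the if/elif branch in question)."""
    if args.config:
        return 'config'
    elif args.resume:
        # Load configuration file from resumed training instance,
        # if no new configuration file is given.
        return 'resume'
    raise AssertionError('Configuration file needs to be specified.')
```

Note that the branch is not actually mutually exclusive: when both flags are given, the first branch wins and base_trainer still loads the checkpoint, which is exactly the hidden behaviour the comment would make explicit.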

PS: Now that timestamps are added to training, the --force option seems a little unnecessary to me, particularly in the case I just tested, using both "--resume" and "--config".

SunQpark commented 6 years ago

I agree on both points, about train.py and the --force option. I'll update that as a small commit in PR #28. Thank you.