nmi-lab / decolle-public

GNU General Public License v3.0

resume training fails with single learning rate #17

Open weinman opened 2 years ago

weinman commented 2 years ago

Quick report from the field on commit 61aca54 to branch update_21014 (which addressed #15 and #16).

Because decolle.utils.MultiOpt uses the semantically accurate but non-conforming method name load_state_dicts, resuming training with train_lenet_decolle.py fails when the checkpoint comes from a single-learning-rate model. In that case opt is a torch.optim.Adamax object, which has only a load_state_dict method. The failure happens in load_model_from_checkpoint (decolle/utils.py, line 166 in the trace below).

I'd perhaps suggest simply renaming the MultiOpt method to the singular load_state_dict to avoid messiness elsewhere about checking for which attribute or object class is present.
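To illustrate the rename, here is a minimal sketch that does not depend on torch or decolle. FakeOptimizer, MultiOptSketch, and restore_optimizer are hypothetical stand-ins written for this example; they are not the actual decolle.utils.MultiOpt or load_model_from_checkpoint, whose internals may differ. The only point is that once the wrapper exposes the singular load_state_dict, the caller no longer has to check which kind of optimizer it holds.

```python
class FakeOptimizer:
    """Stand-in for a torch.optim optimizer (assumption: only the
    state_dict / load_state_dict pair matters for this example)."""

    def __init__(self, lr):
        self.lr = lr

    def state_dict(self):
        return {"lr": self.lr}

    def load_state_dict(self, state):
        self.lr = state["lr"]


class MultiOptSketch:
    """Hypothetical wrapper around several optimizers.

    Not the real decolle.utils.MultiOpt; it only shows the proposed
    naming, with the plural name kept as an alias.
    """

    def __init__(self, *optimizers):
        self.optimizers = list(optimizers)

    def state_dict(self):
        # One state dict per wrapped optimizer, in order.
        return [opt.state_dict() for opt in self.optimizers]

    def load_state_dict(self, states):
        # Same method name as a single optimizer, so callers need no
        # attribute or class checks.
        if len(states) != len(self.optimizers):
            raise ValueError("expected one state per wrapped optimizer")
        for opt, state in zip(self.optimizers, states):
            opt.load_state_dict(state)

    # Backward-compatible alias for code that still calls the plural name.
    load_state_dicts = load_state_dict


def restore_optimizer(opt, saved_state):
    """Hypothetical resume helper: works for single and wrapped optimizers."""
    opt.load_state_dict(saved_state)
```

Keeping load_state_dicts as an alias is optional; it just avoids breaking any existing callers of the plural name.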

I'm happy to submit a PR along those lines if you like; it looks like you pull the public-facing version of this repo from elsewhere, so I'd understand if that complicates your git flow for so simple a change.

Here's the error trace:

Traceback (most recent call last):
  File "train_lenet_decolle.py", line 111, in <module>
    starting_epoch = load_model_from_checkpoint(checkpoint_dir, net, opt)
  File "[root]/conda/lib/python3.7/site-packages/decolle-0.1-py3.7.egg/decolle/utils.py", line 166, in load_model_from_checkpoint
AttributeError: 'Adamax' object has no attribute 'load_state_dicts'