Closed mitalbert closed 5 years ago
Ideally it should not matter. Maybe something changed between PyTorch versions. If you are using this repository, please use either v1.0 or v1.1.
Thanks for the catch. Actually, we never tested resume argument. We will fix it.
Sure, I'm using 1.1.0
@mitalbert I have incorporated your suggestions. Please check if it resolves your issues.
If you mean the first issue in train_eval_seg.py, then it looks exactly like my fix, which worked for me, so yes.
If you mean the issue with resuming the training, unfortunately I can't test it now.
Hi, thanks for publishing the code.
I noticed that optimizer.zero_grad() is called after optimizer.step() in train_eval_seg.py. Is this intentional? I was having issues until I moved zero_grad() above loss.backward().
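For reference, a minimal sketch of the ordering that fixed it for me — zero the gradients before backward(), not after step(). The model, data, and loss here are placeholders, not the repo's actual code:

```python
import torch

# Toy stand-ins for the real segmentation model and data.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = torch.nn.MSELoss()

x = torch.randn(8, 4)
y = torch.randn(8, 2)

for step in range(3):
    optimizer.zero_grad()          # clear stale gradients FIRST
    loss = criterion(model(x), y)
    loss.backward()                # accumulate fresh gradients
    optimizer.step()               # apply the update
```

Calling zero_grad() only after step() means the first backward() accumulates onto whatever gradients were left over, which can silently corrupt the updates.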
Also, when resuming training there was a runtime error about type compatibility (CPU vs CUDA) during loss.backward(); adding model = model.cuda() right after loading the state dictionary in train_segmentation.py solved it.
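In case it helps, a hedged sketch of the resume path with that fix applied — the checkpoint structure and names here are illustrative (in-memory instead of a file), not train_segmentation.py's exact code:

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Stand-in checkpoint; the real script would use torch.save/torch.load.
ckpt = {"model": model.state_dict(), "optimizer": optimizer.state_dict()}

# Resume: load weights, then move the model to the GPU *before* training
# continues, so parameters and inputs live on the same device.
model.load_state_dict(ckpt["model"])
if torch.cuda.is_available():
    model = model.cuda()
optimizer.load_state_dict(ckpt["optimizer"])
```

Without the .cuda() call the restored parameters stay on the CPU while the batches are on the GPU, which is exactly the device-mismatch error seen in loss.backward().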