baraujo98 closed this issue 3 years ago.
Yeah. The naming might need to be changed a bit. I think you need to use "model" instead of "classy_state_dict".
Ok, understood @zaiweizhang. Should I change anything in the following `if` statements?
I guess I should just do something along these lines:

```python
classy_state_dict = state_dict["model"]
state_dict = {}
state_dict.update(classy_state_dict)
```
Yeah. You probably need to look up variable names in the checkpoint and change some string names in that function. It should be a trivial task.
Got it. I solved the problem by passing `state['model']` instead of `state` as the 2nd argument of the function, in train.py.
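For context, the change amounts to something like this sketch (the checkpoint path, the variable names, and the exact call signature of `init_model_from_weights` are assumptions based on the discussion above):

```python
import torch

# Pretraining checkpoints store the backbone weights under the 'model' key
# (the full set of keys is 'epoch', 'model', 'optimizer', 'train_criterion').
state = torch.load("checkpoints/checkpoint-ep50.pth.tar", map_location="cpu")

# Pass only the weights sub-dict, not the whole checkpoint,
# as the 2nd argument in train.py.
model = init_model_from_weights(model, state["model"])
```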
I noticed you defined a `freeze_bb` argument. Did you do any freezing in your finetuning tests? Seems like a good idea, at least for the first epochs.
I usually do not use that flag. I only freeze the weights for ModelNet shape classification. You should try to finetune on all weights first. I find freezing the weights sometimes causes performance decreases.
Ok, that's an interesting finding: a bit counter-intuitive, I would say. Did you freeze the backbone only on the first epochs, in your tests?
No. I did not freeze any weights. I load the pretrained weights and then finetune all weights, but I did set the learning rate two times higher.
Ok, thanks, but when (if) you tried freezing the backbone, was it frozen during the whole training, or only during the first epochs? So I know if it's worth trying to just freeze in the first epochs.
I was freezing it during the whole training. So it's probably worth trying to just freeze in the first epochs.
Ok, will try! Closing the issue for now. Thank you very much @zaiweizhang for the help :grin:
Should this be enough to freeze the backbone?

```python
model.backbone_3d.requires_grad_(requires_grad=False)
```

Or should I do anything more sophisticated, like filtering the layers sent to the optimizer, or using `with torch.no_grad()`?
And the opposite to unfreeze, once the first epochs are done:

```python
model.backbone_3d.requires_grad_(requires_grad=True)
```
I tried, and it looks like it might have worked.
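A minimal sketch of that freeze-then-unfreeze schedule, assuming a standard PyTorch training loop (`freeze_epochs`, `num_epochs`, `train_one_epoch`, and `dataloader` are illustrative names, not part of the repo):

```python
freeze_epochs = 5  # illustrative: keep the backbone frozen for the first few epochs

# Freeze the pretrained backbone before training starts
model.backbone_3d.requires_grad_(False)

for epoch in range(num_epochs):
    if epoch == freeze_epochs:
        # Unfreeze once the detection heads have had a few epochs to warm up
        model.backbone_3d.requires_grad_(True)
    train_one_epoch(model, optimizer, dataloader)
```

Since parameters with `requires_grad=False` accumulate no gradients, standard PyTorch optimizers simply skip them, so the optimizer does not have to be rebuilt when the backbone is unfrozen.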
Yep. That should work.
Hi! I tried to load a checkpoint from the pretrained backbones into PointRCNN. In this case, I tried to pick up from epoch 50 of the pretraining, just to test.
Here is the command:
Here is the error:
I don't totally understand some operations in the beginning of `init_model_from_weights()`. The checkpoints I got from the pretraining only have these keys: `dict_keys(['epoch', 'model', 'optimizer', 'train_criterion'])`; they don't have a "classy_state_dict" or a "base_model" key, like I think you expect.
Thanks!
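For reference, a minimal way to inspect what the pretraining checkpoint actually contains (the file path is illustrative):

```python
import torch

ckpt = torch.load("checkpoints/checkpoint-ep50.pth.tar", map_location="cpu")
print(ckpt.keys())
# dict_keys(['epoch', 'model', 'optimizer', 'train_criterion'])
# The backbone weights live under ckpt['model'], not under a
# 'classy_state_dict' / 'base_model' hierarchy.
```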