qrzou closed this issue 4 years ago
Hi @qrzou, a quick fix is to comment out the lines starting at https://github.com/jwyang/graph-rcnn.pytorch/blob/d7ca37d1ac8825aa0950a92d063221a1a7042c16/lib/scene_parser/rcnn/utils/checkpoint.py#L67, which I have done in the newest commit.
Thank you, @jwyang! However, I found no difference between the newest commit and the original code. Should I disable loading the optimizer and scheduler during stepwise training? Also, the object detector checkpoint I trained did not include the model's state_dict; it contained only the optimizer, scheduler, and iteration, which confused me a lot.
@qrzou, the newest commit should already include the update. See here: https://github.com/jwyang/graph-rcnn.pytorch/blob/b3d6c4f01eb8e7566c28a3dd6a6f8fbc3b7f665f/lib/scene_parser/rcnn/utils/checkpoint.py#L67
To check whether you have successfully obtained the object detector checkpoint, can you evaluate the object detection performance first? You can also download the checkpoint I shared in the README and try it out as a sanity check.
@jwyang, the loading error was caused by the argument parser: https://github.com/jwyang/graph-rcnn.pytorch/blob/d7ca37d1ac8825aa0950a92d063221a1a7042c16/main.py#L95.
When using this command from the README:
python main.py --config-file configs/faster_rcnn_res101.yaml
cfg.MODEL.ALGORITHM is set to the default value "sg_baseline", even though the faster_rcnn_res101.yaml file sets ALGORITHM to faster_rcnn. As a result, the faster_rcnn checkpoint path becomes "sg_baseline_joint0" instead of "faster_rcnn", which makes the checkpointer load the optimizer and scheduler in stepwise training.
A quick solution is to set the algorithm explicitly when training the object detector:
python main.py --config-file configs/faster_rcnn_res101.yaml --algorithm faster_rcnn
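The override described above can be avoided in general by letting the command-line flag default to None, so that the YAML value survives when the flag is omitted. The following is a minimal sketch of that idea, not the repository's actual main.py: the plain dict standing in for the parsed config and the helper name resolve_algorithm are assumptions made for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--config-file", required=True)
    # default=None means "flag not given", so it can be told apart
    # from an explicit choice on the command line.
    parser.add_argument("--algorithm", default=None)
    return parser


def resolve_algorithm(cli_value, config_value):
    """Prefer the command-line value only when one was actually passed."""
    return config_value if cli_value is None else cli_value


# Hypothetical stand-in for the value read from faster_rcnn_res101.yaml.
config = {"ALGORITHM": "faster_rcnn"}

# Same invocation as in the README, without --algorithm.
args = build_parser().parse_args(
    ["--config-file", "configs/faster_rcnn_res101.yaml"]
)
config["ALGORITHM"] = resolve_algorithm(args.algorithm, config["ALGORITHM"])
```

With this pattern the config file stays the source of truth, and --algorithm acts only as an explicit override.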
I've pre-trained the object detector using this command:
python main.py --config-file configs/faster_rcnn_res101.yaml
and I've modified the WEIGHT_DET path parameter in the sgg_res101_step.yaml file. However, when I trained the model stepwise using the command
python main.py --config-file configs/sgg_res101_step.yaml --algorithm $ALGORITHM
the error
ValueError: loaded state dict has a different number of parameter groups
occurred in the optimizer's load_state_dict function. I'm wondering whether this pipeline supports loading a pre-trained object detector in stepwise training. Thanks!
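This ValueError is raised when the saved optimizer state has a different number of parameter groups than the optimizer built for the current model. One way to sidestep it when initializing from a pre-trained detector is to restore only the model weights and leave the optimizer and scheduler freshly initialized. The sketch below illustrates that selection step on a plain dictionary. The key names "model", "optimizer", "scheduler", and "iteration" and the function select_checkpoint_entries are assumptions for illustration, not the repository's checkpointer.

```python
def select_checkpoint_entries(checkpoint, resume_training):
    """Pick which checkpoint entries to restore.

    resume_training=True  -> same model and optimizer as before, restore all.
    resume_training=False -> initializing a new training stage from a
                             pre-trained detector, restore weights only.
    """
    if "model" not in checkpoint:
        # A checkpoint without weights cannot initialize the detector.
        raise KeyError("checkpoint has no 'model' entry")
    if resume_training:
        return dict(checkpoint)
    # Drop optimizer, scheduler, and iteration so that their state is
    # never passed to a mismatched optimizer's load_state_dict.
    return {"model": checkpoint["model"]}


# Hypothetical checkpoint contents, using placeholder values.
detector_ckpt = {
    "model": {"backbone.weight": [0.1, 0.2]},
    "optimizer": {"param_groups": [{}, {}]},
    "scheduler": {"last_epoch": 100},
    "iteration": 100,
}
to_restore = select_checkpoint_entries(detector_ckpt, resume_training=False)
```

The KeyError also covers the case reported earlier in the thread, where a checkpoint contained only the optimizer, scheduler, and iteration.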