facebookresearch / maskrcnn-benchmark

Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.

Is it necessary to delete these parameters to fine-tune model when the number of classes changed? #1354

Open 4daJKong opened 1 year ago

4daJKong commented 1 year ago

❓ Questions and Help

I have a layoutlmv3 + GeneralizedRCNN model pre-trained on a 5-class classification task. Now I want to use this model for a 10-class classification task. The dataset is still in COCO format. I only changed MODEL.ROI_HEADS.NUM_CLASSES: 10 in my YAML file.

As in many similar issues I found, it shows several warnings when loading parameters, e.g.:

WARNING [02/23 08:56:32 fvcore.common.checkpoint]: Skip loading parameter 'roi_heads.box_predictor.0.cls_score.weight' to the model due to incompatible shapes: (6, 1024) in the checkpoint but (11, 1024) in the model! You might want to double check if this is expected.
WARNING [02/23 08:56:32 fvcore.common.checkpoint]: Skip loading parameter 'roi_heads.box_predictor.0.cls_score.bias' to the model due to incompatible shapes: (6,) in the checkpoint but (11,) in the model! You might want to double check if this is expected.
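
For context, I think this is where the 6 vs. 11 comes from: in the usual Detectron2 box-predictor layout, cls_score is a Linear layer with num_classes + 1 outputs (the extra one being the background class). A minimal sketch:

import torch.nn as nn

# cls_score has num_classes + 1 outputs (one per class plus background),
# which is why 5 classes -> (6, 1024) and 10 classes -> (11, 1024).
old_cls_score = nn.Linear(1024, 5 + 1)    # checkpoint: weight (6, 1024),  bias (6,)
new_cls_score = nn.Linear(1024, 10 + 1)   # new model:  weight (11, 1024), bias (11,)
print(tuple(old_cls_score.weight.shape))  # (6, 1024)
print(tuple(new_cls_score.weight.shape))  # (11, 1024)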

In short, it also reports:

WARNING [02/23 08:56:32 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
roi_heads.box_predictor.0.cls_score.{bias, weight}
roi_heads.box_predictor.1.cls_score.{bias, weight}
roi_heads.box_predictor.2.cls_score.{bias, weight}
roi_heads.mask_head.predictor.{bias, weight}

However, the difference is that training didn't stop at this problem: the mismatched parameters were skipped and training went on. It is still running now, but it will take some time to get the final result.
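
If I understand the warnings correctly, the checkpointer behaves roughly like this, i.e. it copies only shape-compatible parameters and leaves the rest at their fresh random initialization (a plain-PyTorch sketch, not fvcore's actual implementation):

def load_matching(model, checkpoint_state):
    # Copy only parameters whose names and shapes match the model;
    # everything else keeps the model's fresh random initialization.
    own = model.state_dict()
    matching = {k: v for k, v in checkpoint_state.items()
                if k in own and own[k].shape == v.shape}
    model.load_state_dict(matching, strict=False)
    # The skipped keys are what the warnings list.
    return [k for k in checkpoint_state if k not in matching]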

Thus, following #15, #430, #1153 and other issues, I tried to delete those parameters:

import torch

model = torch.load(model_path, map_location=torch.device('cpu'))
# Second-stage prediction heads:
del model["model"]["roi_heads.box_predictor.0.cls_score.weight"]
del model["model"]["roi_heads.box_predictor.0.cls_score.bias"]
# ... and likewise for box_predictor.1 and box_predictor.2; I deleted
# every parameter listed in the warnings.
# Mask prediction head:
del model["model"]["roi_heads.mask_head.predictor.weight"]
del model["model"]["roi_heads.mask_head.predictor.bias"]
# Save the modified checkpoint:
torch.save(model, "/data/.../model_modified.pth")
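
For completeness, the same deletions could be written as a loop over the keys listed in the warnings (a hypothetical generalization, continuing from the snippet above):

# Drop every key whose shape depends on the number of classes, instead
# of naming each one by hand.
class_dependent = [
    k for k in model["model"]
    if ".cls_score." in k or k.startswith("roi_heads.mask_head.predictor.")
]
for k in class_dependent:
    del model["model"][k]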

The problem is that if I load this modified model as before, with

    args.config_file = "/home/.../my.yaml"
    args.opts = [
        'MODEL.WEIGHTS',
        '/data/.../model_modified.pth',
    ]
    cfg = get_cfg()
    add_vit_config(cfg)
    cfg.merge_from_file(args.config_file)
    cfg.merge_from_list(args.opts)

it shows a new error:

Exception has occurred: OSError
Couldn't reach server at '/data/.../model_modified.pth' to download configuration file or configuration file is not a valid JSON file. Please check network or file content here: /data/.../model_modified.pth.
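
That error text looks like the one the transformers config loader raises, as if the path were being parsed as a JSON config instead of being handed to the Detectron2 checkpointer. One way to rule out a corrupted file is to load it directly (a sketch; the path is the same placeholder as above):

import torch

# If this load succeeds, the .pth file itself is a valid torch
# checkpoint and the OSError is coming from the config machinery.
ckpt = torch.load("/data/.../model_modified.pth", map_location="cpu")
print(type(ckpt), sorted(ckpt.keys()))  # expect a dict containing "model"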

That is very strange; this is the first time I have met this problem, and it is still unsolved. So my question is, as in the title: will my final result be affected if I don't delete these parameters? I was wondering whether these parameters automatically change shape from (6, 1024) to (11, 1024), with the weight and bias re-initialized and retrained in the new model. Or does the model ignore these layers and add newly shaped (11, 1024) layers to replace them? Or is it something else? I am still a novice with neural networks and am not sure whether these warnings will affect the final result and accuracy.
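
Once training finishes, I suppose the saved checkpoint can answer this directly by printing the class-dependent shapes (a sketch; "model_final.pth" is Detectron2's default output name, so that part is an assumption):

import torch

# If training ran with NUM_CLASSES=10, cls_score should now be
# (11, 1024) / (11,), freshly learned rather than copied.
ckpt = torch.load("model_final.pth", map_location="cpu")["model"]
for name, tensor in ckpt.items():
    if "cls_score" in name or "mask_head.predictor" in name:
        print(name, tuple(tensor.shape))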

Record after training the new model

When I checked the structure of the new model after three days of training, I found that some layers had disappeared (I diffed the key sets of the old and new checkpoints, as in the sketch after this list):

backbone.bottom_up.backbone.encoder.fpn1.1.running_var
backbone.bottom_up.backbone.encoder.fpn1.1.num_batches_tracked
backbone.bottom_up.backbone.encoder.fpn1.3.{weight,bias}
backbone.bottom_up.backbone.encoder.fpn2.0.{weight,bias}
backbone.bottom_up.backbone.patch_embed.proj.{weight,bias}
backbone.bottom_up.backbone.LayerNorm.{weight,bias}
backbone.bottom_up.backbone.norm.{weight,bias}

proposal_generator.rpn_head.conv.{weight,bias}
proposal_generator.rpn_head.objectness_logits.{weight,bias}
proposal_generator.rpn_head.anchor_deltas.{weight,bias}

roi_heads.box_head.0.{fc1,fc2}.{weight,bias}
roi_heads.box_head.1.{fc1,fc2}.{weight,bias}
roi_heads.box_head.2.{fc1,fc2}.{weight,bias}

roi_heads.box_predictor.0.cls_score.{weight,bias}
roi_heads.box_predictor.0.bbox_pred.{weight,bias}
roi_heads.box_predictor.1.cls_score.{weight,bias}
Now I am more confused than before: why were these layers removed? I just followed the previous settings in the YAML and didn't change anything except the number of classes.