Brion112233 commented 6 months ago

Instructions To Reproduce the 🐛 Bug:

1. What changes you made (git diff) or what code you wrote:

To train on my own dataset, I used the following code to modify the pretrained model (number 0 in the Model Zoo) and successfully trained a detection model. During the second step, training the segmentation head, I ran into the error shown in the title.

The code:

    import torch

    pretrained_weights = torch.load('./weights/detr-r50-e632da11.pth')

    # num_classes: 5
    num_class = 6  # num_classes + 1
    # resize_ (with trailing underscore) changes the tensors in place
    pretrained_weights["model"]["class_embed.weight"].resize_(num_class + 1, 256)
    pretrained_weights["model"]["class_embed.bias"].resize_(num_class + 1)
    torch.save(pretrained_weights, "detr-r50_%d.pth" % num_class)

2. What exact command you ran:

When training the instance segmentation head, I ran this command in the terminal:

python main.py --masks --epochs 25 --lr_drop 15 --coco_path /path/to/coco --frozen_weights /output/path/box_model/checkpoint.pth --output_dir /output/path/segm_model

3. About the bug:

Missing key(s) in state_dict: ...... (a long string of text about the structure of the model)
Unexpected key(s) in state_dict: ...... (a long string of text about the structure of the model)

4. Some questions:
1. Is the error caused by my modification of the pretrained model?
2. How do I fix this bug?

I look forward to hearing from you. Thank you very much!

## Environment:

PyTorch version: 1.10.1+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.6
[pip3] torch==1.10.1+cu113
[pip3] torch-tb-profiler==0.4.3
[pip3] torchaudio==0.10.1+cu113
[pip3] torchsummary==1.5.1
[pip3] torchvision==0.11.2+cu113
[conda] numpy 1.21.6 py37h976b520_0 conda-forge
[conda] torch 1.10.1+cu113 pypi_0 pypi
[conda] torch-tb-profiler 0.4.3 pypi_0 pypi
[conda] torchaudio 0.10.1+cu113 pypi_0 pypi
[conda] torchsummary 1.5.1 pypi_0 pypi
[conda] torchvision 0.11.2+cu113 pypi_0 pypi

Brion112233 commented 6 months ago

Just now, I visualized both the pretrained model from the Model Zoo and my modified model. The only difference between them is the dimension of class_embed. So I read the error message in detail and noticed the following:

"Missing key(s) in state_dict" contains: Encoder, Decoder, Backbone, Mask_head, and some embedding layers.
"Unexpected key(s) in state_dict" contains: Encoder, Decoder, Backbone, and some embedding layers.
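The comparison described above can also be done programmatically instead of visually. The following is a minimal sketch, not code from the repository: `compare_state_dicts` is a hypothetical helper, and the shapes in the demo are illustrative stand-ins (a 92-row head before resizing, 7 rows after) rather than values read from the actual files.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Mapping


def compare_state_dicts(
    reference: Mapping[str, Any], candidate: Mapping[str, Any]
) -> Dict[str, List]:
    """Compare two checkpoint dicts by key name and by tensor shape.

    Hypothetical helper: values only need a `.shape` attribute,
    so real tensors and simple stand-ins both work.
    """
    ref_keys, cand_keys = set(reference), set(candidate)
    report: Dict[str, List] = {
        # Keys present in the reference but absent from the candidate.
        "only_in_reference": sorted(ref_keys - cand_keys),
        # Keys present in the candidate but absent from the reference.
        "only_in_candidate": sorted(cand_keys - ref_keys),
        # Shared keys whose shapes differ: (name, reference shape, candidate shape).
        "shape_mismatch": [],
    }
    for name in sorted(ref_keys & cand_keys):
        ref_shape = tuple(getattr(reference[name], "shape", ()))
        cand_shape = tuple(getattr(candidate[name], "shape", ()))
        if ref_shape != cand_shape:
            report["shape_mismatch"].append((name, ref_shape, cand_shape))
    return report


# Illustrative stand-ins for the two checkpoints (shapes are assumptions).
original = {
    "class_embed.weight": SimpleNamespace(shape=(92, 256)),
    "class_embed.bias": SimpleNamespace(shape=(92,)),
    "query_embed.weight": SimpleNamespace(shape=(100, 256)),
}
modified = {
    "class_embed.weight": SimpleNamespace(shape=(7, 256)),
    "class_embed.bias": SimpleNamespace(shape=(7,)),
    "query_embed.weight": SimpleNamespace(shape=(100, 256)),
}

report = compare_state_dicts(original, modified)
for name, before, after in report["shape_mismatch"]:
    print(f"{name}: {before} -> {after}")
```

With real files, the two arguments would be the `"model"` entries of the checkpoints loaded with `torch.load(path, map_location="cpu")`. If only the class_embed shapes differ and both key lists are empty, the resized head is unlikely to explain a report that lists nearly every layer as missing or unexpected.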

Is there a problem with my steps?

Brion112233 commented 6 months ago

I have found the reason! When training the box model, I added a 'resume' setting in main.py, and I simply forgot to remove it when training the segmentation head.
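One plausible reading of why a leftover 'resume' setting produces exactly this pattern of missing and unexpected keys is a key-prefix mismatch. The sketch below illustrates that idea under stated assumptions: that the segmentation model stores the box model under a `detr.` prefix and adds a mask head, and that resuming loads the checkpoint into the whole model while `--frozen_weights` loads it into the wrapped box model only. The key names are shortened and hypothetical, and `strict_key_diff` is an illustrative helper, not a library function.

```python
from typing import Iterable, List, Tuple


def strict_key_diff(
    model_keys: Iterable[str], ckpt_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) keys, as a strict key check would report them."""
    model_set, ckpt_set = set(model_keys), set(ckpt_keys)
    missing = sorted(model_set - ckpt_set)      # model expects them, checkpoint lacks them
    unexpected = sorted(ckpt_set - model_set)   # checkpoint has them, model does not
    return missing, unexpected


# Hypothetical, shortened key names for a box-model checkpoint.
box_ckpt_keys = [
    "backbone.conv.weight",
    "transformer.encoder.weight",
    "class_embed.weight",
]

# Assumption: the segmentation model wraps the box model under "detr."
# and adds its own mask head.
segm_model_keys = ["detr." + k for k in box_ckpt_keys] + ["mask_head.lay1.weight"]

# Case 1: box checkpoint loaded into the whole segmentation model.
# No name lines up, so every layer shows up in one of the two lists.
missing, unexpected = strict_key_diff(segm_model_keys, box_ckpt_keys)

# Case 2: box checkpoint loaded into the wrapped box model only.
# With the prefix stripped, every name lines up.
inner_keys = [k[len("detr."):] for k in segm_model_keys if k.startswith("detr.")]
inner_missing, inner_unexpected = strict_key_diff(inner_keys, box_ckpt_keys)

print("whole model:", len(missing), "missing,", len(unexpected), "unexpected")
print("wrapped model:", len(inner_missing), "missing,", len(inner_unexpected), "unexpected")
```

Under these assumptions, the first case reports the backbone, encoder, and embedding layers in both lists and the mask head only under "missing", which matches the pattern described earlier in this thread.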

facebookresearch / detr

RuntimeError: Error(s) in loading state_dict for DETRsegm: #621
