ZhanYang-nwpu / Mono3DVG

[AAAI 2024] Mono3DVG: 3D Visual Grounding in Monocular Images, AAAI, 2024

About training with pretrained model #8

Open ttomtom6 opened 6 months ago

ttomtom6 commented 6 months ago

Thanks for your dataset and code! They have helped me a lot. When I try to train the network with your provided checkpoint_best_MonoDETR.pth, I get the following error:

size mismatch for class_embed.0.weight: copying a param with shape torch.Size([3, 256]) from checkpoint, the shape in current model is torch.Size([9, 256]).
size mismatch for class_embed.0.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([9]).
size mismatch for query_embed.weight: copying a param with shape torch.Size([50, 512]) from checkpoint, the shape in current model is torch.Size([1, 256]).

I tried to work around this error with the following code, but the results are very low:

current_model_dict = model.state_dict()
new_state_dict = {k: v if v.size() == current_model_dict[k].size() else current_model_dict[k] for k, v in zip(current_model_dict.keys(), new_state_dict.values())}

Can you provide some advice? Thanks a lot!

ZhanYang-nwpu commented 6 months ago

query_embed

It sounds like you didn't directly use the complete code I provided to train the network. Please use my complete code and parameters for training; with those, this issue should not arise.

Marloweeee commented 5 months ago

In fact, the problem is the pretraining weight setting in the trainer configuration file. Just comment out "pretrain_model", like this:

trainer:
  max_epoch: 60
  gpu_ids: '0'
  detr_model: 'checkpoint_best_MonoDETR.pth'
  save_frequency: 1    # checkpoint save interval (in epoch)
  resume_model: False
  # pretrain_model: 'configs/checkpoint_best_MonoDETR.pth'
  save_path: 'outputs/'
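
A side note on the workaround in the first comment: it pairs checkpoint values with model keys by position (zip), so a tensor can end up under the wrong key whenever the two orderings differ but the shapes happen to match. If someone still wants to load a checkpoint partially, matching by key name is probably safer. Below is a minimal sketch, not the repository's method. The helper name filter_matching and the stand-in Shaped type are made up for illustration; only the .shape attribute is used, so the same function should also work on real PyTorch state dicts.

```python
from collections import namedtuple


def filter_matching(checkpoint_state, model_state):
    """Split checkpoint entries into loadable ones and skipped keys.

    An entry is loadable only if the same key exists in the model
    and both tensors have the same shape.
    """
    loadable, skipped = {}, []
    for key, value in checkpoint_state.items():
        target = model_state.get(key)
        if target is not None and tuple(value.shape) == tuple(target.shape):
            loadable[key] = value
        else:
            skipped.append(key)
    return loadable, skipped


# Stand-in for tensors so the sketch runs without PyTorch (hypothetical type).
Shaped = namedtuple("Shaped", ["shape"])

# Shapes taken from the error message above; "backbone.w" is an invented key.
checkpoint_state = {
    "class_embed.0.weight": Shaped((3, 256)),
    "query_embed.weight": Shaped((50, 512)),
    "backbone.w": Shaped((64, 3)),
}
model_state = {
    "class_embed.0.weight": Shaped((9, 256)),
    "query_embed.weight": Shaped((1, 256)),
    "backbone.w": Shaped((64, 3)),
}

loadable, skipped = filter_matching(checkpoint_state, model_state)
print("loaded:", sorted(loadable))
print("skipped:", sorted(skipped))

# Assumed usage with PyTorch, where checkpoint_state is the checkpoint's state dict:
# loadable, skipped = filter_matching(checkpoint_state, model.state_dict())
# model.load_state_dict(loadable, strict=False)
```

Skipped layers keep their fresh initialization, so low accuracy is to be expected until they have been trained. Printing the skipped keys at least makes clear which layers those are.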