cheerss / CrossFormer

The official code for the paper: https://openreview.net/forum?id=_PHymLIxuI
MIT License

Some wrongs with the pre-trained model crossformer-b.pth #9

Closed hongsheng-Z closed 2 years ago

hongsheng-Z commented 2 years ago

Hi, thanks for your great work. I am using your crossformer_base as the backbone network for a downstream tracking task. But when I load your pre-trained model, many unexpected key(s) appear. My loading code is as follows:

```python
ckpt = torch.load(ckpt_path, map_location='cpu')
missing_keys, unexpected_keys = backbone.body.load_state_dict(ckpt['model'], strict=False)
```

The result is as follows:

```
unexpected keys: ['norm.weight', 'norm.bias', 'head.weight', 'head.bias', 'layers.0.blocks.0.attn.biases', 'layers.0.blocks.0.attn.relative_position_index', 'layers.0.blocks.1.attn.biases', .....
```

cheerss commented 2 years ago

It is expected behavior that there are some unexpected keys.

'norm.weight', 'norm.bias', 'head.weight', and 'head.bias' are the weights of the classification head, which downstream models do not contain. The others, e.g. 'layers.0.blocks.0.attn.biases', are indices for the DPB module; these are non-trainable parameters that can also be computed dynamically without affecting the results. (They are computed dynamically for downstream models here to support dynamic image sizes.)
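If the warning is bothersome, one option is to filter these keys out of the checkpoint before calling `load_state_dict`. A minimal sketch, assuming the key names reported in this issue and a `backbone` variable standing in for the downstream model (both `strip_classifier_keys` and the filtering rules are illustrative, not part of the official repo):

```python
import torch

def strip_classifier_keys(state_dict):
    """Drop the classification-head weights and the non-trainable DPB
    index buffers that a downstream (e.g. tracking) backbone does not
    expect. All other entries pass through unchanged."""
    # Exact keys belonging to the classification head and final norm.
    drop_exact = {"norm.weight", "norm.bias", "head.weight", "head.bias"}
    # Per-block DPB index buffers; these are recomputed dynamically
    # downstream, so they can be safely discarded.
    drop_suffixes = (".attn.biases", ".attn.relative_position_index")
    return {
        k: v for k, v in state_dict.items()
        if k not in drop_exact and not k.endswith(drop_suffixes)
    }

# Hypothetical usage, mirroring the loading code from this issue:
# ckpt = torch.load(ckpt_path, map_location="cpu")
# filtered = strip_classifier_keys(ckpt["model"])
# missing, unexpected = backbone.body.load_state_dict(filtered, strict=False)
```

With the filtered dict, `unexpected_keys` should come back empty while the backbone weights still load as before.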

hongsheng-Z commented 2 years ago

Thanks for your reply, it helped me a lot.