cvg / glue-factory

Training library for local feature detection and matching
Apache License 2.0

Mismatched model shape upon MegaDepth finetuning #64

Closed mattiasmar closed 9 months ago

mattiasmar commented 9 months ago

I trained LightGlue on the homographies and then moved on to finetuning against MegaDepth. Right at the start of the MegaDepth training script, when loading the model state, this exception is thrown:

Exception has occurred: RuntimeError
Error(s) in loading state_dict for TwoViewPipeline:
    size mismatch for matcher.posenc.Wr.weight: copying a param with shape torch.Size([32, 2]) from checkpoint, the shape in current model is torch.Size([32, 4]).
  File "/mnt/nvme0n1/data/home/mmarder/glue-factory/gluefactory/models/base_model.py", line 133, in load_state_dict
    ret = super().load_state_dict(*args, **kwargs)

Any ideas what to check? What could have gone wrong? I don't think I changed anything substantially. I just played a bit with feature types and batch size.

mattiasmar commented 9 months ago

I've repeated this experiment with SIFT and I got the same error:

Exception has occurred: RuntimeError
Error(s) in loading state_dict for TwoViewPipeline:
    size mismatch for matcher.posenc.Wr.weight: copying a param with shape torch.Size([32, 2]) from checkpoint, the shape in current model is torch.Size([32, 4]).

However, with SuperPoint and ALIKED the exception does not occur. One obvious difference is that SIFT doesn't have a torch model, whereas SuperPoint and ALIKED do. Could that break the TwoViewPipeline?
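
To narrow this down, one thing worth checking is the shape saved in the homography checkpoint versus what the MegaDepth model expects. A minimal sketch, assuming the checkpoint is a plain torch.save dict; the path and the "model" key are illustrative and may differ from the actual experiment layout:

import torch

# Load the pretrained homography checkpoint (path is a placeholder).
ckpt = torch.load("outputs/training/sp+lg_homography/checkpoint_best.tar", map_location="cpu")
state = ckpt.get("model", ckpt)  # fall back to the raw dict if there is no "model" key
print(state["matcher.posenc.Wr.weight"].shape)
# torch.Size([32, 2]) -> positional encoding was trained on (x, y) only
# torch.Size([32, 4]) -> positional encoding was trained on (x, y, scale, orientation)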

mattiasmar commented 9 months ago

The parameter model.matcher.add_scale_ori (in the config yaml) must be set to the same value as was used during (pre)training on the homography task (obviously).

The default value of this parameter is false. When set to true, the number of in-features of the Learnable Fourier Positional Encoding is 4; otherwise it is 2.
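
For reference, a small sketch of the relationship described above (illustrative only; the variable names are not the actual glue-factory internals):

add_scale_ori = False  # model.matcher.add_scale_ori, must match the homography pretraining run
# keypoint (x, y) coordinates, optionally extended with (scale, orientation)
posenc_in_features = 2 + (2 if add_scale_ori else 0)
# -> 2 when False, 4 when True, which is exactly the [32, 2] vs. [32, 4] mismatch above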