Relja / netvlad

NetVLAD: CNN architecture for weakly supervised place recognition
MIT License

Converting MATLAB-trained model .mat files (off-the-shelf and trained on Pittsburgh 30k) to PyTorch #43

Closed SharangKaul123 closed 2 years ago

SharangKaul123 commented 2 years ago

Hi, I am stuck trying to use the trained '.mat' model files in PyTorch. I want to use them as baseline models on a new dataset and measure recall. The problem is that PyTorch weight checkpoints (.pth.tar) are usually saved as an OrderedDict keyed by parameter name, whereas the weights in the .mat files are loaded as numpy.ndarray objects. I have not been able to load the models in my script and use them for inference.

So far I am loading the NetVLAD '.mat' model files with the 'loadmat' function from scipy.io. I suppose we then need to check the weight shapes and watch for flipped kernels, since some frameworks flip the conv kernels so that they compute a true convolution rather than a correlation. This is suggested in this thread: https://discuss.pytorch.org/t/how-to-transfer-a-trained-model-from-matlab-to-pytorch/15903
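The shape check and optional kernel flip mentioned above could be sketched roughly as follows. This is a minimal illustration, not a tested NetVLAD converter: it assumes the filter bank comes out of the '.mat' file with layout (H, W, C_in, C_out), which should be confirmed by printing the array shape after loading. The helper name `matconvnet_to_pytorch_conv` and the `flip_kernels` flag are hypothetical. PyTorch's Conv2d expects weights as (C_out, C_in, H, W) and computes a cross-correlation, so the flip is only needed if the source framework really applied a flipped kernel.

```python
import numpy as np


def matconvnet_to_pytorch_conv(filters, flip_kernels=False):
    """Convert a conv filter bank from (H, W, C_in, C_out) to (C_out, C_in, H, W).

    The input layout is an assumption about how the .mat file stores filters;
    print the array shape after loading to confirm it.
    """
    filters = np.asarray(filters, dtype=np.float32)
    if filters.ndim != 4:
        raise ValueError(f"expected a 4-D filter bank, got shape {filters.shape}")
    # Move output channels first, input channels second, spatial dims last.
    weights = filters.transpose(3, 2, 0, 1)
    if flip_kernels:
        # Flip both spatial axes. Only needed if the source framework applied
        # a true convolution while the target applies cross-correlation.
        weights = weights[:, :, ::-1, ::-1]
    # Return a contiguous copy: the transposed/flipped view has unusual
    # strides, and a reversed view cannot be wrapped by torch.from_numpy.
    return np.ascontiguousarray(weights)


# Toy filter bank standing in for an array read with scipy.io.loadmat.
toy = np.arange(3 * 3 * 2 * 4, dtype=np.float32).reshape(3, 3, 2, 4)
converted = matconvnet_to_pytorch_conv(toy)
flipped = matconvnet_to_pytorch_conv(toy, flip_kernels=True)
print(converted.shape)
```

Whether the flip is required is best settled empirically: run one image through both the original and the converted layer and see which setting reproduces the MATLAB output.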

Another option is to export the model to ONNX and then convert that to a PyTorch model, as described here: https://stackoverflow.com/questions/66223768/how-to-import-deep-learning-models-from-matlab-to-pytorch The drawback of this approach is that you need MATLAB to export each model to '.onnx' before a script can convert it to PyTorch.

A better solution, or a script that does this, would be really helpful for future work.

Relja commented 2 years ago

Hi,

Sorry, but I don't have hands-on experience with this and haven't used PyTorch.

A few suggestions:

I guess that, beyond just importing the weights, slight differences in implementation across frameworks could be problematic. For example, there may be small padding differences, and I recall reading that some nets are extremely sensitive to the exact method used to resize images, etc.
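One way to detect such discrepancies is to compare the descriptor exported from MATLAB with the one produced by the ported model for the same image. The sketch below is only an illustration under assumptions: the helper `compare_features` is hypothetical, and the arrays are synthetic stand-ins rather than real NetVLAD outputs.

```python
import numpy as np


def compare_features(reference, candidate):
    """Return (max absolute difference, cosine similarity) between two arrays."""
    ref = np.asarray(reference, dtype=np.float64).ravel()
    cand = np.asarray(candidate, dtype=np.float64).ravel()
    if ref.shape != cand.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {cand.shape}")
    max_abs_diff = float(np.max(np.abs(ref - cand)))
    denom = np.linalg.norm(ref) * np.linalg.norm(cand)
    cosine = float(ref @ cand / denom) if denom > 0 else float("nan")
    return max_abs_diff, cosine


# Synthetic stand-ins: in practice `reference` would be a descriptor exported
# from MATLAB and `candidate` the ported model's output for the same image.
rng = np.random.default_rng(0)
reference = rng.standard_normal(4096)
candidate = reference + 1e-6 * rng.standard_normal(4096)
max_diff, cos = compare_features(reference, candidate)
print(f"max abs diff: {max_diff:.2e}, cosine similarity: {cos:.8f}")
```

Running the same comparison on intermediate layer outputs, rather than only on the final descriptor, should help locate where a padding or resizing difference first appears.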