OML-Team / open-metric-learning

Metric learning and retrieval pipelines, models and zoo.
https://open-metric-learning.readthedocs.io/en/latest/index.html
Apache License 2.0

Reasoning behind deleting ResnetExtractor weights for state_dict layer4 #610

Closed: Harshagarwal19 closed this issue 4 months ago

Harshagarwal19 commented 5 months ago

Hello, thanks for the amazing work! I have a question about the reason for removing the pre-trained weights for layer4 in ResnetExtractor.

https://github.com/OML-Team/open-metric-learning/blob/main/oml/models/resnet/extractor.py#L129

        state_dict = remove_prefix_from_state_dict(state_dict, "layer4.")  # type: ignore

Can you elaborate on why the parameters for layer4 are removed?

AlekseySh commented 5 months ago

@Harshagarwal19 Thank you! There is a misunderstanding: we don't remove the layer :) We only remove a prefix from the keys of the state dict, so the weights themselves remain the same.

Check it out: https://github.com/OML-Team/open-metric-learning/blob/bbf3562bb217b8cca863a283d3b88eee51b0d4f9/tests/test_oml/test_models/test_utils.py#L4
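
For intuition, here is a minimal sketch of the idea (a hypothetical re-implementation, not OML's actual code; the real remove_prefix_from_state_dict may differ in signature and details). The second argument acts as a marker: the helper looks for a key containing it, treats everything before that marker as a shared prefix, and strips that prefix from every key, leaving the tensors untouched:

    from collections import OrderedDict

    import torch


    def strip_detected_prefix(state_dict, trial_key):
        # Find a key containing `trial_key` and treat everything before it
        # as a shared prefix (e.g. "model." added by a training wrapper).
        prefix = ""
        for key in state_dict:
            if trial_key in key:
                prefix = key[: key.index(trial_key)]
                break
        if not prefix:
            return state_dict  # no prefix detected, nothing to strip
        # Strip the prefix from the keys; the values (weights) are untouched.
        return OrderedDict(
            (k[len(prefix):] if k.startswith(prefix) else k, v)
            for k, v in state_dict.items()
        )


    sd = OrderedDict({"model.layer4.0.conv1.weight": torch.zeros(3)})
    print(list(strip_detected_prefix(sd, "layer4.")))  # ['layer4.0.conv1.weight']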

Harshagarwal19 commented 5 months ago

Got it! In that case it seems like a safeguard, so that the state_dict keys match the model's weight names... Can you point out an instance where this is actually needed? I checked the resnet50 case, and it seems to have no effect.

AlekseySh commented 5 months ago

> it seems to have no effect.

That's expected in most cases, but when you train your model in DDP, the prefix "model." is added to the state_dict keys automatically. So you will have problems when you try to load your model outside of DDP.
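
A minimal illustration of how such a prefix appears (this is not DDP itself, just the same key-prefixing mechanics: any wrapper module that holds the network as an attribute prefixes the state_dict keys with that attribute's name):

    import torch.nn as nn


    class Wrapper(nn.Module):
        # Stands in for a training wrapper that stores the net as `self.model`,
        # which is where a "model." prefix in checkpoint keys comes from.
        def __init__(self, net):
            super().__init__()
            self.model = net


    net = nn.Linear(4, 2)
    print(list(net.state_dict()))           # ['weight', 'bias']
    print(list(Wrapper(net).state_dict()))  # ['model.weight', 'model.bias']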

Another example: you may train your model as part of a bigger model. For instance, you could have image and text encoders, but if you only need the image encoder, it also makes sense to remove the shared prefix, if there is one.
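
A sketch of that scenario (the key names here are hypothetical): keep only the image encoder's entries and drop their shared prefix so they load into a standalone encoder:

    import torch

    # Hypothetical checkpoint of a joint model with two encoders.
    full_sd = {
        "image_encoder.conv.weight": torch.zeros(3),
        "text_encoder.emb.weight": torch.zeros(3),
    }

    # Keep only the image encoder's weights and strip the shared prefix,
    # so the result loads into a standalone image encoder.
    image_sd = {
        k[len("image_encoder."):]: v
        for k, v in full_sd.items()
        if k.startswith("image_encoder.")
    }
    print(list(image_sd))  # ['conv.weight']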

Harshagarwal19 commented 4 months ago

Makes sense! Thanks a lot for your explanation.

AlekseySh commented 4 months ago

You are welcome.