ioangatop opened this issue 1 week ago
@ioangatop if you want classifier weights loaded into feature-extraction wrapped models, you need to load the weights as 'pretrained' so that they are loaded before the model is mutated.
See the related discussion; this should work with timm version >= 0.9: https://github.com/huggingface/pytorch-image-models/discussions/1941
Although, the example in that discussion should be a bit different: use the 'overlay' arg as in the train script https://github.com/huggingface/pytorch-image-models/blob/d4ef0b4d589c9b0cb1d6240ff373c5508dbb8023/train.py#L463-L468
The overlay dict is merged with the model's normal pretrained_cfg, while the pretrained_cfg arg fully overrides it.
As an alternative to using the `file` key in the pretrained_cfg override dict, you can also use `url` to download from somewhere else, or `hf_hub_id` for a HF hub location.
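The merge-vs-override semantics described above can be sketched with plain dicts (the field names mirror timm's pretrained_cfg keys, but the paths/URLs here are hypothetical and timm stores the real cfg in a dataclass):

```python
# A model's normal pretrained_cfg (illustrative values only).
base_cfg = {
    'url': 'https://example.org/orig_weights.pth',  # hypothetical source
    'num_classes': 1000,
    'input_size': (3, 224, 224),
}

# pretrained_cfg_overlay: merged on top of the normal cfg, so untouched
# keys like input_size survive while 'file' redirects the weight source.
overlay = {'file': './my_checkpoint.pth', 'num_classes': 10}  # hypothetical path
merged = {**base_cfg, **overlay}

# pretrained_cfg (full override): replaces the cfg wholesale, so any key
# you do not repeat is simply gone.
replaced = dict(overlay)

print(merged['input_size'])        # kept from base_cfg by the merge
print('input_size' in replaced)    # lost under a full override
```

Swapping `'file'` for `'url'` or `'hf_hub_id'` in the overlay dict would point the loader at a remote file or a HF Hub repo instead, per the comment above.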
Describe the bug
Hi Ross! I'm facing a small issue with the features extractor, here are some details:
The function `create_model` supports the argument `checkpoint_path`, which allows loading custom model weights. However, when we want to load a model as a feature extractor, the model is wrapped in the `FeatureGetterNet` class, and the loading fails as the keys do not match anymore; `FeatureGetterNet` stores the model under `self.model`, so in order to work, the state dict keys would need a `model.` prefix, for example `class_token` -> `model.class_token`.
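The key mismatch can be illustrated with a small dict sketch (dummy keys and values; in real code the checkpoint would come from `torch.load`):

```python
# A flat ViT-style checkpoint, as saved from an unwrapped model.
ckpt = {'class_token': 0, 'pos_embed': 1, 'blocks.0.attn.qkv.weight': 2}

# The wrapper stores the network under self.model, so every key needs a
# 'model.' prefix before load_state_dict on the wrapper would accept it.
prefixed = {f'model.{k}': v for k, v in ckpt.items()}

print(sorted(prefixed))
```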
Additionally, one workaround is to load the weights after the model initialisation, but this also fails because some networks, like vision transformers, prune some layers in feature-extraction mode, and thus the checkpoint's state_dict has extra keys.
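A manual workaround for the extra-key failure is to filter the checkpoint down to the keys the pruned model still has before loading (equivalent in spirit to `load_state_dict(strict=False)`); sketched here with dummy dicts, where in real code `model_keys` would come from `model.state_dict().keys()`:

```python
# Keys the pruned feature-extraction model still has (illustrative).
model_keys = {'class_token', 'pos_embed', 'blocks.0.attn.qkv.weight'}

# Checkpoint saved from the full model: includes pruned layers (the head).
ckpt = {'class_token': 0, 'pos_embed': 1, 'head.weight': 2, 'head.bias': 3}

# Keep only entries the current model can accept, and note what it would
# be missing, so the mismatch is visible instead of a hard error.
filtered = {k: v for k, v in ckpt.items() if k in model_keys}
missing = sorted(model_keys - ckpt.keys())

print(sorted(filtered), missing)
```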
To Reproduce
As always, thanks a lot 🙏