wisdomikezogwo / quilt1m

[NeurIPS 2023 Oral] Quilt-1M: One Million Image-Text Pairs for Histopathology.
https://quilt1m.github.io/
MIT License

Error on loading QuiltNet-B-16 #20

Closed SongDoHou closed 7 months ago

SongDoHou commented 9 months ago

Hi, the error occurs when running the command below:

from transformers import CLIPModel

model = CLIPModel.from_pretrained("wisdomik/QuiltNet-B-16", use_auth_token=None)

The error message is:

RuntimeError: Error(s) in loading state_dict for CLIPModel:
    size mismatch for vision_model.embeddings.patch_embedding.weight: copying a param with shape torch.Size([768, 3, 16, 16]) from checkpoint, the shape in current model is torch.Size([768, 3, 32, 32]).
    size mismatch for vision_model.embeddings.position_embedding.weight: copying a param with shape torch.Size([197, 768]) from checkpoint, the shape in current model is torch.Size([50, 768]).
    You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.

Is there an issue with the model, or is something wrong with my dev environment?

wisdomikezogwo commented 9 months ago

Hi, try using this instead:

import open_clip

model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms('hf-hub:wisdomik/QuiltNet-B-16')
tokenizer = open_clip.get_tokenizer('hf-hub:wisdomik/QuiltNet-B-16')
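
For reference, a minimal zero-shot inference sketch built on the loader above; the image path and prompt texts are placeholders, not from the thread:

import torch
from PIL import Image
import open_clip

model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms('hf-hub:wisdomik/QuiltNet-B-16')
tokenizer = open_clip.get_tokenizer('hf-hub:wisdomik/QuiltNet-B-16')
model.eval()

# Preprocess one histopathology patch (placeholder path) and a few candidate captions.
image = preprocess_val(Image.open("patch.png").convert("RGB")).unsqueeze(0)
text = tokenizer(["an H&E image of adenocarcinoma", "an H&E image of normal tissue"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Cosine similarity between the image and each caption, scaled and softmaxed.
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)
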
byte-dance commented 8 months ago

I wonder which of 'pytorch_model.bin' and 'open_clip_pytorch_model.bin' is the right checkpoint that you trained yourselves?

wisdomikezogwo commented 7 months ago

Training outputs open_clip_pytorch_model.bin, and we also convert it to pytorch_model.bin for compatibility with loading the models directly through the HF CLIP modules rather than only through open_clip. Both contain the same weights, trained as discussed in the paper.
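
As a rough sanity check that both files hold the same weights, one could compare their tensor sizes; this is only a sketch (not from the thread), and since key names differ between the open_clip and Hugging Face layouts only element counts are compared here:

import torch
from huggingface_hub import hf_hub_download

repo_id = "wisdomik/QuiltNet-B-16"

# Download both checkpoint files from the Hub.
open_clip_path = hf_hub_download(repo_id, "open_clip_pytorch_model.bin")
hf_path = hf_hub_download(repo_id, "pytorch_model.bin")

open_clip_sd = torch.load(open_clip_path, map_location="cpu")
hf_sd = torch.load(hf_path, map_location="cpu")

def total_elements(state_dict):
    # Sum the element counts of all tensors in the state dict.
    return sum(t.numel() for t in state_dict.values() if torch.is_tensor(t))

# Totals should be very close; small differences can come from non-parameter
# buffers (e.g. position ids) present in only one of the two formats.
print("open_clip_pytorch_model.bin:", len(open_clip_sd), "tensors,", total_elements(open_clip_sd), "elements")
print("pytorch_model.bin:          ", len(hf_sd), "tensors,", total_elements(hf_sd), "elements")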