Hugging Face 'Noxilus/doctr-torch-parseq-german' model causes error

yumikim381 commented 8 months ago

Bug description

When loading the 'Noxilus/doctr-torch-parseq-german' model from the Hugging Face Hub, I get a size mismatch error.

Code snippet to reproduce the bug

    from doctr.models import from_hub, ocr_predictor

    # Load a custom recognition model from the Hugging Face Hub
    reco_model = from_hub('Noxilus/doctr-torch-parseq-german')

    # Plug the loaded model into the OCR predictor
    model = ocr_predictor(det_arch='db_resnet50', reco_arch=reco_model, pretrained=True)

Error traceback

    RuntimeError                              Traceback (most recent call last)
    in <cell line: 5>()
         11 # Load a custom recognition model from huggingface hub
         12 #reco_model = from_hub('Noxilus/doctr-torch-parseq-german')
    ---> 13 reco_model = from_hub('Noxilus/doctr-torch-parseq-german')
         14
         15 # You can easily plug in this models to the OCR predictor

    1 frames
    /usr/local/lib/python3.10/dist-packages/doctr/models/factory/hub.py in from_hub(repo_id, **kwargs)
        233     if is_torch_available():
        234         state_dict = torch.load(hf_hub_download(repo_id, filename="pytorch_model.bin", **kwargs), map_location="cpu")
    --> 235         model.load_state_dict(state_dict)
        236     else:  # tf
        237         repo_path = snapshot_download(repo_id, **kwargs)

    /usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict, assign)
       2151
       2152         if len(error_msgs) > 0:
    -> 2153             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
       2154                 self.__class__.__name__, "\n\t".join(error_msgs)))
       2155         return _IncompatibleKeys(missing_keys, unexpected_keys)

    RuntimeError: Error(s) in loading state_dict for PARSeq:
        size mismatch for head.weight: copying a param with shape torch.Size([196, 384]) from checkpoint, the shape in current model is torch.Size([84, 384]).
        size mismatch for head.bias: copying a param with shape torch.Size([196]) from checkpoint, the shape in current model is torch.Size([84]).
        size mismatch for embed.embedding.weight: copying a param with shape torch.Size([198, 384]) from checkpoint, the shape in current model is torch.Size([86, 384]).

Environment

Collecting environment information...

    DocTR version: v0.8.1
    TensorFlow version: 2.15.0
    PyTorch version: 2.2.1+cu121 (torchvision 0.17.1+cu121)
    OpenCV version: 4.8.0
    OS: Ubuntu 22.04.3 LTS
    Python version: 3.10.12
    Is CUDA available (TensorFlow): No
    Is CUDA available (PyTorch): No
    CUDA runtime version: 12.2.140
    GPU models and configuration: Could not collect
    Nvidia driver version: Could not collect
    cuDNN version: Probably one of the following:
    /usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.6
    /usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.6
    /usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.6
    /usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.6
    /usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.6
    /usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.6
    /usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.6

Deep Learning backend

    is_tf_available: False
    is_torch_available: True

felixdittrich92 commented 8 months ago

Hi @yumikim381 :wave:, I checked this, and it looks like the user made a mistake while uploading the model: the vocab in the config file seems to differ from the one that was used for training. Unfortunately we can't do anything about this; the owner of the shared model needs to fix it.
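For anyone hitting the same error, here is a rough sketch of how the shapes in the traceback relate to the vocab length. It assumes that PARSeq sizes its output head as `len(vocab) + 1` rows and its embedding as `len(vocab) + 3` rows; that relation is inferred from the shapes in the error message, not confirmed in this thread, and `implied_vocab_len` is just an illustrative helper name.

```python
# Shapes copied from the RuntimeError in the bug report.
CHECKPOINT_SHAPES = {
    "head.weight": (196, 384),
    "embed.embedding.weight": (198, 384),
}
MODEL_SHAPES = {
    "head.weight": (84, 384),
    "embed.embedding.weight": (86, 384),
}


def implied_vocab_len(shapes):
    """Return the vocab length implied by the head and embedding shapes.

    Assumption (not confirmed in the thread): the head has len(vocab) + 1
    rows and the embedding has len(vocab) + 3 rows.
    """
    head_len = shapes["head.weight"][0] - 1
    embed_len = shapes["embed.embedding.weight"][0] - 3
    if head_len != embed_len:
        raise ValueError("head and embedding imply different vocab lengths")
    return head_len


print(implied_vocab_len(CHECKPOINT_SHAPES))  # 195
print(implied_vocab_len(MODEL_SHAPES))       # 83
```

Under that assumption the weights were trained with a 195-character vocab, while the model built from the uploaded config expects 83 characters, which would match a config/training vocab mismatch.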

You can use tilman-rassy/doctr-crnn-vgg16-bn-fascan-v1 (trained on a German + French vocab) or Felix92/doctr-torch-parseq-multilingual-v1 (fine-tuned on the multilingual vocab, see https://mindee.github.io/doctr/modules/datasets.html#supported-vocabs) instead.

Or train your own model. You can generate some synthetic data with https://github.com/felixdittrich92/synthtiger/tree/doctr-modified (it's not really clean yet, but it works :sweat_smile:; I plan to clean and simplify it in the upcoming weeks).
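A minimal sketch of swapping in one of the shared models suggested above, using the same `from_hub` and `ocr_predictor` calls as the snippet in the bug report. `build_predictor` and `DEFAULT_REPO` are names made up for this example, and the first call needs network access to download the weights.

```python
DEFAULT_REPO = "Felix92/doctr-torch-parseq-multilingual-v1"


def build_predictor(repo_id=DEFAULT_REPO):
    """Build an OCR predictor with a recognition model pulled from the hub."""
    # Imported here so the module itself loads even without docTR installed.
    from doctr.models import from_hub, ocr_predictor

    # Download the recognition model (config + weights) from the hub.
    reco_model = from_hub(repo_id)

    # Same detection architecture as in the original report.
    return ocr_predictor(det_arch="db_resnet50", reco_arch=reco_model, pretrained=True)
```

Calling `build_predictor()` should return a predictor that can be used the same way as the `model` object in the original snippet.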

yumikim381 commented 8 months ago

Thanks a lot for your reply!! :D

felixdittrich92 commented 8 months ago

So I think we can close this :)
