facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License

Layer size Mismatch while loading the model #136

Closed neilgautam closed 11 months ago

neilgautam commented 12 months ago

While loading the Nougat 0.1.0-base model, I get the error below. I get the same error with the 0.1.0-small checkpoint. I'm using predict.py from the latest cloned repository without any code changes, and I still hit this issue.

(/home/jupyter/system_config/nougat_hf) neil@instance-1:/home/jupyter/nougat-processing/nougat$ python predict.py --model "0.1.0-small" --out "/home/jupyter/nougat-processing/nougat/output" --pdf "/home/jupyter/annual_reports_pdf_year_wise/2020"
downloading nougat checkpoint version 0.1.0-small to path /home/neil/.cache/torch/hub/nougat-0.1.0-small
config.json: 100%|██████████| 557/557 [00:00<00:00, 1.87Mb/s]
pytorch_model.bin: 100%|██████████| 956M/956M [00:57<00:00, 17.5Mb/s]
special_tokens_map.json: 100%|██████████| 96.0/96.0 [00:00<00:00, 415kb/s]
tokenizer.json: 100%|██████████| 2.04M/2.04M [00:00<00:00, 8.57Mb/s]
tokenizer_config.json: 100%|██████████| 106/106 [00:00<00:00, 449kb/s]
INFO:root:Found 4233 files.
/home/neil/.cache/torch/hub/nougat-0.1.0-small
/home/jupyter/system_config/nougat_hf/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Traceback (most recent call last):
  File "/home/jupyter/nougat-processing/nougat/predict.py", line 212, in <module>
    main()
  File "/home/jupyter/nougat-processing/nougat/predict.py", line 128, in main
    model = NougatModel.from_pretrained(args.checkpoint)
  File "/home/jupyter/nougat-processing/nougat/nougat/model.py", line 684, in from_pretrained
    model = super(NougatModel, cls).from_pretrained(
  File "/home/jupyter/system_config/nougat_hf/lib/python3.9/site-packages/transformers/modeling_utils.py", line 3301, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/jupyter/system_config/nougat_hf/lib/python3.9/site-packages/transformers/modeling_utils.py", line 3750, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for NougatModel:
	size mismatch for encoder.model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
	size mismatch for encoder.model.layers.1.downsample.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
	size mismatch for encoder.model.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
	size mismatch for encoder.model.layers.2.downsample.norm.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
	size mismatch for encoder.model.layers.2.downsample.norm.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
	size mismatch for encoder.model.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
	You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.
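For anyone debugging a similar traceback, the size-mismatch messages above amount to a per-parameter shape comparison between the checkpoint and the freshly built model. The sketch below is only an illustration in plain Python: `find_shape_mismatches` is a hypothetical helper (not part of nougat or transformers), and the two dictionaries hold a few shapes copied from the log plus one made-up matching entry. With real PyTorch objects, the dictionaries could presumably be built with something like `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for parameters present
    in both dictionaries whose shapes differ. Hypothetical helper."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# Shapes taken from the error message in this issue.
checkpoint_shapes = {
    "encoder.model.layers.1.downsample.norm.weight": (1024,),
    "encoder.model.layers.1.downsample.reduction.weight": (512, 1024),
    # Made-up entry, included only to show a parameter that matches.
    "example.matching.weight": (8, 8),
}
model_shapes = {
    "encoder.model.layers.1.downsample.norm.weight": (512,),
    "encoder.model.layers.1.downsample.reduction.weight": (256, 512),
    "example.matching.weight": (8, 8),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)

if __name__ == "__main__":
    for name, ckpt_shape, model_shape in mismatches:
        print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

Every mismatched parameter in the log sits under a `downsample` module, and each checkpoint shape is twice the size the current model expects. That pattern suggests the model definition changed between library versions, rather than the checkpoint being corrupted.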

lukas-blecher commented 12 months ago

What timm version do you have installed?

neilgautam commented 12 months ago

Hey @lukas-blecher, this issue is solved now. The problem went away once I used timm 0.4.12. Initially I was using the latest version, which caused the issue because of its naming convention.
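A quick way to check for this before loading the model is to compare the installed timm version with the one reported to work in this thread. This is a minimal sketch using only the standard library: the helper names (`parse_version`, `installed_version`, `matches_known_good`) are hypothetical, and the only grounded fact is that 0.4.12 worked for the reporter. Other versions may work too.

```python
from importlib import metadata
from typing import Optional, Tuple

# Version reported to work in this thread; other versions are untested here.
KNOWN_GOOD_TIMM = "0.4.12"


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version string,
    e.g. "0.4.12" -> (0, 4, 12). Stops at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package: str = "timm") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_known_good(installed: Optional[str],
                       known_good: str = KNOWN_GOOD_TIMM) -> bool:
    """True only if `installed` equals the known-good version."""
    if installed is None:
        return False
    return parse_version(installed) == parse_version(known_good)


if __name__ == "__main__":
    found = installed_version("timm")
    if matches_known_good(found):
        print(f"timm {found} matches the version reported to work.")
    else:
        print(f"timm {found!r} differs from {KNOWN_GOOD_TIMM}; "
              "the checkpoint may fail to load with size mismatches.")
```

If the versions differ, pinning the package with `pip install timm==0.4.12` in the same environment reproduces the fix described above.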