facebookresearch / nougat

Implementation of Nougat: Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License

Torch.Size Error #16

Closed litetoooooom closed 1 year ago

litetoooooom commented 1 year ago
  1. download 0.1.0-base
  2. pip install nougat-ocr
  3. nougat test.pdf -c nougat_model_base -o ./

ERROR message:

  File "/path/lib/python3.9/site-packages/predict.py", line 78, in main
    model = NougatModel.from_pretrained(args.checkpoint).to(torch.bfloat16)
  File "/path/lib/python3.9/site-packages/nougat/model.py", line 682, in from_pretrained
    model = super(NougatModel, cls).from_pretrained(
  File "/path/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2379, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/path/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2695, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for NougatModel:
  size mismatch for decoder.model.model.decoder.embed_positions.weight: copying a param with shape torch.Size([4098, 1024]) from checkpoint, the shape in current model is torch.Size([3586, 1024]).
  You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.
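The 4098-versus-3586 row counts suggest the checkpoint and the instantiated model disagree on max_length. A minimal sketch of the arithmetic, assuming the decoder uses mBART-style learned positional embeddings (which reserve 2 extra rows; the offset is an assumption, not stated in the traceback):

```python
# Assumption: mBART-style learned positional embeddings allocate
# max_length + 2 rows, so the embedding-table size encodes max_length.
OFFSET = 2

checkpoint_rows = 4098  # shape reported for the downloaded weights
model_rows = 3586       # shape the freshly built model expects

print(checkpoint_rows - OFFSET)  # 4096: max_length of the base checkpoint
print(model_rows - OFFSET)       # 3584: max_length the local config produced
```

If those two values differ, the config.json next to the weights does not describe the checkpoint being loaded.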

lukas-blecher commented 1 year ago

I was unable to reproduce the issue. Do you have the correct config.json?
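A quick way to act on that question, assuming the 0.1.0 checkpoints ship a config.json with a max_length field (the field name and expected value are assumptions; adjust to the actual file):

```python
# Hypothetical sanity check: confirm the config.json sitting next to the
# downloaded weights matches the base checkpoint. The contents are inlined
# here so the snippet is self-contained; in practice, read the real
# nougat_model_base/config.json instead.
import json

config_text = '{"max_length": 4096}'  # stand-in for the real file contents
config = json.loads(config_text)

# The base weights expect 4096 decoder positions; the small weights fewer.
assert config["max_length"] == 4096, "config.json does not match the base weights"
print("config matches the base checkpoint")
```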

runrunrun1994 commented 1 year ago

I also ran into a similar issue with the small model weights (torch==1.13.0, timm==0.9.6, transformers==4.32):

    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for NougatModel:
  size mismatch for encoder.model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
  size mismatch for encoder.model.layers.1.downsample.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
  size mismatch for encoder.model.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
  size mismatch for encoder.model.layers.2.downsample.norm.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
  size mismatch for encoder.model.layers.2.downsample.norm.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
  size mismatch for encoder.model.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
  You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.

QUEST2179 commented 1 year ago

I also encountered this size mismatch issue. Has anyone solved it yet? Thanks!

lukas-blecher commented 1 year ago

Can you downgrade timm to timm==0.5.4 and try again?
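For reference, the downgrade plus a retry would look like this (the 0.5.4 pin comes from the comment above; the conversion command mirrors the original repro steps):

```shell
# Pin timm to the version suggested in this thread.
pip install "timm==0.5.4"

# Re-run the original conversion from the repro steps.
nougat test.pdf -c nougat_model_base -o ./
```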