facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License

How to use the largest most capable highest quality model for OCR with nougat? #102

Closed brando90 closed 1 year ago

brando90 commented 1 year ago

Currently the small model is used:

(maf) brando9@ampere1~/data/maf_data/maf_pdfs $ for pdf in $pdfs; do
  mkdir -p $DESTINATION/$pdf/
  nougat $SOURCE/$pdf -o $DESTINATION/$pdf/ --markdown
done

downloading nougat checkpoint version 0.1.0-small to path /lfs/ampere1/0/brando9/.cache/torch/hub/nougat-0.1.0-small
config.json: 100%|██████████| 557/557 [00:00<00:00, 3.07Mb/s]
pytorch_model.bin: 100%|██████████| 956M/956M [00:13<00:00, 76.0Mb/s]
special_tokens_map.json: 100%|██████████| 96.0/96.0 [00:00<00:00, 641kb/s]
tokenizer.json: 100%|██████████| 2.04M/2.04M [00:00<00:00, 38.1Mb/s]
tokenizer_config.json: 100%|██████████| 106/106 [00:00<00:00, 739kb/s]

How do I use the larger one?

# Download larger model 
nougat_checkpoint="nougat-0.1.0"

torchhub nougat download --checkpoint $nougat_checkpoint

# Rest of script...

for pdf in $pdfs; do
  mkdir -p $DESTINATION/$pdf/

  # Use larger model
  nougat $SOURCE/$pdf -o $DESTINATION/$pdf/ --checkpoint $nougat_checkpoint --markdown
done

Does this work?

lukas-blecher commented 1 year ago

Instead, just call:

nougat $SOURCE/$pdf -o $DESTINATION/$pdf/ --markdown --model 0.1.0-base

brando90 commented 1 year ago

@lukas-blecher thank you! Curious, how big is it?

lukas-blecher commented 1 year ago

250M parameters (small) vs. 350M parameters (base); the base checkpoint is 1.3 GB.

brando90 commented 1 year ago

How do I make sure I'm using the 1.3B?

lukas-blecher commented 1 year ago

Add the argument --model 0.1.0-base when calling nougat:

nougat $SOURCE/$pdf -o $DESTINATION/$pdf/ --markdown --model 0.1.0-base
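
For reference, the loop from the original question could be combined with this flag roughly as follows. This is only a sketch: the `convert_pdfs` function name, the per-PDF output folder named after the file, and the `NOUGAT_BIN` variable (which lets you dry-run the loop with `echo`) are assumptions made for illustration and are not part of nougat. The flags `-o`, `--markdown`, and `--model 0.1.0-base` are the ones quoted in this thread.

```shell
#!/usr/bin/env bash
# Sketch: run nougat with the base model on every PDF in a directory.
# NOUGAT_BIN is a hypothetical override; set it to `echo` for a dry run.
NOUGAT_BIN="${NOUGAT_BIN:-nougat}"

convert_pdfs() {
  local source_dir="$1" dest_dir="$2" pdf name
  for pdf in "$source_dir"/*.pdf; do
    [ -e "$pdf" ] || continue            # skip if the glob matched nothing
    name="$(basename "$pdf" .pdf)"       # output folder named after the PDF
    mkdir -p "$dest_dir/$name"
    "$NOUGAT_BIN" "$pdf" -o "$dest_dir/$name" --markdown --model 0.1.0-base
  done
}
```

It would be called as `convert_pdfs "$SOURCE" "$DESTINATION"`. Setting `NOUGAT_BIN=echo` first prints each command instead of running it, which is a quick way to confirm that `--model 0.1.0-base` is being passed.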

brando90 commented 1 year ago

Sorry, I was unclear. I wanted to double-check that it was the 1.3B param model, but your comment confirms it. Thank you!

lukas-blecher commented 1 year ago

Just to clear this up: it's 1.3 gigabytes in size and has 350M parameters (see the paper for more info).
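
One rough way to see which checkpoints have been downloaded is to list the cache folder that appears in the download log earlier in this thread. The sketch below assumes the base checkpoint is stored next to the small one as `nougat-0.1.0-base` under `~/.cache/torch/hub`, which is inferred from the log line for `nougat-0.1.0-small` and not confirmed here. `TORCH_HUB_DIR` and `list_nougat_checkpoints` are hypothetical names used only in this example.

```shell
#!/usr/bin/env bash
# Sketch: show downloaded nougat checkpoint folders and their on-disk size.
# Assumption: checkpoints are cached as <hub dir>/nougat-<version>, as in the
# "downloading nougat checkpoint version 0.1.0-small to path ..." log line.
TORCH_HUB_DIR="${TORCH_HUB_DIR:-$HOME/.cache/torch/hub}"

list_nougat_checkpoints() {
  local dir
  for dir in "$TORCH_HUB_DIR"/nougat-*; do
    [ -d "$dir" ] || continue   # skip if nothing has been downloaded yet
    du -sh "$dir"               # prints the size, then the folder path
  done
}
```

Calling `list_nougat_checkpoints` after a run shows which checkpoints are present in the cache, not which one a given run loaded. The "downloading nougat checkpoint version ..." line printed on first use remains the more direct indication.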