huggingface / pixparse

Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data

Add OCR eval task, reworked #39

Closed. molbap closed this 9 months ago

molbap commented 9 months ago

The OCR eval task should be launchable with:

python -m pixparse.app.eval \
  --task cruller_eval_ocr \
  --source "pipe:aws s3 cp s3://....FUNSD_s16/FUNSD-0000{00..12}.tar -" \
  --format webdataset \
  --num-samples 400 \
  --batch-size 16 \
  --num-workers 8 \
  --model.name cruller_swin_384_to_1920 \
  --dtype bfloat16 \
  --output-dir /fsx/pablo/metrics_ocr \
  --checkpoint-path ....checkpoint-29.pt
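
For readers unfamiliar with the `{00..12}` notation in `--source`: it is a brace range that selects a numbered set of shards, and webdataset normally expands it itself when it opens the source. The sketch below only illustrates which shard file names the pattern covers. `expand_numeric_range` is a hypothetical helper written for this note, not part of pixparse or webdataset, and it handles just a single numeric range; the elided S3 prefix from the command is deliberately left out.

```python
import re


def expand_numeric_range(pattern: str) -> list[str]:
    """Expand one {start..end} numeric range in a shard pattern.

    Hypothetical helper for illustration only; webdataset performs
    its own brace expansion when it reads the source.
    """
    match = re.search(r"\{(\d+)\.\.(\d+)\}", pattern)
    if match is None:
        # No brace range present: the pattern names a single shard.
        return [pattern]

    start, end = match.group(1), match.group(2)
    # Keep zero padding only when a bound is written with a leading zero.
    padded = any(len(s) > 1 and s.startswith("0") for s in (start, end))
    width = max(len(start), len(end)) if padded else 0

    prefix = pattern[: match.start()]
    suffix = pattern[match.end():]
    return [
        f"{prefix}{str(i).zfill(width)}{suffix}"
        for i in range(int(start), int(end) + 1)
    ]


# File-name part of the pattern used in the command above.
shards = expand_numeric_range("FUNSD-0000{00..12}.tar")
print(len(shards), shards[0], shards[-1])
```

Under these assumptions the pattern covers 13 shards, from `FUNSD-000000.tar` through `FUNSD-000012.tar`.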