Calamari-OCR / calamari

Line based ATR Engine based on OCRopy
Apache License 2.0

calamari-eval: unknown arguments #324

Open bertsky opened 2 years ago

bertsky commented 2 years ago

I am on Calamari 2.2.2, and when freely combining the arguments I see on --help

calamari-eval --checkpoint hsbfraktur.cala/best.ckpt.json --gt.preload false --n_worst_lines 10 --gt.texts /dev/shm/hsbfraktur.val/*.gt.txt --evaluator.progress_bar false

…I end up with the following cryptic error message…

             tfaip.util.logging: Uncaught exception
Traceback (most recent call last):
  File "/home/h1/rosa992c/my-kernel/powerai-kernel2/bin/calamari-eval", line 8, in <module>
    sys.exit(run())
  File "/home/h1/rosa992c/my-kernel/powerai-kernel2/lib/python3.7/site-packages/calamari_ocr/scripts/eval.py", line 200, in run
    main(parse_args())
  File "/home/h1/rosa992c/my-kernel/powerai-kernel2/lib/python3.7/site-packages/calamari_ocr/scripts/eval.py", line 206, in parse_args
    return parser.parse_args(args=args).root
  File "/home/h1/rosa992c/my-kernel/powerai-kernel2/lib/python3.7/site-packages/paiargparse/main_parser.py", line 93, in parse_args
    raise UnknownArgumentError(f"Unknown Arguments {' '.join(argv)}. Possible alternatives:{''.join(help_str)}")
paiargparse.dataclass_parser.UnknownArgumentError: Unknown Arguments  . Possible alternatives:

bertsky commented 2 years ago

Also: calamari-eval does not exit with a non-zero status in case of such errors.
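
For readers who want to see what the expected behaviour looks like, here is a minimal sketch of a command-line entry point that turns an argument error into a non-zero exit status. It is not Calamari's code: `UnknownArgumentError` is a local stand-in class, and `parse_args`, `run` and the set of known flags are hypothetical names chosen for illustration.

```python
import sys


class UnknownArgumentError(Exception):
    """Local stand-in for a parser's 'unknown argument' error (hypothetical)."""


def parse_args(argv):
    # Hypothetical parser: accepts only a small, fixed set of flags.
    known = {"--checkpoint", "--gt.texts"}
    unknown = [a for a in argv if a.startswith("--") and a not in known]
    if unknown:
        raise UnknownArgumentError(f"Unknown Arguments {' '.join(unknown)}")
    return argv


def run(argv):
    """Return a process exit status: 0 on success, 2 on a usage error."""
    try:
        parse_args(argv)
    except UnknownArgumentError as err:
        print(err, file=sys.stderr)
        return 2
    return 0


if __name__ == "__main__":
    # sys.exit() passes the status on to the calling shell or script.
    sys.exit(run(sys.argv[1:]))
```

With a status like this, a calling script can detect the failure without parsing the log output.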

andbue commented 2 years ago

This is really weird. Maybe the number of files in hsbfraktur.val is too large for the shell? Could you please try the following, to see if resolving the wildcard in Python helps?

calamari-eval --checkpoint hsbfraktur.cala/best.ckpt.json --gt.preload false --n_worst_lines 10 --gt.texts "/dev/shm/hsbfraktur.val/*.gt.txt" --evaluator.progress_bar false

bertsky commented 2 years ago

Sorry, I should have quoted properly. Yes, the file paths after expanding the glob expression would never fit into a memory page. So I always pass it as a glob to Calamari.

It turns out that --n_worst_lines 10 was the culprit.

Also in Calamari 1.0.5 BTW.
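
To illustrate the quoting point above: a quoted pattern reaches the program as a single argument, so the program can expand it itself and the operating system's limit on argument-list length never comes into play. The sketch below shows that idea with the standard `glob` module. It is only an illustration, not Calamari's actual loader, and `expand_patterns` is a hypothetical helper name.

```python
import glob
import os
import tempfile


def expand_patterns(patterns):
    """Expand each glob pattern in-process and return a sorted list of paths."""
    paths = []
    for pattern in patterns:
        paths.extend(glob.glob(pattern))
    return sorted(paths)


# Demo on a throwaway directory with three ground-truth files and one other file.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        open(os.path.join(tmp, f"line{i}.gt.txt"), "w").close()
    open(os.path.join(tmp, "notes.md"), "w").close()

    # The pattern is passed as ONE string, just like a quoted shell argument.
    found = expand_patterns([os.path.join(tmp, "*.gt.txt")])
    names = [os.path.basename(p) for p in found]
    print(names)
```

Only the files matching `*.gt.txt` are returned; `notes.md` is ignored.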

bertsky commented 2 years ago

Other options that are reported by --help but do not seem to work: --gt.xml_files and --pred.xml_files.

I get:

Unknown Arguments --gt.xml_files ...
Possible alternatives:
    --gt.xml_files ==> --gt.channels, --gt.texts, --n_worst_lines

andbue commented 2 years ago

Are you sure about that? calamari-eval --help on current master should not show --gt.xml_files; only calamari-eval --gt PageXML --help shows the options relevant for PAGE.

bertsky commented 2 years ago

> Other options that are reported by --help but do not seem to work: --gt.xml_files and --pred.xml_files.

> Are you sure about that? calamari-eval --help on current master should not show the --gt.xml_files, only calamari-eval --gt PageXML --help shows the options relevant for PAGE.

Oh, I see! I had no idea these were nested, dependent options. It's a great way to encapsulate complexity in a CLI. Fantastic!
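
For anyone else puzzled by this, the idea of nested, dependent options can be sketched with plain `argparse` in two stages: first read the selector (`--gt`), then register only the options that belong to the selected type. This is not how paiargparse is implemented. The `TYPE_OPTIONS` mapping, the type name `File` and the function `build_parser` are assumptions made for illustration; only the flag names are taken from the messages in this thread.

```python
import argparse

# Hypothetical mapping from ground-truth type to its own options.
TYPE_OPTIONS = {
    "File": ["--gt.texts", "--gt.channels"],
    "PageXML": ["--gt.xml_files"],
}


def build_parser(argv):
    """Stage 1 reads --gt; stage 2 registers only that type's options."""
    stage1 = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    stage1.add_argument("--gt", choices=sorted(TYPE_OPTIONS), default="File")
    # parse_known_args ignores everything except the selector at this point.
    selected, _rest = stage1.parse_known_args(argv)

    parser = argparse.ArgumentParser(parents=[stage1], allow_abbrev=False)
    for flag in TYPE_OPTIONS[selected.gt]:
        parser.add_argument(flag, nargs="+")
    return parser


# With the selector, the PAGE-specific option is accepted.
argv = ["--gt", "PageXML", "--gt.xml_files", "a.xml"]
args = build_parser(argv).parse_args(argv)
print(vars(args)["gt.xml_files"])

# Without it, the same option is unknown to the default parser.
argv = ["--gt.xml_files", "a.xml"]
_, unknown = build_parser(argv).parse_known_args(argv)
print(unknown)
```

In this sketch, `--help` would likewise list only the options of the selected type, which mirrors the behaviour described above for `calamari-eval --gt PageXML --help`.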