qurator-spk / dinglehopper

An OCR evaluation tool
Apache License 2.0

Add batch processing and report summaries #83

Closed: rfdj closed this 1 year ago

rfdj commented 1 year ago

In this PR, we added several features:

The README has been updated accordingly. This allows us to easily evaluate OCR results in a more systematic way. Please let us know if it's of any interest to you.

mikegerber commented 1 year ago

Hi Ruud,

thanks for the PR! Very good ideas!

I think I am going to add this, but I also believe it needs some more work:

A)

B)

I'll work on A first (possibly leaving out OCR-D for now)

mikegerber commented 1 year ago

Working in https://github.com/qurator-spk/dinglehopper/tree/pr-83 (can't push to INL:feat/batch-processing, it seems).

mikegerber commented 1 year ago

@rfdj Could you check if you can "Allow edits from maintainers" for this PR? (It may not be possible due to https://github.com/orgs/community/discussions/5634, though)

mikegerber commented 1 year ago

I've fixed a bug in dinglehopper-summarize that occurred when the reports do not contain any difference statistics:

7c323e1

mikegerber commented 1 year ago

@rfdj Because I'm going on vacation, I decided to merge the PR as is (so it isn't blocked for several more weeks) and will take care of the points I mentioned once I'm back :)

Thanks for the contribution, I think this will be useful for the users!

rfdj commented 1 year ago

I was away for a few days myself, but let me know if I can still be of assistance somewhere when you come back.

mikegerber commented 10 months ago

I noticed that I still had a small unmerged fix for this in 7c323e1af0124768f1f675b591187429c689bff8 (= f077ce2 in master) and merged it today. (Summarizing threw an exception if the reports didn't contain the difference stats.)
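
For illustration, here is a minimal sketch of the kind of guard such a fix involves. It is not the actual code from 7c323e1. It assumes each report is a dict with an optional `differences` key that holds `character_level` and `word_level` counts; the function name `summarize_differences` and this report layout are hypothetical.

```python
from collections import Counter


def summarize_differences(reports):
    """Sum difference counts across reports, tolerating reports without any.

    Hypothetical helper: the report layout used here is an assumption,
    not necessarily dinglehopper's real report schema.
    """
    totals = {"character_level": Counter(), "word_level": Counter()}
    for report in reports:
        # Fall back to an empty dict so a missing (or null) "differences"
        # section does not raise a KeyError or TypeError.
        differences = report.get("differences") or {}
        for level, counter in totals.items():
            counter.update(differences.get(level) or {})
    return totals


# Example reports (made-up data): the second one has no difference statistics.
reports = [
    {"cer": 0.1, "differences": {"character_level": {"a :: ä": 2}, "word_level": {}}},
    {"cer": 0.0},
]
totals = summarize_differences(reports)
print(dict(totals["character_level"]))
```

Reports that lack the statistics simply contribute nothing to the totals instead of aborting the whole summary.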