Open mar-hum opened 2 months ago
Hello Marco, I'm sorry for responding only now; I missed your issue.
Based on the documents available in the dataset repository, I suggest adding the following elements:
transcription-guidelines: >-
  Transcription rules can be found alongside the dataset. They include the
  following rules:
  - Exclusion of overwritten text from training data
  - Exclusion of text not identified by the automated layout recognition
  - Exclusion of faded text
  - Inserted words are treated as separate text lines
  - Exclusion of textual features such as dotted lines
  - Base line separation for text written apart
I have already added them in the pull request I opened. Is that OK?
Hi Alix,
No worries! Thank you very much, that's brilliant. Please let me know if you need anything else.
Best wishes
Hello,
Could you please add the Sloane Lab HTR Model to the HTR United repository?
Many thanks and best wishes,
Marco
Here is our dataset YAML file: