-
To aid in the design for both of these:
- #331
- #556
I'm going to gather a bunch of examples of how different LLMs accept multi-modal inputs. I'm particularly interested in the following:
- …
-
In the OCR-D workflow, there are several steps that likely require input or output to be able to represent __word segmentation ambiguity__ and confidence values of word boundaries (whitespace characte…
-
@kba asked me to put this comment from a private Gitter conversation into an issue:
> Regarding "input PAGE-XML not having words", my input would be that I can live with it if PAGE without Word elements…
-
After messing around with this for about 8 hours over the weekend, I found that "i" is consistently confused with "f". On rare occasions, "Q" is confused with "O". Sometimes restarting everything fixed…
-
AUR version of normcap running under Wayland with the Hyprland WM. xdg-desktop-portal is installed; however, it is using the Hyprland version ([xdg-desktop-portal-hyprland-git](https://github.com/hyprwm/xdg-d…
-
@bertsky:
I am missing some metadata for the following cases:
- the levels of ground truth
- antiqua font with or without "ſ", with "ß" or "ſs" or "ss" (e.g. https://www.deutschestextarchiv.de/boo…
-
Thanks for pointing out the performance impact of OCR on LiLT in your repo https://github.com/NielsRogge/Transformers-Tutorials/tree/master/LiLT where you mentioned "Please always use an OCR engine that can…
-
> running develop
> running egg_info
> writing davarocr.egg-info/PKG-INFO
> writing dependency_links to davarocr.egg-info/dependency_links.txt
> writing top-level names to davarocr.egg-info/top_le…
-
Unfortunately, since PAGE-XML completely underspecifies what and how `TextEquiv` (with or without `@index`) is used, applications have to define their own convention. IIUC (please correct me if I'm wr…
-
### Description:
Develop a Python package to generate synthetic Tibetan text images using specified fonts (excluding Uchen and Betsuk). Users can select their preferred augmentations from a predefine…