-
## Overview
When using certain PSMs with certain inputs, the `PageIterator::Baseline` function produces results that are incorrect due to a bug when getting line bounding boxes. I noticed this when …
-
How do we encode that a specific component used a specific model?
For example, ocrd_tesserocr uses the `frm` model. Where do we store that information?
This should be part of the overall provenance …
-
[paper](https://arxiv.org/pdf/2310.03744.pdf)
See LLaVA here: https://github.com/long8v/PTIR/issues/128#issue-1749571159
## TL;DR
- **I read this because.. :** aka LLaVA1.5 / in ShareGPT4V, LL…
-
I am opening this issue to collect my experiences with training Tesseract from GT4HistOCR using ocrd-train. Problems reported here may therefore be caused by Tesseract, ocrd-train, or GT4HistOCR.
-
*UPDATE: There are additional items added for `mode` and `source` (see [here](https://github.com/sbs20/scanservjs/blob/86fab6783f74997e368a2e5a17f67cbd95c3783b/webui/src/locales/en.json#L39)) - additi…
-
hOCR is easy to implement because it's based on HTML but it can hardly be called a standard while there are living standards for OCR like ALTO.
hOCR is used by Open Source engines like tesseract, ocr…
-
If it is not outside of the scope of the addon, I would like to request an OCR feature. I work with scientific papers a lot and many of the old ones are just jpegs converted to pdfs. It would be nice …
-
Hello,
Spent way too much time trying to figure this out. I hope someone else wanted to do the same and managed to do it.
Using docker-compose file and everything is loaded and reachable at "http:…
-
I have a METS where all FLocats are LOCTYPE=URL (as required by DFGViewer), but local directories FULLTEXT and MAX do exist as well.
Unfortunately, digital-derivans does not seem to like this repre…
-
As I've been testing our UI with a screen reader, I've noticed some small things that probably pass official a11y guidelines but would be nice to fix for screen readers if we have the time to make the…