alto-xml Search Results

1000+ results for alto-xml


mittagessen/kraken #624

Failure to read varying text sizes in a Newspaper page

The zip includes the model, the code and the resulting HTML. The problem is that the get.text works well for a specific size of letters and fails on others. There is very high diversity in font size…

sinairusinek updated 4 months ago
7
kermitt2/grobid #950

[feature request] to skip FullTextParse on certain page regi…

The provided model cannot correctly categorize some "vaguely" plotted Figures and Tables. In this case, the word in the Table region will be considered as normal Text, thus hinder the normal reading o…

frankang updated 2 years ago
3
PonteIneptique/YALTAi #23

YALTAi 2.0.1 hangs forever when segmenting

YALTAi 2.0.1 hangs forever when segmenting ``` (train-2.0.1-py3.11) incognito@DESKTOP-H1BS9PO:~/YALTAi$ yaltai kraken -I "*.jpg" --suffix ".xml" segment --yolo runs/detect/train2/weights/best.pt …

johnlockejrr updated 1 month ago
46
moshekaplan/palo_alto_firewall_analyzer #80

UnusedAddresses flagging used addresses for deletion

I will try to get a clean example of this but came across this package and wanted to give it a test. However several of the addresses it flagged from the "shared" device group are in fact in use direc…

shepherdjay updated 9 months ago
1
kermitt2/grobid #313

Word document instead of PDF

Hi, At present, I have all documents as DOCX (Microsoft Word files) which I convert to PDF in order to run the GROBID XML conversion. Is there any possibility of using DOCX as input? In case of …

sarankup updated 4 years ago
5
dbmdz/mirador-textoverlay #294

ALTO exists but Overlay-Tool is not visible

I build my own Mirador with textoverlay-plugin 0.3.8: ``` import Mirador from 'mirador/dist/es/src/index'; import downloadDialogPlugin from 'mirador-downloaddialog/es'; import imageCropperPlugin…

datazuul updated 1 year ago
8
OCR-D/core #544

RFC: ocrd-sanitize script to preprocess/postprocess OCR-D wo…

METS/PAGE/ALTO provided by digitization workflow software or repositories will not always adhere [to the conventions we have in OCR-D](https://ocr-d.de/en/spec). OTOH the workspaces that are the resul…

kba updated 3 years ago
12
OCR-D/page-to-alto #1

TableRegion should become ComposedBlock

https://github.com/kba/page-to-alto/blob/46a8cc2fb74ce327e9d195f1095699cbae946cce/ocrd_page_to_alto/convert.py#L158 I think it's not enough to just map the lower levels here. There might not be any…

bertsky updated 8 months ago
7
openpreserve/jhove #745

XML Extraction failing with SaxParseException

Hi, I have an XML files that is failing with below error. Error/s returned during metadata extraction (SaxParseException: java.lang.ClassCastException: class sun.net.www.protocol.file.FileURLCo…

rgalv updated 8 months ago
3
cisocrgroup/ocrd_cis #84

Make deskewing efficient+robust, and add orientation

As outlined a while ago, https://github.com/cisocrgroup/ocrd_cis/blob/c3fad1a8b04dc5a305460e8bb3c54cb79cd75515/ocrd_cis/ocropy/common.py#L111-L118 there are plenty of opportunities to improve `o…

bertsky updated 3 years ago
1
