-
In the postprocessing script, I think there should also be an alignment step before the NMS refinement. It's possible that two table column objects may not overlap at all before alignment but t…
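
To illustrate the concern, here is a minimal sketch, not the project's actual postprocessing code. It assumes that "alignment" means stretching each column box to the vertical extent of the enclosing table. The names `iou` and `align_column_to_table` and the box coordinates are hypothetical.

```python
from typing import Tuple

# Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def align_column_to_table(col: Box, table: Box) -> Box:
    """Hypothetical alignment: keep the column's x-range, adopt the table's y-range."""
    return (col[0], table[1], col[2], table[3])


table: Box = (0.0, 0.0, 100.0, 100.0)
# Two detections of roughly the same column, covering disjoint vertical ranges.
col_a: Box = (10.0, 0.0, 30.0, 45.0)
col_b: Box = (12.0, 55.0, 32.0, 100.0)

iou_before = iou(col_a, col_b)
iou_after = iou(
    align_column_to_table(col_a, table),
    align_column_to_table(col_b, table),
)
print(f"IoU before alignment: {iou_before:.2f}")
print(f"IoU after alignment:  {iou_after:.2f}")
```

In this example the two boxes do not intersect before alignment, so an IoU-based NMS would keep both. After alignment they overlap heavily, so NMS would be able to suppress one of them.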
-
Hi, would extracting images be considered part of the scope of GROBID?
For example, the current extraction of formulas, figures, and tables is quite poor, as you know. Until we have a more confident extraction, …
-
### Search before asking
- [X] I searched in the [issues](https://github.com/apache/incubator-paimon/issues) and found nothing similar.
### Motivation
Support the debezium-json format with schema for …
-
I am trying to [generate dataset](https://github.com/facebookresearch/nougat#generate-dataset), which includes converting `.tex` to `.html` with LaTeXML and running `nougat.dataset.split_htmls_to_pages`, but I got s…
-
Hi, in line 98 of **_generate_samples.py_**, I found that **_doc_** is empty after **_Document(xml_file)_**. Is there something wrong with the format of the XML files? Can you give an XML file as an …
-
There is already a table detection mechanism in Tesseract, but unfortunately there seems to be no way to access the table structure through the API.
This could be done with only minimal changes…
-
Hi,
I tried searching your website and the code here. Basically, how can I figure out which algorithm you are using for detecting and extracting tables and turning them into HTML tags? Or …
-
I see in the config file [/model_artifacts/tableformer/tm_config.json](https://huggingface.co/ds4sd/docling-models/blob/main/model_artifacts/tableformer/tm_config.json) the dataset name mentioned is "…
-
There are many undocumented PlantUML features and settings captured in the [forum (Q&A)](http://forum.plantuml.net).
Many of them also don't appear with `java -jar plantuml.jar -language`.
The list…
-
## Expected behavior
When dates and/or times are present in the data of a measure on Eudonet Paris, they are reflected in the DiaLog data.
## Actual behavior
The …