-
### Requested feature
Enhanced table extraction for complex table formats. Currently, Docling is able to identify the values correctly, but formatting is sometimes misaligned or unclear, especially i…
-
How can I use PaddlePaddle for table extraction? I can't find a clear procedure for it.
-
Hello,
Thank you so much for continuing the development of Camelot! I'm glad to see that Camelot continues to be maintained.
I happen to also manage a PDF extraction library, [gmft](https://git…
-
Add a table extraction benchmark.
-
### Description
A new tool for PDF table extraction was released recently. Could it be an option to embed?
https://github.com/ai8hyf/TF-ID
-
Develop a formatter to parse PDF and DOCX files, extract text and tables while handling complex layouts.
- [ ] Research methods of text extraction from PDF and DOCX.
- [ ] Implement Basic Parsing …
-
Hello everyone, I have noticed that many tables in the literature are rotated, while some are not. How can I determine whether a table has been rotated before performing content recognition and extrac…
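
One possible heuristic for the rotation question, offered as a rough sketch rather than a feature of any particular library: if word-level bounding boxes are available for the table region, upright text tends to produce boxes that are wider than tall, while text rotated by 90 degrees produces boxes that are taller than wide. The function name `guess_table_rotation`, the `(x0, y0, x1, y1)` box format, and both thresholds are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical input format: one (x0, y0, x1, y1) box per word in the table region.
Box = Tuple[float, float, float, float]


def guess_table_rotation(word_boxes: Iterable[Box], min_votes: int = 5,
                         aspect: float = 1.5) -> str:
    """Return 'upright', 'rotated', or 'unknown' based on word box shapes."""
    votes = Counter()
    for x0, y0, x1, y1 in word_boxes:
        width, height = abs(x1 - x0), abs(y1 - y0)
        if width == 0 or height == 0:
            continue  # degenerate box, carries no shape information
        ratio = width / height
        if ratio > aspect:
            votes["upright"] += 1   # clearly wider than tall
        elif ratio < 1 / aspect:
            votes["rotated"] += 1   # clearly taller than wide
        # near-square boxes (short tokens) are ambiguous and are skipped
    if votes["upright"] + votes["rotated"] < min_votes:
        return "unknown"  # too little evidence to decide
    return "rotated" if votes["rotated"] > votes["upright"] else "upright"
```

This only separates upright text from text turned by a quarter turn. It cannot tell 90 degrees from 270 degrees, and it says nothing about tables that are skewed by a small angle.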
-
Hi team,
Thank you so much for maintaining this package!
I have a few questions, though, as I could not find the answers in the documentation.
1. Do we need to uninstall a Camelot in…
-
Thank you for the initiative. I am using it for table extraction and it is returning tables/dataframes as expected. However, it is not giving complete text in some rows or providing text in multiple l…
-
Hi,
PDF files are converted to DOCX and then tables are extracted from DOCX.
There are hidden columns and hidden text in the tables.
Is there a way to ignore the hidden columns and text during co…
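
For the hidden-text part of this question, one approach that may help is to read the DOCX markup directly and skip runs marked with `w:vanish`, the WordprocessingML run property for hidden text. The sketch below uses only the Python standard library. The helper names `run_is_hidden` and `visible_cell_texts` are made up for illustration, and the code assumes the hidden flag is set directly on the run rather than inherited from a style.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"


def run_is_hidden(run) -> bool:
    """True if the run carries <w:vanish/> that is not explicitly switched off."""
    vanish = run.find(f"{W}rPr/{W}vanish")
    if vanish is None:
        return False
    # <w:vanish w:val="0"/> (or "false"/"off") means the text is NOT hidden.
    return vanish.get(f"{W}val", "true").lower() not in ("0", "false", "off")


def visible_cell_texts(document_xml: str) -> list:
    """Return the text of every table row as a list of cell strings, skipping hidden runs."""
    root = ET.fromstring(document_xml)
    rows = []
    for tr in root.iter(f"{W}tr"):
        row = []
        for tc in tr.findall(f"{W}tc"):
            parts = []
            for run in tc.iter(f"{W}r"):
                if not run_is_hidden(run):
                    parts.extend(t.text or "" for t in run.iter(f"{W}t"))
            row.append("".join(parts))
        rows.append(row)
    return rows
```

The markup can be read from a DOCX file with `zipfile.ZipFile(path).read("word/document.xml").decode("utf-8")`. A column whose text is entirely hidden comes back as empty strings in this sketch, so it would still need to be dropped afterwards. Rows of nested tables are also listed separately, and hiding applied through a character or paragraph style is not detected.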