-
**Is your feature request related to a problem? Please describe.**
The information extraction feature currently works with PDF documents as source.
We also want to expand the sources to text fields …
-
1) Does gmft have a function like `set_cropbox`, similar to the one in pymupdf?
2) Does gmft have functions that can read a PDF and separate non-tabular data from tabular data, like …
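To make question 2 concrete, here is a minimal, library-agnostic sketch of one way such a separation could work. It assumes word and table bounding boxes are already available from some detector. It is not gmft's API: `Box`, `center_inside`, and `split_words` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in page coordinates (hypothetical helper type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def center_inside(word: Box, region: Box) -> bool:
    """Return True if the center point of `word` lies within `region`."""
    cx = (word.x0 + word.x1) / 2
    cy = (word.y0 + word.y1) / 2
    return region.x0 <= cx <= region.x1 and region.y0 <= cy <= region.y1


def split_words(words, table_boxes):
    """Split (text, Box) pairs into tabular and non-tabular text.

    A word counts as tabular when its center falls inside any table box.
    """
    tabular, non_tabular = [], []
    for text, box in words:
        if any(center_inside(box, table) for table in table_boxes):
            tabular.append(text)
        else:
            non_tabular.append(text)
    return tabular, non_tabular


# Made-up example data: one detected table region and four words.
words = [
    ("Quarterly report", Box(50, 40, 200, 55)),
    ("Revenue", Box(60, 120, 110, 132)),
    ("1200", Box(150, 120, 180, 132)),
    ("Footnote", Box(50, 300, 100, 312)),
]
tables = [Box(50, 100, 300, 200)]
tabular, non_tabular = split_words(words, tables)
```

Testing the word's center rather than requiring full containment tolerates table boxes that are slightly too tight, at the cost of occasionally pulling in text that straddles a table border.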
-
I am using Camelot for table extraction in PDF documents, which generally works well for my needs. However, I've encountered a recurring issue where the first and last rows of tables cause problems du…
-
I am using a custom Layout Parser model, which is registered and has text, title, table, ... as categories.
I am trying to use the pdfplumber detector and textextractionservice.
Code:
```
…
-
### Description
There is another, recently released tool for PDF table extraction; maybe it could be an option to embed?
https://github.com/ai8hyf/TF-ID
-
## Describe the bug
Thanks for the latest and greatest update. I deployed everything and restarted my tests. However, with the local qdrant instance I still seem to run into the same issue. I have a sin…
-
Please describe, in as much detail as possible, your proposal and how it would improve your experience with pdfplumber.
When extracting tables from PDFs, there are PDFs which have merged cells in th…
-
### Use case
Implement a middleware that exposes the Textract capabilities within a Lakechain document processing pipeline.
### Solution/User Experience
Below is the temporary design for an A…
-
Hi Eduard,
Thank you for creating such a powerful package!
I wonder if you plan to extend the PDF extraction functionality in `llm_message()` to automatically detect whether the PDF is multi-col…