-
# Project Proposal
Please fill in the details below to confirm you adhere to the [Jazzband Guidelines](https://jazzband.co/about/guidelines), and also add the package name to the issue title.
##…
-
I looked through the code, and the PDF loader currently used is PyMuPDF. Among the free libraries, PDFMiner works better than PyMuPDF and PyPDF, so it would be good to have it. Additionally, documents th…
-
Hi,
I was wondering if there is any interest in adding support for AWS Textract for extracting text and tables. I noticed there is already an option for a similar offering from Azure (AzureConverter…
-
**Describe the bug**
DEPRECATION: textract 1.6.5 has a non-standard dependency specifier extract-msg
-
Typically, it's best practice for Python logging to use `logging.getLogger(__name__)`.
However, the ResponseParser simply does `import logging` and then `logging.info(...)` - this results in the ro…
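A minimal sketch of the module-level logger pattern described above. The `ResponseParser` class here is a simplified, hypothetical stand-in (its `parse` method and the `"Blocks"` key are assumptions for illustration, not the library's actual code):

```python
import logging

# Module-level logger named after the importing module, so applications
# can configure or silence this module without touching the root logger.
logger = logging.getLogger(__name__)


class ResponseParser:
    """Hypothetical stand-in for the parser mentioned in the issue."""

    def parse(self, response):
        blocks = response.get("Blocks", [])
        # Goes to the module logger instead of logging.info(...),
        # which would log through (and implicitly configure) the root logger.
        logger.info("Parsing response with %d blocks", len(blocks))
        return blocks
```

With this in place, a caller can adjust verbosity for just this module, for example with `logging.getLogger("<module name>").setLevel(logging.WARNING)`.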
-
Extract text from a PDF file that has already been uploaded to an S3 bucket.
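A rough sketch of one way this could look, assuming the request refers to Amazon Textract, which accepts an S3 object reference rather than raw bytes. The helper name `s3_document_location` and the example bucket and key are hypothetical. The returned dict follows the `DocumentLocation` shape that boto3's `start_document_text_detection` expects.

```python
from urllib.parse import urlparse


def s3_document_location(s3_url):
    """Turn an s3://bucket/key URL into a Textract DocumentLocation dict.

    Hypothetical helper for illustration; it only builds the request
    parameter and does not call AWS.
    """
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"Expected an s3://bucket/key URL, got: {s3_url!r}")
    return {"S3Object": {"Bucket": parsed.netloc, "Name": parsed.path.lstrip("/")}}


# Usage with boto3 (not executed here; requires AWS credentials):
#   import boto3
#   client = boto3.client("textract")
#   job = client.start_document_text_detection(
#       DocumentLocation=s3_document_location("s3://my-bucket/docs/report.pdf")
#   )
```

The asynchronous `start_document_text_detection` call is shown because it handles multi-page PDFs stored in S3. The results would then be fetched with `get_document_text_detection` using the returned job ID.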
-
### Description
Amazon Textract recently released the ability to create [Custom Queries](https://aws.amazon.com/about-aws/whats-new/2023/10/amazon-textract-custom-queries-information-extraction-bus…
-
If possible, can the `~=` be replaced with `>=`? I cannot install this library in a big project with many other dependencies.
https://github.com/deanmalmgren/textract/blob/ec3c0c3c982078d22e51cc2753baeaf…
-
Attached is the part of the PDF that I am trying to extract.
I am doing the extraction using:
textract_json = call_textract(input_document="s3:url",
features=[Textract_Featur…
-
![image](https://github.com/run-llama/llama_parse/assets/3716307/07f1f363-9a15-44b2-90f9-9ee5afb9c4ec)
I am curious about what the red highlight means in this picture, notably for Textract. The o…