-
**Is your feature request related to a problem? Please describe.**
For all my expenses I'm having PDFs, but in InvoiceShelf I have to add each and every expense manually.
**Describe the solution you'…
-
### **Description**:
I'm using Docling to parse a PDF that contains text. The PDF appears to use a non-standard font or encoding, as copying text directly from it also yields garbled characters. …
-
Requesting a version of PDF OCR that only runs tesseract OCR on embedded images in PDF instead of capturing the whole page of the PDF.
A lot of my professors use powerpoints converted to PDF, the t…
-
Trying to run docker-compose -f docker-compose.gpu.yml up -d --build
Have this output:
python client/cli.py ocr --file test.pdf --prompt_file .\examples\parse-table.txt
Namespace(command='ocr',…
-
`Docling version: 2.3.1
Docling Core version: 2.3.1
Docling IBM Models version: 2.0.3
Docling Parse version: 2.0.2
`
Fine software: I have compared it with many alternatives. (BTW, for some new…
-
```
[](https://localhost:8080/#) in extract_data_from_pdf(pdf_path)
57 # Function to extract text using the unstructured library
58 def extract_data_from_pdf(pdf_path):
---> 59 eleme…
-
This project uses RapidOCR for image OCR and Fitz in the PyMuPDF package for PDF OCR. To be honest, it is extremely difficult to recognize tables in some PDFs, especially in scholarly papers. Therefor…
-
```
C:\Users\user>curl -X POST "http://localhost:8000/ocr" -F "file=C:\Users\user\Downloads\Telegram Desktop\0a6ce636600e1723a71ff95348cd07abdc035118.pdf" -F "strategy=marker" -F "ocr_cache=true"
{"…
-
I'm using windows and vue.js
`Audiveris stderr: Error opening data file C:\Program Files\tesseract-ocr\tessdata/deu.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to …
-
Hello,
I have successfully applied the OCR tool to my PDFs and set the "Save output as a PDF with text layer" option in the setting. However, when opening the OCR pdf output file, it was mentioned …