-
### Value proposition
We should scan documents that will be uploaded to the Lighthouse Benefits Documents API with ClamAV so that we can return an error to the user right away if there is an issue wi…
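A minimal sketch of the pre-upload scan described above, assuming the `clamscan` command-line tool is installed on the host. The function names `interpret_clamscan_exit` and `scan_before_upload` are hypothetical and not taken from the issue; only the documented `clamscan` exit codes (0 = no virus found, 1 = virus found, 2 = error) are relied on.

```python
import subprocess


def interpret_clamscan_exit(code):
    """Map a clamscan exit code to a status string (hypothetical helper)."""
    # Documented clamscan exit codes: 0 = no virus found, 1 = virus found, 2 = error.
    if code == 0:
        return "clean"
    if code == 1:
        return "infected"
    return "error"


def scan_before_upload(path):
    """Run clamscan on one file and return (status, scanner output).

    Assumes `clamscan` is on PATH; raises FileNotFoundError otherwise.
    """
    result = subprocess.run(
        ["clamscan", "--no-summary", path],
        capture_output=True,
        text=True,
    )
    return interpret_clamscan_exit(result.returncode), result.stdout.strip()
```

A caller could reject the upload whenever the status is not `"clean"` and surface the error to the user before the document is sent to the API.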
-
I'm removing watermarks from scanned documents and don't have control over the image size, and I'm getting this error:
image size: (2287, 1394, 3)
mask image size: (683, 1024, 3)
Image size not s…
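One possible workaround for a size mismatch like the one reported above is to resize the mask to the image's height and width before applying it. The sketch below assumes both arrays are NumPy arrays in (height, width, channels) layout, as the printed shapes suggest, and uses nearest-neighbour sampling so the mask values stay unchanged. The function name `resize_mask_nearest` is hypothetical and is not part of the project the issue refers to.

```python
import numpy as np


def resize_mask_nearest(mask, target_hw):
    """Resize an H x W (x C) mask to (target_h, target_w) by nearest-neighbour sampling."""
    target_h, target_w = target_hw
    src_h, src_w = mask.shape[:2]
    # Map every target row/column index back to a source row/column index.
    rows = np.arange(target_h) * src_h // target_h
    cols = np.arange(target_w) * src_w // target_w
    # Broadcasting the two index arrays selects one source pixel per target pixel.
    return mask[rows[:, None], cols]


# Shapes taken from the error report; the pixel contents here are placeholders.
image = np.zeros((2287, 1394, 3), dtype=np.uint8)
mask = np.zeros((683, 1024, 3), dtype=np.uint8)
mask[100:200, 300:400] = 255

resized = resize_mask_nearest(mask, image.shape[:2])
print(resized.shape)
```

Nearest-neighbour sampling is chosen here because interpolating a binary mask would introduce intermediate grey values; whether stretching the mask is acceptable depends on how the mask was produced.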
-
I would like to try CascadeTabNet with my own images.
I used main.py and set the following paths:
image_path
xmlPath
config_fname
However, I don't know what to set for these variables:
che…
-
First, essential topics to develop teaching materials for:
- [ ] Git / GitHub
- [x] OS licensing
- [ ] reproducible open science
Ideas:
[Scanned Documents.pdf](https://github.com/gwu-libraries/OSPO…
-
The engine used for extraction of tables from PDF files is a well-known Python library called camelot. However, this library requires that the processed PDF file contains text ("computer" text, not ju…
-
This is an excerpt of my book (only one page, in Russian):
[Dolgan_ Язык норильских долган (Ubrjatova) 3.pdf](https://github.com/user-attachments/files/16060976/Dolgan_.Ubrjatova.3.pdf)
I put the …
-
@strabzounly
An error has occurred while executing Python code:
KeyError: 'scanned_cert_ref_column'
Traceback (most recent call last):
File "C:/Users/Administrator/.qgis2/python/plugins\st…
-
An option to include scanned documents in the backup when exporting the app's settings.
-
1) I was using LlamaParse cloud to read the content of a scanned image in a PDF. LlamaParse was able to decode the text from the scanned image.
2) But starting today, I see that LlamaParse …
-
#### Problem Description
PPStructure misses text that PaddleOCR does not miss
#### Runtime Environment
- OS:
- Paddle:
- PaddleOCR:
#### Reproduction Code
PaddleOCR(lang…