-
Hey there, I'm running pd3f on a workstation and accessing it from my local machine via an SSH tunnel.
I'm using the browser GUI to (hopefully 🤞) OCR a book scan with several hundred pages. The book scan …
-
First of all: pd3f is working great!
It's a wonderful tool. Thank you so much for creating it.
There's one small issue, though: text blocks / columns aren't recognized as such. So articles wr…
-
## Type
- [ ] General question or discussion
- [X] Propose a brand new feature
- [ ] Request modification of existing behavior or design
## What is the problem that your feature request …
-
Solved it after 5 hours, and only thanks to the tip in https://github.com/pd3f/dehyphen/issues/2.
This code works:
```
import os
import sys

from dehyphen import FlairScorer, text_to_format
def flatten_li…
-
Don't push pd3f-ocr and the dashboard separately to Docker Hub. Instead, build the images (via inheritance plus small adjustments) when running `docker-compose up` for the first time.
-
I got this error when I tried to convert a PDF:
```
INFO:root:setting up ocr
INFO:root:ocr finished successfully
INFO:pd3f.parsr_wrapper:sending PDF to Parsr
INFO:pd3f.parsr_wrapper:got respons…
-
Hi, after a couple of successful conversions, pd3f now gets stuck at "INFO:root:setting up ocr" forever. Even after I ran `docker-compose run --rm worker rqscheduler --hos…
-
language: de
[ ] fast (but less accurate)
[ ] extract tables
[x] deduplicate page header/footer & transform footnotes to endnotes (experimental)
The text output of PDFs when using docker-compose is
…