-
## Describe the bug
To work around long PDFs taking too long to open, I use the PyPDF2 library to keep only the pages containing a table that I need. Then I open the PDF that only contains a …
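
A minimal sketch of that page-trimming step, assuming the PyPDF2 2.x/3.x API (`PdfReader`, `PdfWriter`, `extract_text`, `add_page`). The helper names `pages_to_keep` and `write_subset` and the single search term are hypothetical, chosen for illustration only, and are not taken from the original script:

```python
def pages_to_keep(page_texts, term):
    """Return 0-based indices of pages whose text contains `term` (case-insensitive)."""
    needle = term.lower()
    # extract_text() can yield None or "" for image-only pages, hence `text or ""`.
    return [i for i, text in enumerate(page_texts) if needle in (text or "").lower()]


def write_subset(src_path, dst_path, term):
    """Copy only the matching pages of src_path into a new, smaller PDF."""
    # Imported here so pages_to_keep() stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    texts = [page.extract_text() for page in reader.pages]

    writer = PdfWriter()
    for i in pages_to_keep(texts, term):
        writer.add_page(reader.pages[i])

    with open(dst_path, "wb") as fh:
        writer.write(fh)
```

Calling `write_subset("report.pdf", "tables_only.pdf", "table")` would then produce a file holding only the pages that mention the term; both file names are placeholders.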
-
This includes three steps in the process:
1) Identifying the right tables with better heuristics. One simple idea is to use more and/or better filter terms. For now it's just one term, e.g. fo…
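
One way the "more filter terms" idea could look is sketched below: each page is scored by how many distinct terms it contains, and only pages above a threshold are kept. The function names, the `min_hits` threshold, and the example terms are assumptions made for illustration, not the project's actual heuristic:

```python
def score_page(text, terms):
    """Count how many distinct filter terms occur in the page text (case-insensitive)."""
    lowered = text.lower()
    return sum(1 for term in terms if term.lower() in lowered)


def select_pages(page_texts, terms, min_hits=2):
    """Return 0-based indices of pages matching at least `min_hits` filter terms."""
    return [
        i for i, text in enumerate(page_texts)
        if score_page(text, terms) >= min_hits
    ]


# Hypothetical page texts and filter terms, for illustration only.
pages = [
    "Balance sheet total assets",
    "Letter to shareholders",
    "Total assets and liabilities",
]
terms = ["total assets", "liabilities", "balance sheet"]
selected = select_pages(pages, terms)
```

Requiring several hits instead of one trades recall for precision: a page that merely mentions one term in passing is skipped, while a page with the full table vocabulary is kept.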
-
Hello,
I'm having some issues running your script on Ubuntu. About half of my folders and files get converted, and then I get an incomplete run. The error message is below. Is there any clean-up of the…
-
Hi,
When I loop over my PDFs and use OCR_Data, it produces the following error after a while (about 2 hours):
TIFFReadEncodedStrip Error
---------------------------
Read error at scanline 0; got…
-
The main issue with the file format is that it's not possible to embed resources like PDFs, audio files, images, etc.
@LittleHuba would now start improving it, but I think we should make the plans p…
-
### ⚠️ This issue respects the following points: ⚠️
- [X] This is a **bug**, not a question or a configuration/webserver/proxy issue.
- [X] This issue is **not** already reported on Github _(I've …
-
[GgImagineBatch.20180514-0655.out.zip](https://github.com/gsautter/goldengate-imagine/files/1999414/GgImagineBatch.20180514-0655.out.zip)
This is the log of parsing zootaxa.4419.1.1 (97 MB). …
-
For some reason, search doesn't include words that are written in italic. I have been looking into it, but I can't find the part where it goes wrong.
-
This week's primary objective is to provide the hydrologic context. The aim is for the analysts to have familiarity with the hydrologic cycle that includes: units, values (or magnitudes) of streamflo…
-
Dear:
I'm a developer from China. When I tried to convert an adoc file into a PDF, I ran into a strange issue: it didn't convert all the Chinese words in the PDF file, only some of them.
Here is my code:
``…