internetarchive/archive-pdf-tools
Fast PDF generation and compression. Deals with millions of pages daily.
https://archive-pdf-tools.readthedocs.io/en/latest/
GNU Affero General Public License v3.0
86 stars, 13 forks
issues
Recode does not merge hocr into pdf
#69
jcuenod
opened
9 months ago
6
Fix pdfrenderer.py reference
#68
tfmorris
closed
11 months ago
5
A user-friendly example for a scanned multipage PDF needed
#67
FilipDominec
opened
1 year ago
3
A certain PDF from Archive.org does not display all of its contents on Mac OS
#66
EngineersNeedArt
closed
1 year ago
26
Q: accessible tagging/hints?
#65
jrochkind
closed
1 year ago
4
Installing on MacOS?
#64
jrochkind
closed
11 months ago
29
HOCR rendering compares unfavorably with tesseract PDF text layer
#63
jrochkind
opened
1 year ago
11
Additional apt packages needed to build current jbig2enc on Ubuntu 22.04
#62
jrochkind
closed
1 year ago
1
IndexError: list index out of range (single TIFF file)
#61
jrochkind
closed
1 year ago
5
First recode_pdf test: 'numpy' has no attribute 'int'.
#60
dwids
closed
1 year ago
5
Wrong resolution of mask image when foreground image is downsampled
#59
JoeLoginIsAlreadyTaken
opened
1 year ago
1
Update requirements.txt
#58
Redsandro
closed
1 year ago
1
Fix an error and a warning reported by LGTM
#57
stweil
opened
2 years ago
1
Fix it's => its in documentation
#56
stweil
closed
2 years ago
1
pdfcomp: problems with inverted text that is often better in hocr.
#55
rmast
opened
2 years ago
10
The choice for inverting, what's the use for perc_larger?
#54
rmast
opened
2 years ago
0
correct ratio determination for noise estimation
#53
rmast
opened
2 years ago
5
Bug in foreground/background separator choosing massive block instead of character outline.
#52
rmast
opened
2 years ago
14
pdfcomp: new tool, discussion, compression questions
#51
MerlijnWajer
opened
2 years ago
19
Missing test suite?
#50
mara004
opened
2 years ago
1
Upgrade GitHub Actions
#49
cclauss
closed
2 years ago
3
Create better presets for users with quality-comparable options for openjpeg/grok/pillow and kakadu
#48
MerlijnWajer
opened
2 years ago
1
Define scope of tooling and work to improve for that scope
#47
MerlijnWajer
opened
2 years ago
0
Detect if RGB images in pages are greyscale or even 1bit
#46
MerlijnWajer
opened
2 years ago
0
Some scans become inverted
#45
Redsandro
closed
2 years ago
7
Update README add installation instructions
#44
Redsandro
closed
2 years ago
9
Need some inspiration?
#43
rmast
opened
2 years ago
7
pillow is not working properly
#42
Redsandro
opened
2 years ago
27
openjpeg is not working properly
#41
Redsandro
closed
2 years ago
43
Update README fix typo
#40
Redsandro
closed
2 years ago
2
Fix setup by reading the version file manually
#39
mara004
closed
2 years ago
1
Improve setup configuration (see #36)
#38
mara004
closed
2 years ago
6
Just some other errors with the current version. I can't get the current version to work with a hocr-file coming from pdftree to get out the current searchable text from a PDF
#37
rmast
closed
2 years ago
18
master file contents.rst not found during build of docs
#36
rmast
closed
2 years ago
8
--jbig2 deprecated
#35
rmast
closed
2 years ago
1
License (in)compatibility
#34
rmast
opened
2 years ago
4
Usefulness of MRC for decent quality compression of scanned book pages with illustrations
#33
fusefib
opened
2 years ago
42
I don't understand this picture
#32
rmast
opened
2 years ago
11
Small difference in compression ratio
#31
rmast
opened
2 years ago
9
Error with hocr-files from Tesseract
#30
rmast
closed
2 years ago
25
Support pillow jpeg2000 writing
#29
MerlijnWajer
closed
2 years ago
3
Support recompressing existing PDFs without hOCR files and without touching the text input
#28
MerlijnWajer
opened
2 years ago
0
Use (not yet released) pdf->hocr conversion to improve compression for existing PDFs
#27
MerlijnWajer
opened
2 years ago
2
Lot of fuzz in background picture
#26
rmast
opened
2 years ago
36
Add --best flag?
#25
MerlijnWajer
opened
2 years ago
2
Run noise estimation on a part of the image
#24
MerlijnWajer
closed
2 years ago
1
Support hOCR ocr_photo / ocr_image element
#23
MerlijnWajer
opened
2 years ago
0
Windows port
#22
MerlijnWajer
closed
2 years ago
19
Add option to disable jbig2
#21
MerlijnWajer
closed
2 years ago
1
Add option (and heuristic) to treat the background as 'just plain (white) paper' for further optimisations
#20
MerlijnWajer
opened
2 years ago
0