johnwhitington / camlpdf

OCaml library for reading, writing and modifying PDF files

Faster loading of basic data on large files #66

Open johnwhitington opened 8 months ago

johnwhitington commented 8 months ago

Whilst cpdf is generally fast, we are behind on simple operations on large files. Perhaps this is because we do not delay the reading of objects from object streams in some way?

Times for Forgotten_creator:

CPDF: 0.39s
QPDF: 0.04s
MUPDF: 0.17s
XPDF: 0.42s

cpdf -pages in.pdf
qpdf -show-npages in.pdf
mutool pages in.pdf
pdfinfo -box in.pdf

Times for all files in PDFTests/

CPDF: 1.36s
QPDF: 0.94s
MUPDF: 0.47s
XPDF: 0.66s

time find . -maxdepth 1 -type f -exec cpdf -pages {} \; > foo 2>&1
time find . -maxdepth 1 -type f -exec qpdf -show-npages {} \; > foo 2>&1
time find . -maxdepth 1 -type f -exec mutool pages {} \; > foo 2>&1
time find . -maxdepth 1 -type f -exec pdfinfo -box {} \; > foo 2>&1
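
The four batch timings above could be collected with one helper instead of four hand-typed lines. This is a minimal sketch, assuming GNU `date` (for the `%N` nanosecond field) and a `find` that supports `-maxdepth`; `time_batch` is a hypothetical name and not part of cpdf or any of the other tools:

```shell
# Hypothetical helper: run a command once per regular file in a directory
# (non-recursively) and print "LABEL: N.NNs" for the whole batch.
# Assumes GNU date; %N is not available in every date implementation.
# The body runs in a subshell so its variables do not leak to the caller.
time_batch() (
  label=$1
  dir=$2
  shift 2
  start=$(date +%s.%N)
  # "$@" is the tool plus its flags, e.g. cpdf -pages
  find "$dir" -maxdepth 1 -type f -exec "$@" {} \; >/dev/null 2>&1
  end=$(date +%s.%N)
  awk -v l="$label" -v s="$start" -v e="$end" \
    'BEGIN { printf "%s: %.2fs\n", l, e - s }'
)

# Usage, mirroring the commands above (each tool must be installed):
#   time_batch CPDF  . cpdf -pages
#   time_batch QPDF  . qpdf -show-npages
#   time_batch MUPDF . mutool pages
#   time_batch XPDF  . pdfinfo -box
```

The figure is wall-clock time and includes one process start-up per file, as in the original `find -exec` runs, so it may differ slightly from what the shell's `time` reports.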

johnwhitington commented 7 months ago

Also, on monster_squeezed with -pages:

cpdf 10.49s
qpdf 1.22s
mupdf 2.14s
pdfinfo 0.53s

And on full read and write:

cpdf 6m6s
qpdf 1m9s

johnwhitington commented 7 months ago

Now:

Times for Forgotten_creator:

CPDF: 0.13s
QPDF: 0.04s
MUPDF: 0.17s
XPDF: 0.42s

cpdf -pages in.pdf
qpdf -show-npages in.pdf
mutool pages in.pdf
pdfinfo -box in.pdf
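
To put these numbers in proportion: on Forgotten_creator, cpdf went from 0.39s to 0.13s, which is about 3 times faster than before, but it is still roughly 3.25 times slower than qpdf at 0.04s. A small sketch of that arithmetic, where `ratio` is a hypothetical helper name:

```shell
# ratio A B: print A / B to two decimal places
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

ratio 0.39 0.13   # cpdf before vs. cpdf now
ratio 0.13 0.04   # cpdf now vs. qpdf
```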

But still slower on:

Times for all files in PDFTests/

CPDF: 1.31s
QPDF: 0.94s
MUPDF: 0.47s
XPDF: 0.66s