Transkribus / TranskribusSwtGui

Note: the repo has been moved to https://gitlab.com/readcoop/Transkribus/TranskribusSwtGui
GNU General Public License v3.0

PDF export - Keep original images for reduced file size #207

Closed storytracer closed 4 years ago

storytracer commented 6 years ago

The PDF export function (client and server alike) produces very large files, because all images are converted to JPEG by default, even if they have been optimized beforehand.

For example, I mainly use binarized, monochrome TIFF files at 600 dpi, which average about 40 KB each. The PDF export converts them to JPEGs, which average about 550 KB each.

For documents with many pages, this results in PDFs of several hundred MB instead of a few MB. Furthermore, the quality of the images is reduced by the lossy JPEG compression.

Would it be possible to introduce an option "Image type: Original" for the PDF export, as it currently exists in the METS export options?

I think it would not only make the use of PDF for users more practical, but also reduce traffic and CPU load for the Transkribus servers.

hackmanschorsch commented 6 years ago

That makes a lot of sense.

hackmanschorsch commented 6 years ago

This will be available in release 1.4.3.

storytracer commented 5 years ago

The issue persists. Although there is now a dropdown menu in the GUI, selecting the "Original" option has no effect for me. Example from one document: my binarised input TIFF files have a combined size of 1.42 MB, but the PDF export is 25.8 MB (regardless of whether I use the server or the client export).

hackmanschorsch commented 5 years ago

The problem is that, internally, the image is written to a ByteArrayOutputStream as 'jpeg'. Solution: we access the image directly (either 'original' or 'viewing') instead of re-encoding it.
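
To illustrate the cause described in the comment above, here is a minimal sketch that contrasts re-encoding an image as JPEG through a ByteArrayOutputStream with handing over the stored bytes unchanged. It is not the Transkribus code: the class and method names are hypothetical, only JDK classes are used, a small generated PNG stands in for the binarised TIFF files, and the PDF library step is left out entirely.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical sketch; none of these names come from TranskribusSwtGui.
class PdfImageSketch {

    // Builds a small black-and-white test page and returns it as PNG bytes.
    // Assumption: PNG stands in here for the reporter's binarised TIFF files.
    static byte[] sampleStoredImage() throws IOException {
        BufferedImage page = new BufferedImage(256, 256, BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < 256; y++) {
            for (int x = 0; x < 256; x++) {
                // Vertical stripes, 8 pixels wide, pure white or pure black.
                page.setRGB(x, y, (x / 8) % 2 == 0 ? 0xFFFFFF : 0x000000);
            }
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(page, "png", out)) {
            throw new IOException("no PNG writer available");
        }
        return out.toByteArray();
    }

    // The behaviour described in the thread: decode the stored file, then
    // write it to a ByteArrayOutputStream as 'jpeg', whatever its source format.
    static byte[] reencodeAsJpeg(byte[] storedFile) throws IOException {
        BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(storedFile));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(decoded, "jpeg", out)) {
            throw new IOException("no JPEG writer for this image type");
        }
        return out.toByteArray();
    }

    // The proposed direction: use the stored bytes as they are, so the
    // original encoding and file size are preserved.
    static byte[] passThrough(byte[] storedFile) {
        return storedFile;
    }

    public static void main(String[] args) throws IOException {
        byte[] stored = sampleStoredImage();
        System.out.println("stored bytes:       " + passThrough(stored).length);
        System.out.println("re-encoded as JPEG: " + reencodeAsJpeg(stored).length);
    }
}
```

Running main prints both sizes for comparison. Whether a particular PDF library can embed the original bytes without re-encoding depends on the image format and the library, so the pass-through step only shows the idea.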

hackmanschorsch commented 4 years ago

The latest snapshot contains a fix for this problem.