typemytype / drawbot

http://www.drawbot.com

File size of pdf with image #510

Open mathieureguer opened 1 year ago

mathieureguer commented 1 year ago

I noticed that placing a jpg in a DrawBot document usually results in a saved pdf that is considerably larger than the image file itself.

# if necessary, image samples are available here: https://www.dropbox.com/sh/d8qho0unvf82z0l/AABkSAFMM_HGBhf1LPa-M9d-a?dl=0

import os

image_paths = [
    "drawbot.jpg",
    "drawbot.png",
    "drawbot-72ppi.jpg",
    "drawbot.tif",
    "drawbot-cmyk.jpg",
    "drawbot-cmyk.tif",
    ]

for image_path in image_paths:
    newDrawing()
    image_size = imageSize(image_path)
    image_weight_kb = os.path.getsize(image_path) / 1000
    image_resolution = imageResolution(image_path)

    pdf_path = image_path + ".pdf"
    newPage(*image_size)
    image(image_path, (0, 0))
    saveImage(pdf_path)

    pdf_weight_kb = os.path.getsize(pdf_path) / 1000
    print(f"{image_path} | {image_size[0]}x{image_size[1]} | {image_resolution} ppi | {image_weight_kb} kb —> {pdf_path} | {pdf_weight_kb} kb ")

Will print:

drawbot.jpg | 800x800 | 96.0 ppi | 211.735 kb —> drawbot.jpg.pdf | 357.782 kb 
drawbot.png | 800x800 | 96.0 ppi | 297.952 kb —> drawbot.png.pdf | 344.598 kb 
drawbot-72ppi.jpg | 800x800 | 72.0 ppi | 211.974 kb —> drawbot-72ppi.jpg.pdf | 357.782 kb 
drawbot.tif | 800x800 | 96.0 ppi | 1943.812 kb —> drawbot.tif.pdf | 358.084 kb 
drawbot-cmyk.jpg | 800x800 | 72.0 ppi | 906.108 kb —> drawbot-cmyk.jpg.pdf | 864.006 kb 
drawbot-cmyk.tif | 800x800 | 96.0 ppi | 3137.688 kb —> drawbot-cmyk.tif.pdf | 809.144 kb 

After experimenting with other image formats (tiff, png), it seems the resulting pdf is always roughly the same size. Does that mean something is converting the images to a specific format when exporting pdfs? I understand this is probably a Quartz thing and not a drawBot thing, but if there is indeed a conversion, is there a way to control its settings somehow? I am trying to keep pdfs as small as possible, and even tiny images bloat them very fast 😅

Weirdly enough, cmyk images tend to result in pdfs that are smaller than their source files.

typemytype commented 1 year ago

I guess this is indeed a Quartz issue. I have a hunch that the screen you render on matters: could you check the resolution of the images inside the pdf? I assume the source is 72 dpi, you have a retina screen, and the final pdf contains images at a higher resolution.

mathieureguer commented 1 year ago

Oh, that would be an interesting plot twist!

I am not sure how to test the screen influence on this. I have a lower resolution screen hooked up to my retina laptop but the output is the same no matter what screen I run drawBot on (maybe this makes sense, the retina screen must be considered the main screen by the system…).

You are right, I should have inspected the image inside the pdf. It gives a much better picture as to what is happening:

It looks like all the images are converted to 72 dpi, but with no resampling, as their dimensions change accordingly. I don't think that is a problem, as I guess no information is lost and no data is added or removed here.
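To make the "72 dpi without resampling" observation concrete, here is a small arithmetic sketch. It assumes (this is my reading of the test script, not something confirmed in the thread) that the image is drawn at one pixel per point, with 1 point = 1/72 inch. The helper name `effective_ppi` is made up for illustration.

```python
def effective_ppi(pixels: int, placed_points: float) -> float:
    """Resolution of an image edge that is `pixels` wide when drawn across
    `placed_points` PDF points (1 point = 1/72 inch)."""
    return pixels * 72.0 / placed_points


# Assumption: the 800 px image fills the 800 pt page, one pixel per point.
# The pixel count is untouched; only the physical size differs from the source.
print(effective_ppi(800, 800))  # 72.0

# The same 800 px at the source's 96 ppi would only span 800 * 72 / 96 = 600 pt.
print(effective_ppi(800, 600))  # 96.0
```

In both cases the image still has 800 pixels per edge, which is why nothing is lost: only the resolution tag and the printed size change.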

Looking further into the images in the pdfs, it seems they are all encoded with ZIP/Flate compression (FlateDecode) no matter the input image format (jpeg, tiff, pngs…).
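For anyone who wants to repeat this check without a pdf inspector, here is a rough sketch that scans the raw pdf bytes for `/Filter` entries. It is a heuristic, not a pdf parser: it counts the filters of every stream (page content included, not only images), and the function name `count_stream_filters` is my own. `FlateDecode` means ZIP/Flate, `DCTDecode` means jpeg.

```python
import re
from collections import Counter

# Matches "/Filter /Name" as well as "/Filter [/Name1 /Name2]".
_FILTER_RE = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
_NAME_RE = re.compile(rb"/([A-Za-z0-9]+)")


def count_stream_filters(pdf_bytes: bytes) -> Counter:
    """Count the filter names declared in the stream dictionaries of a pdf.

    Heuristic only: works on the raw bytes, so it sees all streams and could
    in principle be fooled by binary data that happens to contain "/Filter".
    """
    counts = Counter()
    for match in _FILTER_RE.finditer(pdf_bytes):
        for name in _NAME_RE.findall(match.group(1)):
            counts[name.decode("ascii")] += 1
    return counts
```

Usage would be something like `count_stream_filters(open("drawbot.jpg.pdf", "rb").read())`, with the path taken from the script above. If the jpeg data had been embedded as-is, I would expect a `DCTDecode` entry to show up.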

Using a Quartz filter to convert them to jpg brings the pdfs down to a reasonable size, much closer to that of the original image files.

It would be awesome if drawBot pdfs could just keep the original compression (and resolution?) of the input images, but I understand this is a Quartz thing and it is probably not drawBot's main focus anyway.

No image information is lost, so pdfs can be post processed outside of drawBot if compression is needed 🙂️