Kozea / WeasyPrint

The awesome document factory
https://weasyprint.org
BSD 3-Clause "New" or "Revised" License
7.08k stars 672 forks

Is initial zoom setting allowed? #1789

Closed macdeport closed 1 year ago

macdeport commented 1 year ago

I have just discovered WeasyPrint, used it, and already appreciate how useful it is: thanks to Lucie @grewn0uille, to you, and to Kozea for giving us access to it.

Do the rich features already available allow the following settings, especially the initial zoom?

[Image: weasyprint-57 2--document-properties]

liZe commented 1 year ago

> I have just discovered WeasyPrint, used it, and already appreciate how useful it is: thanks to Lucie @grewn0uille, to you, and to Kozea for giving us access to it.

😊

> Do the rich features already available allow the following settings, especially the initial zoom?

It’s not supported, but it’s probably possible. The question is: can this information be specified using HTML or CSS? If there’s a specification for that, then the feature could be implemented in WeasyPrint. If there isn’t, then it’s probably better to do this in a post-processing step, with a generic PDF editing library.

macdeport commented 1 year ago

Thank you for your open response and the ideas it suggests.

> Do the rich features already available allow the following settings, especially the initial zoom?

Maybe I can propose an answer to this question: the philosophy of HTML is not to provide global formatting settings like the zoom factor, but CSS does have at-rules like @page that control the dimensions of the pages produced. Could the initial zoom value be related to that?

This parameter could then be passed on to pydyf, which generates the PDF.

An alternative way could be a new write_pdf() parameter.

pbregener commented 1 year ago

I use the code below to achieve what you are probably looking for. The available options are listed in the document referenced in the inline comment.

import pydyf
import weasyprint


def set_viewer_prefs(_: weasyprint.Document, pdf: pydyf.PDF):
    # For options, see Table 28 in the file below
    # https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf
    pdf.catalog['PageLayout'] = '/SinglePage'
    pdf.catalog['PageMode'] = '/UseNone'

Then you just need to pass that function as a finisher: .write_pdf(finisher=set_viewer_prefs)
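
The same pattern can be wrapped in a small factory so that the two names are checked before rendering starts. The following is only a sketch: `make_viewer_prefs`, `PAGE_LAYOUTS` and `PAGE_MODES` are hypothetical names that belong to neither WeasyPrint nor pydyf, the allowed values are those listed in Table 28 of the PDF specification linked above, and the only assumption about the `pdf` argument is that it exposes a dict-like `catalog`, as in the snippet above.

```python
# Allowed names from Table 28 of the PDF 1.7 specification (without the slash).
PAGE_LAYOUTS = frozenset({
    'SinglePage', 'OneColumn', 'TwoColumnLeft',
    'TwoColumnRight', 'TwoPageLeft', 'TwoPageRight',
})
PAGE_MODES = frozenset({
    'UseNone', 'UseOutlines', 'UseThumbs',
    'FullScreen', 'UseOC', 'UseAttachments',
})


def make_viewer_prefs(page_layout='SinglePage', page_mode='UseNone'):
    """Return a finisher that sets PageLayout and PageMode in the catalog.

    Hypothetical helper: it fails early with ValueError on an unknown name
    instead of writing an invalid entry into the PDF.
    """
    if page_layout not in PAGE_LAYOUTS:
        raise ValueError(f'unknown PageLayout: {page_layout!r}')
    if page_mode not in PAGE_MODES:
        raise ValueError(f'unknown PageMode: {page_mode!r}')

    def finisher(document, pdf):
        # PDF names are written with a leading slash.
        pdf.catalog['PageLayout'] = f'/{page_layout}'
        pdf.catalog['PageMode'] = f'/{page_mode}'

    return finisher
```

It would be used the same way as the function above, for example `.write_pdf(finisher=make_viewer_prefs('TwoColumnRight', 'UseOutlines'))`.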

macdeport commented 1 year ago

@pbregener Thanks for pointing me in the right direction by highlighting the finisher, with documentation on the (complex) structure of a PDF document and a very instructive working example that I was able to use and even modify.

On the other hand, PZ (preferred zoom) seems to be set per page of the PDF, and therefore cannot be defined in the catalog dictionary. So I need to dig deeper...

macdeport commented 1 year ago

This ended up being the solution:

def set_viewer_prefs(_: weasyprint.Document, pdf: pydyf.PDF, zoom=1.25):
    # For options, see 7.7 "Document Structure", Tables 28 (page 72) and 30, in the file below
    # https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf
    # PageLayout => page layout
    pdf.catalog['PageLayout'] = '/SinglePage'
    pdf.catalog['PageMode'] = '/UseThumbs'
    # Open the document with a GoTo action; the destination array is
    # [target, /XYZ, left, top, zoom]
    pdf.catalog['OpenAction'] = pydyf.Dictionary({
        'D': pydyf.Array([pdf.pages.reference, '/XYZ', 0, 0, zoom]),
        'S': '/GoTo',
    })

@pbregener Your crucial code was much appreciated.
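
For readers unfamiliar with PDF internals, it may help to see what the OpenAction entry in the solution above amounts to. A /GoTo action carries an explicit destination array of the form [page /XYZ left top zoom], and the specification says that a null value for left, top or zoom keeps the viewer's current value. The sketch below uses only the standard library to render that structure as PDF syntax. `pdf_number` and `goto_xyz_action` are hypothetical helpers and `3 0 R` is a made-up page reference; none of them come from pydyf or WeasyPrint.

```python
def pdf_number(value):
    """Format a coordinate or zoom factor; None becomes the PDF null object."""
    if value is None:
        return 'null'
    # The g format drops trailing zeros (1.0 -> "1", 1.25 -> "1.25").
    return f'{value:g}'


def goto_xyz_action(page_ref, left=None, top=None, zoom=None):
    """Render a /GoTo action with an /XYZ destination as PDF syntax."""
    dest = ' '.join([
        page_ref, '/XYZ',
        pdf_number(left), pdf_number(top), pdf_number(zoom),
    ])
    return f'<< /S /GoTo /D [{dest}] >>'


print(goto_xyz_action('3 0 R', zoom=1.25))
# << /S /GoTo /D [3 0 R /XYZ null null 1.25] >>
```

Since WeasyPrint calls the finisher with only the document and the PDF object, the zoom default of set_viewer_prefs above can be changed without editing the function, for example with functools.partial(set_viewer_prefs, zoom=1.5).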

pmjdebruijn commented 1 year ago

For future reference, a complete sample script (adjust to your own preferences of course):

#!/usr/bin/env python3

import sys
import pydyf
import weasyprint
import pathlib

# https://github.com/Kozea/WeasyPrint/issues/1789
def set_viewer_prefs(_: weasyprint.Document, pdf: pydyf.PDF):
    # For options, see Table 28 in below file
    # https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf
    pdf.catalog['PageLayout'] = '/TwoColumnRight'
    pdf.catalog['PageMode'] = '/UseOutlines'

def main():
    print(f'Rendering {sys.argv[1]}...')

    # Read the source HTML, closing the file once it has been read
    with open(sys.argv[1], 'r') as source:
        html = source.read()

    htmldoc = weasyprint.HTML(string=html, base_url="")

    pdfdoc = htmldoc.write_pdf(
        zoom=1, finisher=set_viewer_prefs, pdf_variant='pdf/ua-1',
        presentational_hints=True, optimize_images=True, jpeg_quality=90,
        dpi=300, hinting=True)

    target = pathlib.Path(sys.argv[1]).with_suffix(".pdf")
    print(f'Writing {target}...')
    target.write_bytes(pdfdoc)

if __name__ == '__main__':
    main()

@liZe: I think there would be value in making this more easily accessible though, for example by making PageLayout/PageMode part of weasyprint.DEFAULT_OPTIONS and supporting those options directly in the WeasyPrint CLI.
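
Until something like that exists, a thin wrapper script can expose the two settings on the command line. The sketch below is only an illustration: `--page-layout` and `--page-mode` are hypothetical flags of this wrapper, not options of the weasyprint command, and `build_parser` and `render` are made-up names. WeasyPrint is imported inside `render` so that the argument parsing can be tried without it installed.

```python
import argparse
import pathlib

# Allowed names from Table 28 of the PDF 1.7 specification (without the slash).
PAGE_LAYOUTS = ['SinglePage', 'OneColumn', 'TwoColumnLeft',
                'TwoColumnRight', 'TwoPageLeft', 'TwoPageRight']
PAGE_MODES = ['UseNone', 'UseOutlines', 'UseThumbs',
              'FullScreen', 'UseOC', 'UseAttachments']


def build_parser():
    """Build the parser for the hypothetical wrapper's command line."""
    parser = argparse.ArgumentParser(
        description='Render HTML to PDF with viewer preferences.')
    parser.add_argument('source', type=pathlib.Path, help='input HTML file')
    parser.add_argument('--page-layout', choices=PAGE_LAYOUTS,
                        default='SinglePage')
    parser.add_argument('--page-mode', choices=PAGE_MODES, default='UseNone')
    return parser


def render(args):
    """Render args.source next to itself as a .pdf and return the new path."""
    import weasyprint  # imported lazily; only needed when actually rendering

    def finisher(document, pdf):
        # Same catalog entries as in the finisher shown earlier in the thread.
        pdf.catalog['PageLayout'] = f'/{args.page_layout}'
        pdf.catalog['PageMode'] = f'/{args.page_mode}'

    target = args.source.with_suffix('.pdf')
    weasyprint.HTML(filename=str(args.source)).write_pdf(
        str(target), finisher=finisher)
    return target
```

Calling render(build_parser().parse_args()) at the bottom of the script turns it into a command; argparse rejects any name outside the two lists before rendering starts.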