Open cderv opened 1 year ago
I have found it useful to have two types of generated HTML when creating dual-format webpage-and-PDF documents: 1) "fast print preview" HTML (i.e. #9505) vs 2) live end-reader HTML.
The "fast print preview" HTML is given to WeasyPrint and is only viewed in a full web browser during authoring, for the author's benefit. The live end-reader HTML is posted online and NOT given to WeasyPrint, even though it has the same main content as the fast print preview.
I'm not sure how this best fits in with future Quarto, but I can say that I find myself running pandoc in three different modes: A) generate the live end-reader HTML; B) generate the fast print preview HTML, but skip the PDF; C) generate the PDF (using the same HTML as the fast print preview).
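The three modes above could be scripted roughly as follows. This is only a sketch: the `build_command` helper, the file names, and the two stylesheets (`live.css`, `print.css`) are hypothetical and not taken from the comment. The only assumption carried over from the comment is that mode C hands the preview HTML from mode B to WeasyPrint.

```python
from pathlib import Path


def build_command(mode: str, source: str = "doc.md") -> list[str]:
    """Return the argv list for one of the three authoring modes."""
    stem = Path(source).stem
    if mode == "live":
        # A) live end-reader HTML, styled for the web (live.css is hypothetical)
        return ["pandoc", source, "--standalone", "--to=html",
                "--css=live.css", f"--output={stem}.html"]
    if mode == "preview":
        # B) fast print preview HTML only, no PDF step (print.css is hypothetical)
        return ["pandoc", source, "--standalone", "--to=html",
                "--css=print.css", f"--output={stem}.preview.html"]
    if mode == "pdf":
        # C) feed the preview HTML produced by mode B to WeasyPrint
        return ["weasyprint", f"{stem}.preview.html", f"{stem}.pdf"]
    raise ValueError(f"unknown mode: {mode!r}")
```

Each list can be passed to `subprocess.run(..., check=True)`. Running "preview" and then "pdf" yields the PDF from exactly the HTML the author previewed.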
Thanks for your feedback @castedo !
Hi @cderv,
Here is what I have to do to get HTML-to-PDF engines (`weasyprint`, `pagedjs-cli`) working:

- `format: pdf`
- `pdf-engine: weasyprint`
- `fig-format: 'svg'` in `execute` (or `png`)
- `always_allow_html: yes` (it should have been `prefer-html: true` instead, but strangely this doesn't seem to work)
- `knitr::opts_knit$set('rmarkdown.pandoc.to' = 'html')`
- `gt` CSS or hacking `flextable` to make that work
- `template-partials` doesn't work (it's waiting for `.tex` files)

Everything can be found here: https://github.com/kantiles/quarto.report/blob/main/template.qmd and here: https://github.com/kantiles/quarto.report/blob/main/_extensions/quarto.report/_extension.yml. There is also a Python based template.
As we discussed, I agree that just generating HTML with Quarto (and removing the default style) and then post-processing it would be a better option.
I came across this issue looking for something similar to the `webpdf` option in JupyterLab, which essentially converts to HTML and then prints that page using `playwright`. I saw that `playwright` and similar options were discussed in the context of revealjs output in https://github.com/quarto-dev/quarto-cli/issues/4677 and wonder if you would consider also adding it as an output format for other document types such as notebooks.
An advantage of using such "html-printing" methods is that they work well with web-based plotting packages. Currently, workarounds such as alterations to the notebook rendering of charts are required to export output from visualization packages such as altair, plotly, and bokeh. Trying the wkhtmltopdf options in pandoc (#222) does not seem to fix these issues, as charts are still not shown (only tested with altair), so it would be convenient to have a "webpdf-like" option that supports this natively and avoids running into issues such as https://github.com/quarto-dev/quarto-cli/issues/10571 and https://github.com/quarto-dev/quarto-cli/discussions/916
> something similar to the `webpdf` option in JupyterLab

As I just learnt about this, putting a reference here. This is an `nbconvert` feature relying on `playwright`: https://nbconvert.readthedocs.io/en/latest/usage.html#convert-webpdf
Thanks for sharing @joelostblom !
I think in the "html to pdf" world there are two main options:

- HTML printing of a webpage using a CSS for print. This usually uses Chromium printing tooling to get a PDF version of what is in the browser.
- HTML for paginated output. This aims to use tools like `paged.js`, or others that offer CSS-based ways to create paginated content (meaning a multi-page website as a single multi-page report), and then uses HTML printing to PDF for this content.
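For the first option, a minimal sketch of Chromium-based printing with Playwright's Python API might look like this. It is not how Quarto or nbconvert implement it. The function names and the A4 page size are assumptions made for illustration, and it presumes Playwright and its Chromium build are installed.

```python
from pathlib import Path


def to_file_url(html_path: str) -> str:
    """Turn a local HTML path into a file:// URL that Chromium can open."""
    return Path(html_path).resolve().as_uri()


def print_html_to_pdf(html_path: str, pdf_path: str) -> None:
    """Print a rendered HTML page to PDF with headless Chromium."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Wait for network activity to settle so JS-drawn charts are rendered.
        page.goto(to_file_url(html_path), wait_until="networkidle")
        # page.pdf() applies the print CSS media type by default.
        page.pdf(path=pdf_path, format="A4", print_background=True)
        browser.close()
```

Because the page is loaded in a real browser before printing, JavaScript-rendered charts get a chance to draw, which is the advantage mentioned earlier in the thread for web-based plotting packages.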
I agree with you that having a `--to pdf` using HTML as intermediate content would be a good addition to the current options, which are `--to pdf` using LaTeX to get the PDF, and `--to typst` using Typst to get the PDF.
Pandoc has a simple way to produce a PDF (https://pandoc.org/MANUAL.html#creating-a-pdf).
`--pdf-engine` controls the behavior. By default, Pandoc looks at the output format and uses a specific method for it.
Quarto does not really work that way. This currently limits the ways to create a PDF, and it is not really consistent across formats.
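To illustrate the Pandoc side, here is a small sketch that builds pandoc invocations reaching a PDF through different intermediate formats. The helper name and file names are hypothetical, and the engine is always passed explicitly rather than relying on Pandoc's per-format default.

```python
def pandoc_pdf_command(source: str, intermediate: str, engine: str) -> list[str]:
    """Build a pandoc call that reaches PDF through a chosen intermediate format."""
    return [
        "pandoc", source,
        f"--to={intermediate}",    # intermediate format produced before the PDF step
        f"--pdf-engine={engine}",  # program that turns the intermediate into a PDF
        "--output=doc.pdf",        # a .pdf output file asks pandoc to create a PDF
    ]


# One route per intermediate format discussed in this thread.
routes = {
    "latex": pandoc_pdf_command("doc.md", "latex", "xelatex"),
    "html": pandoc_pdf_command("doc.md", "html", "weasyprint"),
    "typst": pandoc_pdf_command("doc.md", "typst", "typst"),
}
```

The point of the sketch is that, in Pandoc, the intermediate format and the engine are two independent knobs on the same PDF output. In Quarto, the choice is currently tied to the format name.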
In Quarto we support:

- `format: pdf`, which assumes a LaTeX or ConTeXt engine, and `format: typst`, which will produce a PDF file by default.
- There is a special `output-ext` option that can modify the PDF render. `output-ext: tex` with `format: pdf` renders a `.tex` file, and `output-ext: typ` with `format: typst` renders a `.typ` file. But this `output-ext` variable is fragile (try setting `output-ext` to anything: it will not error).
- When Typst was introduced, `format: typst` was created, but it defaults to rendering a PDF file. `output-ext` can be used to get the `.typ`.
- HTML printing to PDF is not really supported in Quarto (https://github.com/quarto-dev/quarto-cli/issues/222).
- Using `format: latex` does not work exactly the same as `format: typst`, as it won't produce a PDF.

We should probably rethink all this and offer more mechanisms to create a PDF according to the available method, as Pandoc allows (`quarto print`).

Related Issues / Discussions
- `--to latex` when using `format: pdf` (https://github.com/quarto-dev/quarto-cli/issues/6613#issuecomment-1693397040, https://github.com/quarto-dev/quarto-cli/issues/7966)