vega / vl-convert

Utilities for converting Vega-Lite specs from the command line and Python
BSD 3-Clause "New" or "Revised" License

PDF support #97

Closed jonmmease closed 10 months ago

jonmmease commented 10 months ago

Closes #91

Overview

This PR adds dependency-free PDF export support to VlConvert. It's been a journey to get to this point, but I'm really happy with the end result.

How it works

This PR uses VlConvert's SVG export path and then converts the resulting SVG image to a PDF. The bulk of the work is done by the wonderful svg2pdf crate. svg2pdf relies on usvg to convert the original SVG image to a simplified collection of paths, and then converts these paths to PDF.

Text

It's possible to render text with svg2pdf by having usvg convert the text to paths before the SVG tree is passed to svg2pdf. This approach is suboptimal, however, because the resulting text cannot be selected or searched in a PDF viewer such as Adobe Acrobat. I opened an svg2pdf issue in January to talk about embedding text. The typst team (who developed svg2pdf, and the pdf-writer crate it depends on) have been really helpful through this process.

It turned out to be possible to accomplish text embedding on top of svg2pdf without changes to the core library. This PR uses pdf-writer to construct a new PDF document and then uses svg2pdf to convert everything in an SVG file except text to a PDF XObject. Then it traverses the SVG tree again and overlays PDF text on top of the XObject.
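
To make the two-pass idea concrete, here is a minimal sketch in Python using only the standard library. It is not the Rust code in this PR: the function names `strip_text` and `collect_text` are hypothetical, and it ignores transforms, fonts, and styling. It only shows the split between "everything except text" (what would become the XObject) and the text runs that would be overlaid afterwards.

```python
import copy
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TEXT_TAG = f"{{{SVG_NS}}}text"


def strip_text(root):
    """Pass 1 (hypothetical helper): copy the tree and drop every <text> element."""
    stripped = copy.deepcopy(root)
    for parent in list(stripped.iter()):
        for child in list(parent):
            if child.tag == TEXT_TAG:
                parent.remove(child)
    return stripped


def collect_text(root):
    """Pass 2 (hypothetical helper): walk the original tree and record each text run."""
    runs = []
    for node in root.iter(TEXT_TAG):
        runs.append({
            "text": "".join(node.itertext()),
            # Assumption: position comes straight from x/y; real SVG also needs transforms.
            "x": float(node.get("x", "0")),
            "y": float(node.get("y", "0")),
        })
    return runs


svg = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="50">
  <rect x="0" y="0" width="100" height="50" fill="steelblue"/>
  <g><text x="10" y="20">Sales</text></g>
</svg>"""

root = ET.fromstring(svg)
graphics_only = strip_text(root)  # stands in for the part rendered as the XObject
text_runs = collect_text(root)    # stands in for the text overlaid on top
```

In the actual implementation the first result is handed to svg2pdf and the second is written with pdf-writer using embedded fonts; the sketch only mirrors the traversal order.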

The logic for using pdf-writer to embed fonts in the resulting PDF file was taken from the typst project repository. It would be nice to eventually find a way to avoid duplicating this logic, but the duplication is worth it for the time being.

Testing

This logic is tested from Python using pypdfium2 to convert the PDF to a PNG image, which is then compared against our existing PNG baselines. The comparison tolerance needs to be a little larger because text rendering differs slightly between pdfium and resvg, but the results still match really well!
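
As a rough illustration of what a tolerance-based comparison looks like, here is a small stdlib-only sketch. The metric (mean absolute per-channel difference) and the threshold values are assumptions chosen for the example, not the metric or tolerance used by the test suite, and `mean_abs_diff` and `images_match` are hypothetical names.

```python
def mean_abs_diff(pixels_a, pixels_b):
    """Mean absolute per-channel difference of two 8-bit buffers, scaled to [0, 1]."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same dimensions")
    if not pixels_a:
        return 0.0
    total = sum(abs(a - b) for a, b in zip(pixels_a, pixels_b))
    return total / (255 * len(pixels_a))


def images_match(pixels_a, pixels_b, tolerance):
    """True when the two buffers differ by no more than the given tolerance."""
    return mean_abs_diff(pixels_a, pixels_b) <= tolerance


# Tiny stand-ins for decoded PNG pixel data (three RGB pixels each).
baseline = bytes([0, 0, 0, 255, 255, 255, 128, 128, 128])
rendered = bytes([0, 2, 0, 255, 250, 255, 128, 128, 130])

loose = images_match(baseline, rendered, tolerance=0.01)    # small differences pass
strict = images_match(baseline, rendered, tolerance=0.001)  # same differences fail
```

A looser tolerance lets small anti-aliasing differences in glyph edges pass while still catching missing or misplaced marks.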

TODO

domoritz commented 10 months ago

Very cool.