mgieseki / dvisvgm

A fast DVI, EPS, and PDF to SVG converter
https://dvisvgm.de
GNU General Public License v3.0

pdffile special: llx, lly, urx, ury options behave differently than with PSfile for cropped graphics #231

Closed agrahn closed 1 year ago

agrahn commented 1 year ago

When I embed a cropped PDF graphic file (i.e. one with non-zero llx and lly in /MediaBox [llx lly urx ury]), the boxing options llx, lly, urx and ury in the pdffile special are interpreted differently than in the PSfile special.

With pdffile, they seem to be measured relative to the llx and lly coordinates of the embedded PDF file's MediaBox rectangle, whereas the PSfile special treats them as absolute coordinates, irrespective of the %%BoundingBox: ... setting in the EPS file to be included.

It would be desirable if the pdffile special and its boxing options behaved in the same way as the PSfile special, that is, if the options were treated as absolute coordinates. This would facilitate a consistent implementation of the graphicx/graphics driver file dvisvgm.def for graphics inclusion.

For demonstration, consider the following example. The original PS and PDF files, orig.(ps|pdf), with 0 0 150 150 bounding boxes are cropped to 25 25 125 125, which is the size of the rectangle between the filled central rectangle and the outer rectangle.

For the EPS, the correct inclusion special reads

PSfile="cropped.eps"  llx=25 lly=25 urx=125 ury=125 clip

It would be nice if that worked for the pdffile special in the same way. But with the current version 3.0.3 of dvisvgm, the embedded EPS and PDF look different in the SVG output. Typeset with latex and dvisvgm --zoom=-1 --bbox=papersize --font-format=woff2:

Input graphic files:

orig.pdf orig.eps cropped.pdf cropped.eps

\documentclass{article}
\usepackage[a6paper]{geometry}

\parindent=0pt

\begin{document}
\makebox[4em][l]{\rule{0pt}{100bp}PS:}%
x\special{PSfile="cropped.eps"  llx=25 lly=25 urx=125 ury=125 clip}\hspace{100bp}x

\vspace{5ex}

\makebox[4em][l]{\rule{0pt}{100bp}PDF:}%
x\special{pdffile="cropped.pdf" llx=25 lly=25 urx=125 ury=125 clip}\hspace{100bp}x
\end{document}

mgieseki commented 1 year ago

Unfortunately, this seems to be a bit more complicated. As far as I can tell, Ghostscript and probably also mutool transform the page coordinates so that the lower left corner of the MediaBox is always located at (0,0). As a result, clipping to the original MediaBox coordinates given in the PDF can lead to undesired results. I'll check whether I can apply additional translations to align the behavior of psfile and pdffile. Since I don't have much time at the moment, this might take a few weeks, though.

agrahn commented 1 year ago

I am going to solve this at the TeX level. The dvisvgm.def graphics driver uses the extractbb command to retrieve the file's MediaBox/CropBox coordinates from the PDF, and it is easy to perform the translations based on them. So maybe we can close this issue for now.
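For illustration, the translation mentioned here boils down to subtracting the lower left corner of the PDF's page box from the absolute coordinates. Below is a minimal Python sketch of that arithmetic only. It assumes that pdffile really measures llx, lly, urx and ury relative to the MediaBox origin, as observed above, and that cropped.pdf has the MediaBox [25 25 125 125]. The function name to_box_relative is hypothetical and is not part of dvisvgm.def, extractbb or dvisvgm.

```python
from typing import Tuple

# (llx, lly, urx, ury) in big points (bp)
BBox = Tuple[float, float, float, float]


def to_box_relative(absolute: BBox, page_box: BBox) -> BBox:
    """Shift an absolute bbox so that it is measured from the
    lower left corner of page_box (hypothetical helper)."""
    origin_x, origin_y = page_box[0], page_box[1]
    llx, lly, urx, ury = absolute
    return (llx - origin_x, lly - origin_y, urx - origin_x, ury - origin_y)


# Values from the example in this thread: the requested clip rectangle
# equals the (assumed) MediaBox of cropped.pdf.
print(to_box_relative((25, 25, 125, 125), (25, 25, 125, 125)))
# (0, 0, 100, 100)
```

Under these assumptions, the absolute rectangle 25 25 125 125 corresponds to 0 0 100 100 when measured from the MediaBox origin.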

mgieseki commented 1 year ago

Ok, great. Thank you for taking care of this. One less issue I have to fix. 😃

agrahn commented 1 year ago

Even better: I found out that the correct way to crop PDF pages is to add a /CropBox entry with the new bbox rect coords to the page dictionary and to leave the /MediaBox entry untouched. The lower left coordinates of the then visible area do not get translated to (0,0) by gs. As a result, pdffile behaves as expected when evaluating the provided bbox options. Nothing needs to be changed/fixed.

I updated cropped.pdf accordingly, and typesetting the code example above now produces the desired result. Thus, #231 is indeed a non-issue. 🎉
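As a rough illustration of the cropping approach described above, the following Python sketch formats the two box entries of a page dictionary, using the 0 0 150 150 and 25 25 125 125 rectangles from the example. The helper box_entries is hypothetical: it only builds the text of the entries and does not read or modify a PDF file.

```python
def box_entries(media_box, crop_box):
    """Return the /MediaBox and /CropBox entries of a PDF page
    dictionary as text (hypothetical helper, for illustration only)."""
    def fmt(box):
        # PDF arrays are space-separated numbers in square brackets.
        return "[" + " ".join(str(value) for value in box) + "]"

    return f"/MediaBox {fmt(media_box)}\n/CropBox {fmt(crop_box)}"


# /MediaBox keeps the original page size, /CropBox holds the cropped area.
print(box_entries((0, 0, 150, 150), (25, 25, 125, 125)))
# /MediaBox [0 0 150 150]
# /CropBox [25 25 125 125]
```

The point is that the /MediaBox entry stays at its original value and only the /CropBox entry carries the cropped rectangle.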