ncatlab / nlab

Source code for the nLab
https://ncatlab.org

Diagram renderer: restrict SVG extraction to first page in PDF #24

Closed sattlerc closed 2 years ago

sattlerc commented 2 years ago

@distler reports here:

Instiki does not support multiple pages in SVG; I don't know whether nlab does. pdftocairo -svg emits multipage svg documents under certain circumstances.

When I test with a two-page PDF, pdftocairo -svg indeed generates a page set with multiple pages. I was able to suppress this behaviour by passing the additional options -f 1 -l 1. This restricts the conversion to the first page of the PDF, mirroring the behaviour of pdf2svg.
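
A minimal sketch of that invocation, assuming the conversion is driven from a Python wrapper. The function names `first_page_svg_command` and `convert_first_page` are hypothetical and not taken from the renderer's code; only the `-svg`, `-f` and `-l` options are from the discussion above.

```python
import subprocess


def first_page_svg_command(pdf_path, svg_path):
    """Build a pdftocairo command line that converts only page 1 to SVG."""
    return [
        "pdftocairo",
        "-svg",
        "-f", "1",  # first page to convert
        "-l", "1",  # last page to convert
        pdf_path,
        svg_path,
    ]


def convert_first_page(pdf_path, svg_path):
    """Run the conversion; raises CalledProcessError on a non-zero exit."""
    subprocess.run(first_page_svg_command(pdf_path, svg_path), check=True)
```

Keeping the command construction separate from the `subprocess` call makes it possible to check the options without having pdftocairo installed.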

Perhaps the renderer should also:

distler commented 2 years ago

Here's a simple testcase:

\begin{tikzpicture}a\end{tikzpicture}

Compare the output using pdf2svg versus pdftocairo -svg.

sattlerc commented 2 years ago

I can't reproduce this (EDIT: see next post). Test file:

\documentclass{article}
\usepackage{tikz}

\begin{document}
\begin{tikzpicture}a\end{tikzpicture}
\end{document}

Output of pdf2svg (0.2.3):

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="612" height="792" viewBox="0 0 612 792">
<defs>
<g>
<g id="glyph-0-0">
</g>
<g id="glyph-0-1">
<path d="M 2.9375 -6.375 C 2.9375 -6.625 2.9375 -6.640625 2.703125 -6.640625 C 2.078125 -6 1.203125 -6 0.890625 -6 L 0.890625 -5.6875 C 1.09375 -5.6875 1.671875 -5.6875 2.1875 -5.953125 L 2.1875 -0.78125 C 2.1875 -0.421875 2.15625 -0.3125 1.265625 -0.3125 L 0.953125 -0.3125 L 0.953125 0 C 1.296875 -0.03125 2.15625 -0.03125 2.5625 -0.03125 C 2.953125 -0.03125 3.828125 -0.03125 4.171875 0 L 4.171875 -0.3125 L 3.859375 -0.3125 C 2.953125 -0.3125 2.9375 -0.421875 2.9375 -0.78125 Z M 2.9375 -6.375 "/>
</g>
</g>
</defs>
<g fill="rgb(0%, 0%, 0%)" fill-opacity="1">
<use xlink:href="#glyph-0-1" x="303.133" y="702.635"/>
</g>
</svg>

Output of pdftocairo -svg (22.06.0) is identical:

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="612" height="792" viewBox="0 0 612 792">
<defs>
<g>
<g id="glyph-0-0">
</g>
<g id="glyph-0-1">
<path d="M 2.9375 -6.375 C 2.9375 -6.625 2.9375 -6.640625 2.703125 -6.640625 C 2.078125 -6 1.203125 -6 0.890625 -6 L 0.890625 -5.6875 C 1.09375 -5.6875 1.671875 -5.6875 2.1875 -5.953125 L 2.1875 -0.78125 C 2.1875 -0.421875 2.15625 -0.3125 1.265625 -0.3125 L 0.953125 -0.3125 L 0.953125 0 C 1.296875 -0.03125 2.15625 -0.03125 2.5625 -0.03125 C 2.953125 -0.03125 3.828125 -0.03125 4.171875 0 L 4.171875 -0.3125 L 3.859375 -0.3125 C 2.953125 -0.3125 2.9375 -0.421875 2.9375 -0.78125 Z M 2.9375 -6.375 "/>
</g>
</g>
</defs>
<g fill="rgb(0%, 0%, 0%)" fill-opacity="1">
<use xlink:href="#glyph-0-1" x="303.133" y="702.635"/>
</g>
</svg>

Output of pdftocairo -svg (21.01.0) running on CentOS 9 (different machine):

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="612pt" height="792pt" viewBox="0 0 612 792" version="1.2">
<defs>
<g>
<symbol overflow="visible" id="glyph0-0">
<g transform="matrix(1,0,0,1,0,-7)">
<rect x="2" y="1" width="1" height="1"/>
<rect x="2" y="2" width="1" height="1"/>
<rect x="2" y="3" width="1" height="1"/>
<rect x="2" y="4" width="1" height="1"/>
<rect x="2" y="5" width="1" height="1"/>
<rect x="2" y="6" width="1" height="1"/>
</g>
</symbol>
</g>
</defs>
<g id="surface1">
<g style="fill:rgb(0%,0%,0%);fill-opacity:1;">
  <use xlink:href="#glyph0-0" x="303.133" y="702.635"/>
</g>
</g>
</svg>

sattlerc commented 2 years ago

I should have used the actual LaTeX file used by the diagram renderer:

\documentclass[class=scrartcl]{standalone}
\KOMAoptions{fontsize=12pt}
\usepackage{tikz}
\usepackage{amsmath,amssymb,amsthm,mathtools}

\begin{document}
\begin{tikzpicture}a\end{tikzpicture}
\end{document}

Now pdflatex produces a weird PDF file with 2 pages. I think the a is ignored: an empty tikzpicture (\begin{tikzpicture}\end{tikzpicture}) produces the same PDF.

When I run pdf2svg (0.2.3), I get an empty SVG as expected (but with a strange width):

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="416.693" height="0" viewBox="0 0 416.693 0">
</svg>

When I run pdftocairo (22.06.0), I get an error because the height is 0:

Internal Error: cairo context error: invalid matrix (not invertible)<0a>
cairo error: invalid matrix (not invertible)

I think it is sensible to reject empty diagrams.
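
One way such a rejection could look, as a rough sketch: parse the generated SVG and treat a zero width or height as an empty diagram. The helper names `svg_dimensions` and `is_empty_diagram` are hypothetical. This only covers the pdf2svg-style output shown above; with pdftocairo the conversion itself fails, so the caller would have to treat that error as a rejection too.

```python
import re
import xml.etree.ElementTree as ET


def svg_dimensions(svg_text):
    """Return (width, height) of the root <svg> element as floats.

    Unit suffixes such as "pt" are ignored; a missing or unparsable
    attribute counts as 0.
    """
    root = ET.fromstring(svg_text.encode("utf-8"))

    def length(value):
        match = re.match(r"\s*([0-9]*\.?[0-9]+)", value or "")
        return float(match.group(1)) if match else 0.0

    return length(root.get("width")), length(root.get("height"))


def is_empty_diagram(svg_text):
    """True if the SVG has no drawable area (zero width or zero height)."""
    width, height = svg_dimensions(svg_text)
    return width <= 0 or height <= 0
```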

sattlerc commented 2 years ago

Looks like a bug in the standalone class or TikZ. It produces a PDF with two pages for an empty TikZ picture.
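
To catch this case before conversion, the renderer could check the page count of the PDF. The sketch below assumes pdfinfo (from the same poppler tools as pdftocairo) is available and reads the `Pages:` line of its output; `parse_page_count` and `pdf_page_count` are hypothetical names, not existing renderer functions.

```python
import re
import subprocess


def parse_page_count(pdfinfo_output):
    """Extract the page count from pdfinfo's text output, or None if absent."""
    match = re.search(r"^Pages:\s+(\d+)", pdfinfo_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def pdf_page_count(pdf_path):
    """Run pdfinfo on the file and return its page count (None if not found)."""
    result = subprocess.run(
        ["pdfinfo", pdf_path],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_page_count(result.stdout)
```

A count other than 1 for a standalone diagram would then be a sign that something went wrong upstream, as with the empty TikZ picture here.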

sattlerc commented 2 years ago

Possibly related: https://github.com/pgf-tikz/pgf/issues/724