yuzutech / kroki

Creates diagrams from textual descriptions!
https://kroki.io
MIT License

Add support for PGF/TikZ diagrams #5

Closed ggrossetie closed 1 year ago

ggrossetie commented 5 years ago

https://en.wikipedia.org/wiki/PGF/TikZ

wenerme commented 1 year ago

Here is a PHP implementation for reference.

Trying to do this in JS:

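import { randomUUID } from 'node:crypto';
import os from 'node:os';
import path from 'node:path';
import { $, cd, fs, within } from 'zx';

export async function tikz2svg(s: string) {
  const id = randomUUID();
  // mkdtemp appends random characters to the prefix, so create it under the OS temp dir
  const tmp = await fs.mkdtemp(path.join(os.tmpdir(), 'tikz2svg-'));
  const tex = path.join(tmp, `${id}.tikz`);
  const pdf = path.join(tmp, `${id}.pdf`);
  const svg = path.join(tmp, `${id}.svg`);

  try {
    // within() scopes the cd() so the caller's working directory is not changed
    return await within(async () => {
      cd(tmp);
      await fs.writeFile(tex, s);
      // pdflatex writes <id>.pdf into the current directory; timeouts guard against hangs
      await $`timeout 5 pdflatex -interaction=batchmode -halt-on-error ${tex}`;
      // Convert page 1 of the PDF to SVG
      await $`timeout 1 pdf2svg ${pdf} ${svg} 1`;
      return await fs.readFile(svg, 'utf8');
    });
  } finally {
    // Always clean up the temporary directory, even if a step failed
    await fs.rm(tmp, { recursive: true, force: true });
  }
}

rfdonnelly commented 1 year ago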

Hi @ggrossetie. I have an initial implementation for TikZ support. If you are interested, I can submit a PR and make whatever changes you'd like to get it merged.

A preview of the changes is available at https://github.com/yuzutech/kroki/compare/main...rfdonnelly:kroki:tikz

This is based on my TikZ to SVG solution at https://github.com/rfdonnelly/docker-tikz2svg

Example

Example input:

https://github.com/rfdonnelly/docker-tikz2svg/blob/main/examples/complete-graph/input.tex

Example URL:

http://localhost:8000/tikz/svg/eNrNU8uO00AQvPsr-hIpkWJ7EzYIwmYlhIQEh5WQ9kScw9ju2EPmYWbaG0zkL9rP4MeY8WNJxEPigvDFVnV3dXW1ewJvjZZrKIkqu45jwi9MVgIjhRQTP3yNB8DGmfZvwrAwrCrjIJjAaxhB6EAP1VRqs4YPNVfEFbxHpsI7_e1RuOCNb-O6SM24jbhkRbQ3sUSZGtfgk09VGkX0uS-Ob4Mk11ktUVEmmLUnS0zlTGiFbZDUFiuWHViBJ6-0R_yX4KlhptmSrhiVdjcorRWhmYPlKkO454ePwC0oTZAJfEADqHRdlDBtkGZAGkrXS6ArZibl5BmBqUIg2MYSShsFicJjx5vIpnsHSYoFV6dRdTsCXlbFM6oNtlvHpexeGwm2ZBXuAoAJlQiyFsQrwTNGXCs4ciphLzQjOyqttLU8deuB-7K28A6sSyfwxULrCpzhdNSRI0wcP7KshETVMnXTudBpMY-iaP6iPbkE_0zgjdtf7WzpJ1sPuCsfBtoM5T8CLH9gzsAxAdIGwoVvOKhvziPXqz8WXg3RROkct7lhx3nGjVvHnCvlRLkFb66i5SqTO5jehYOWGTCCaeKGHrnWq-g6kzM4ta86xvZ3BrzsDFg8_28cWC6j1b82of8LFqsnE_5y2MUviFnHfC7n0unEX-KFfMwL3Ia38xRVDoYXJW2enQ_IZtDn3IRdziBJ4P6nRD9vG7hbc2kXl9YjT8f4Ha08o14=

Example Result:

image

Approach

Convert a TikZ picture to an SVG using the following steps:

  1. Pass user LaTeX document to latex CLI to generate a DVI file
  2. Pass generated DVI file to dvisvgm CLI to generate an SVG file
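The two steps above can be sketched as a pair of command lines. This is only a sketch and not the code from the linked branch: `buildPipeline` and the `Command` type are hypothetical names, and the flags shown (`-interaction=nonstopmode`, `-halt-on-error`, `--output`) are one plausible choice, not necessarily what the implementation passes.

```typescript
// Hypothetical helper: describes the latex -> dvisvgm pipeline as argv arrays.
// The flags are illustrative assumptions, not taken from the Kroki branch.
type Command = { bin: string; args: string[] };

function buildPipeline(jobName: string): Command[] {
  return [
    // Step 1: latex compiles the user document and writes <jobName>.dvi
    {
      bin: "latex",
      args: ["-interaction=nonstopmode", "-halt-on-error", `${jobName}.tex`],
    },
    // Step 2: dvisvgm converts the generated DVI file to <jobName>.svg
    {
      bin: "dvisvgm",
      args: [`--output=${jobName}.svg`, `${jobName}.dvi`],
    },
  ];
}

// Example: the command lines for a job named "diagram"
const pipelineLines: string[] = buildPipeline("diagram").map((c) =>
  [c.bin, ...c.args].join(" "),
);
```

Keeping the commands as data makes it easy to run each step with a timeout and to surface the failing step's output as the error message.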

Design Decisions

Caveats

Implementation

Unresolved Issues

Additional TODOs

ggrossetie commented 1 year ago

Wow, thank you for the detailed analysis/work.

TikZ support has been added to the Gateway Server Docker image instead of a Companion Server Docker image. This was done to simplify the initial implementation. It might make more sense to move this to a Companion Server instead.

I think it's fine if:

Dvisvgm is used to convert DVI to SVG instead of pdf2svg or another solution. This decision was largely informed by tex.stackexchange.com/questions/51757/how-can-i-use-tikz-to-make-standalone-svg-graphics.

I'm not familiar with either dvisvgm or pdf2svg, so I don't really know which one is best. Having said that, it seems that dvisvgm is more suited to TeX/LaTeX users than pdf2svg (which is a more generic tool). So I would say 👍🏻

The user-supplied LaTeX should use the "standalone" package to get well-formed output; otherwise a figure caption will be included. Kroki documentation should provide guidance here.

Does it mean that the conversion will fail to produce a valid SVG (or even return an error code)?

Creating diagrams using LaTeX/PGF/TikZ is made possible by a rich set of TeX packages. Users are limited to the TeX packages provided by the Alpine Linux texmf packages. Users cannot use arbitrary packages one might find on CTAN. The Alpine Linux texmf packages provide many of the popular TeX packages. A subset of the Alpine Linux texmf packages are provided by this solution.

I think that's a reasonable tradeoff.

TikZ version - TikZ::getVersion() returns "TODO". I'm not sure what to put for version since this utilizes two projects (latex and dvisvgm) and makes available several TeX packages via the Alpine Linux texmf packages.

😬 Again, I'm not familiar with the TeX ecosystem but it seems that TikZ is built upon https://github.com/pgf-tikz/pgf and pgf has releases/versions. Maybe we can use the pgf version?

The versions of both should probably be included.

Why not, I'm also fine using only tikz/pgf version.

Build time for dvisvgm - It takes some time to build dvisvgm from source. This increases the build time for the Docker image. It may be desirable to prebuild this stage like what was done for erd.

Definitely 👍🏻

rfdonnelly commented 1 year ago

Thanks for all the feedback. I thought I should dig into the various solutions for converting PGF/TikZ to SVG before submitting the PR. I have some interesting findings. In short, there is no ideal solution. Each has tradeoffs. I have narrowed it down to two solutions: dvisvgm and pdftocairo.

I've documented my findings at https://github.com/rfdonnelly/docker-tikz2svg/blob/main/COMPARISON.adoc.

Dvisvgm handles text better since it can embed fonts, and it produces the smallest SVG file size. However, it relies on deprecated behavior in Ghostscript, which will be removed in the next version of Ghostscript. It is also currently producing SVG that misrenders the focused ion beam system example, but this is likely a regression or an environment issue since the dvisvgm FAQ shows a correct render of this example. I plan to file issues on dvisvgm for the Ghostscript issue and the misrender issue.

dvisvgm: 62.8 KB (misrender but correct scale)
pdftocairo: 1.4 MB (correct render but wrong scale)

Pdftocairo is recommended over pdf2svg by the author of pdf2svg. It is part of the Poppler project and is available in the poppler-utils Alpine Linux package. It supports all the major output formats that Kroki supports, and it renders all my examples nicely. There are two negatives, though. It converts all text to paths, which prevents text selection/copy in browsers and increases file size. It also seems to have image scaling bugs: the SVG is smaller than it should be and the PNG is larger than it should be. The SVG file size could be reduced with an SVG optimizer, and I might be able to work around the SVG image scaling issue. I plan to file issues on Poppler for the image scaling issues.

Considering all of this, I think pdftocairo is the solution to go with for now. Alternate backends (e.g. dvisvgm) could be added or substituted in the future if/when the issues are resolved.

rfdonnelly commented 1 year ago

Update

dvisvgm vs pdftocairo

The dvisvgm "misrender" has been resolved. See https://github.com/mgieseki/dvisvgm/issues/225 for more info. I also think the dvisvgm dependency on the deprecated Ghostscript functionality only affects the dvisvgm PDF-to-SVG pipeline and not the DVI-to-SVG pipeline, but I've asked the dvisvgm maintainer for clarification. Assuming the Ghostscript deprecation is a non-issue for DVI to SVG, I'll move forward with a latex+dvisvgm implementation for converting PGF/TikZ to SVG.

Support for PGF/TikZ to JPEG, PNG, and PDF can be added as separate PRs implemented using latex+pdftocairo for JPEG and PNG and just latex for PDF.
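The format-to-tool split described above could be written down as a small lookup. This is a sketch under assumptions: `OutputFormat` and `toolchainFor` are hypothetical names, and "latex" stands for whichever TeX engine ends up being invoked for each output.

```typescript
// Hypothetical mapping from output format to the tool chain proposed in this thread.
type OutputFormat = "svg" | "png" | "jpeg" | "pdf";

function toolchainFor(format: OutputFormat): string[] {
  switch (format) {
    case "svg":
      // DVI produced by latex, converted to SVG by dvisvgm
      return ["latex", "dvisvgm"];
    case "png":
    case "jpeg":
      // latex output rasterized by pdftocairo
      return ["latex", "pdftocairo"];
    case "pdf":
      // latex output is already the final artifact
      return ["latex"];
  }
}
```

Only the SVG path depends on dvisvgm, which is why the other formats can land in separate PRs.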

Diagram Source

The input to latex (i.e. diagram source) will need to differ for SVG vs JPEG, PNG, and PDF. For SVG, it needs to specify the PGF/TikZ dvisvgm backend like:

example-svg.tex

\documentclass[dvisvgm]{standalone}
% The rest of the input follows ...

While JPEG, PNG, and PDF can use the default PGF/TikZ backend like this:

example-pdf.tex

\documentclass{standalone}
% The rest of the input follows ...

I'm not sure of the best way to handle this. Kroki could require the user to do this, or Kroki could insert this line for the user. I'm not a [La]TeX expert, so I don't know what the implications are of having Kroki insert the \documentclass. Requiring the user to do it would future-proof things but could lead to users getting it wrong and getting unexpected results.

I found that asciidoctor-diagram also supports TikZ, and it chooses to provide this line (among others) instead of requiring the user to do so.

They provide a template that looks like this:

\documentclass[border=2bp, tikz]{standalone}
\usepackage{tikz}
% INSERT USER DIAGRAM OPTIONS HERE
\begin{document}
\begingroup
\tikzset{every picture/.style={scale=1}}
% INSERT USER DIAGRAM SOURCE HERE
\endgroup
\end{document}

See https://github.com/asciidoctor/asciidoctor-diagram/blob/62f611df7966776eb5403691f351d350eb6c0e81/lib/asciidoctor-diagram/tikz/converter.rb#L42-L55

I don't really like this approach because I feel it requires too much conversion when copying and pasting a random PGF/TikZ example found on the internet into the format that asciidoctor-diagram wants. It also locks asciidoctor-diagram into a more constrained UI.

Maybe we could grep for the \documentclass line in the user-provided diagram source. If the user didn't provide one, then we infer one based on the desired output type and insert it into the diagram source.

ggrossetie commented 1 year ago

Maybe we could grep for the \documentclass line in the user-provided diagram source. If the user didn't provide one, then we infer one based on the desired output type and insert it into the diagram source.

Sounds reasonable. Do you know if \documentclass is mandatory or optional?

We should also make sure that we don't overwrite user-defined values in the brackets, i.e., \documentclass[border=2bp] should produce \documentclass[border=2bp, dvisvgm]. We should also check that dvisvgm won't conflict with other options such as tikz.
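For illustration, merging an option into an existing \documentclass could look roughly like this. It is a sketch, not a proposal for the final code: `addClassOption` is a hypothetical name, and the regular expression ignores edge cases such as a commented-out \documentclass or options that themselves contain a closing bracket.

```typescript
// Hypothetical helper: appends a class option to the first \documentclass
// command while preserving any options the user already supplied.
function addClassOption(source: string, option: string): string {
  // Group 1 captures the existing option list, if there is one
  const pattern = /\\documentclass(?:\[([^\]]*)\])?\{/;
  return source.replace(pattern, (_match: string, existing: string | undefined) => {
    const options = (existing ?? "")
      .split(",")
      .map((o) => o.trim())
      .filter((o) => o.length > 0);
    // Avoid adding the option twice if the user already set it
    if (!options.includes(option)) options.push(option);
    return `\\documentclass[${options.join(", ")}]{`;
  });
}
```

With this, `\documentclass[border=2bp]{standalone}` becomes `\documentclass[border=2bp, dvisvgm]{standalone}`, and `\documentclass{standalone}` becomes `\documentclass[dvisvgm]{standalone}`. Whether `dvisvgm` conflicts with other options is a separate question that the sketch does not answer.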

rfdonnelly commented 1 year ago

Do you know if \documentclass is mandatory or optional?

My LaTeX experience is shallow and from a long time ago. I think it is required if you have a preamble (stuff before \begin{document}), which most if not all PGF/TikZ diagrams will have.

We should also make sure that we don't overwrite user-defined values in the brackets, i.e., \documentclass[border=2bp] should produce \documentclass[border=2bp, dvisvgm]. We should also check that dvisvgm won't conflict with other options such as tikz.

I should clarify. I'm planning to only insert an inferred \documentclass command if the user-supplied document doesn't contain one. Otherwise, I'll just use the user-provided \documentclass as is. This is simple to implement in shell script and allows users to opt-out if necessary.
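The plan is to do this in shell script, but the logic can be sketched in TypeScript for illustration. `ensureDocumentClass` is a hypothetical name, and the inferred lines are the two \documentclass variants shown earlier in this thread.

```typescript
// Hypothetical helper: prepends an inferred \documentclass only when the
// user-supplied source has none; otherwise the source is returned as is.
function ensureDocumentClass(
  source: string,
  format: "svg" | "png" | "jpeg" | "pdf",
): string {
  // Match \documentclass on any line, unless a % comment marker precedes it
  if (/^[^%\n]*\\documentclass\b/m.test(source)) {
    return source; // the user provided one, so they have opted out of inference
  }
  const inferred =
    format === "svg"
      ? "\\documentclass[dvisvgm]{standalone}" // dvisvgm backend for SVG
      : "\\documentclass{standalone}"; // default backend for JPEG, PNG, PDF
  return `${inferred}\n${source}`;
}
```

Because a user-provided \documentclass is never modified, there is no option merging to get wrong, at the cost of the user having to add `dvisvgm` themselves if they want SVG output from their own document class line.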

Looking back through this thread, I realize I didn't respond to some of your earlier questions...

The user-supplied LaTeX should use the "standalone" package to get well-formed output; otherwise a figure caption will be included. Kroki documentation should provide guidance here.

Does it mean that the conversion will fail to produce a valid SVG (or even return an error code)?

If the "standalone" documentclass is not used, users will still get an image but it will include a figure caption.

Maybe we can use the pgf version?

I didn't realize PGF/TikZ had a version. Makes sense. I'll do this.

rfdonnelly commented 1 year ago

Here's a recipe for obtaining the PGF/TikZ version:

echo '\\usepackage{tikz} \\message{^^Jpgfversion:\\pgfversion:}' > pgfversion.tex
latex pgfversion.tex | grep pgfversion | cut -d: -f2

Example output:

3.1.9a
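If the version needs to be extracted in code and not in shell, the `grep | cut` part of the recipe can be mirrored as follows. `parsePgfVersion` is a hypothetical name, and unlike `grep` it only looks at the first matching line.

```typescript
// Hypothetical helper mirroring `grep pgfversion | cut -d: -f2`:
// finds the first output line containing "pgfversion" and returns
// its second ":"-separated field.
function parsePgfVersion(latexOutput: string): string | undefined {
  const line = latexOutput
    .split("\n")
    .find((l) => l.includes("pgfversion"));
  // Fields are "pgfversion", "<version>", "" for a line like "pgfversion:3.1.9a:"
  return line?.split(":")[1];
}
```

For latex output containing the line `pgfversion:3.1.9a:`, this returns `3.1.9a`; if no such line is present, it returns `undefined`.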