leo-colisson / robust-externalize

A LaTeX library to cache pictures (including tikz, python code, and more) in a robust, customizable, and pure way.

Clarify how robust-externalize works regarding fonts and settings from main document #13

Closed dflvunoooooo closed 7 months ago

dflvunoooooo commented 7 months ago

Just to clarify how robustExternalize is working. It may be explained in the documentation.

If I set a font and other settings in the main document, are those settings carried over to the cached file, or does it not matter because the generated code is run again via the main document? I am asking because one can externalize a LaTeX part and has to provide some settings in the robustExternalize={} option. Suppose, for example, that I set a certain font in my main document and externalize both a tikz figure and Python code that also creates a figure. robustExternalize then creates those figures and saves them. Are they then processed again by the main compilation, with all settings applied to them, so that the font is the same?

tobiasBora commented 7 months ago

Hi,

I tried to document this already, in the section "How it works" in https://mirror.koddos.net/CTAN/macros/latex/contrib/robust-externalize/robust-externalize.pdf Can it be clarified further?

This library is conceptually simple: for pdf, by default, it generates a .tex file with a standalone class (it adds some margin by default to allow overlay) and the preamble that you configured in, e.g.

\robExtConfigure{
  add to preset={latex}{
    add to preamble={code to add to preamble}
  }
}

, it compiles this file using pdflatex by default, and it includes the pdf file with \includegraphics (trimming the margin added for overlay), so there is nothing magic like font replacement if you compile to a pdf. There are some subtleties to deal with dependencies, compute the depth of the content, etc., so we actually need to write the depth in an additional file somehash-out.tex that we input before the \includegraphics, but this is not really important. In practice, we obtain the source, the compilation command, and the include command by expanding three corresponding placeholders, which you can set to a different value in each preset; this is explained in the above section of the documentation.

But in practice, the code is quite easy to debug: for each cached item, we create 2 files in the robustExternalize folder, somehash.deps and somehash.tex. The first line of the .deps file contains the compilation command (you just need to replace the source placeholder with the actual .tex filename), but you can also see the exact compilation command in the log when you compile, or in a file if you compile in manual mode. The .tex contains the source, and after the compilation the .pdf is the file that is included via \includegraphics by default.

So you can see that font information is not forwarded to the picture. If you want to change the font in the picture, just use, for instance, add to preamble as described above, or use add to latex options={10pt}. You can also change the __ROBEXT_MAIN_CONTENT__ placeholder for more advanced stuff; we use it to forward macros, for instance.
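
As a minimal, untested sketch of the two options just mentioned, assuming the default pdflatex engine: the following sets the font size and loads a font package in every cached latex picture. The keys add to latex options and add to preamble are the ones named above; the choice of the times package is only an illustration, and the comment about class options is an assumption about where the options end up.

```latex
\robExtConfigure{
  add to preset={latex}{
    % Assumed to be added to the \documentclass options of the generated file:
    add to latex options={10pt},
    % Copied into the preamble of every cached file (illustrative font choice):
    add to preamble={\usepackage{times}},
  },
}
```

Changing either setting should change the generated source, so the affected pictures would be recompiled on the next run.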

dflvunoooooo commented 7 months ago

Thank you very much for that clarification! I imagined that the main document parses all cached files, since they are .tex files, and therefore applies all settings from the main document.

Is there a possibility to pass along my settings? I would like the font in the main document and in the cached files to be the same, and all settings for math environments as well. Otherwise I have to set everything twice.

Edit: Otherwise something like this happens. Screenshot_20231215_214755 The main document has a different font, or even a different font family, than the figure. That way, text or math in the picture looks completely different from the main document. In a tikzpicture that is not cached, for example, the text is set with the options of the main document.

dflvunoooooo commented 7 months ago

All right, I can load all my packages with their settings. But I cannot set the font or the language. These options lead to problems:

\setmainfont{Libertinus Serif} 
\setsansfont{Libertinus Sans} 
\setmathfont{Libertinus Math} 
\setdefaultlanguage[spelling=new,latesthyphen=true]{german} 

Shall I open another issue for clarity?

Edit: For clarity, this is my file:

\documentclass{scrreport}

\usepackage{amsmath}                                % Unverzichtbare Mathe-Befehle.
\usepackage{tikz}                                   % Für Zeichnungen direkt in LaTeX

\usepackage{blindtext}

\usepackage{robust-externalize}
\setPlaceholder*{__ROBEXT_LATEX_ENGINE__}{xelatex}  % Für das robustExternalize-Paket.
\robExtConfigure{
    add to preset={latex}{
        add to preamble={\setmainfont{Libertinus Serif}},
    },
}

\begin{document}
\blindtext

\begin{figure}[h]
    \begin{CacheMe}{tikzpicture}[scale=1.2]
        % Achsen
        \draw[->] (-0.1,0) -- (8.5,0) node[below, align=center] {Anzahl der\\ Atome \(n\)};
        \draw[->] (0,-0.1) -- (0,3.1) node[left] {\(E\)};
        \node[below] at (0.5,0) {1};
        \node[below] at (1.5,0) {2};
        \node[below] at (2.5,0) {3};
        \node[below] at (4.5,0) {viele};
        \node[below] at (7,0) {\(>10^{23}\)};
    \end{CacheMe}
    \centering
    \caption[Aufspaltung der beliebigen Energieniveaus]{Aufspaltung der beliebigen Energieniveaus \(E_{n-1}\) und \(E_{n}\) bis sie in Bändern übergehen.}
    \label{tikz:niveauZuBändern}
\end{figure}

\end{document}

The error is:

LaTeX Info: File `robExt-remove-old-figures.py' already exists on the system.
            Not generating it from this source.

) (./test_gnuplotrobust-externalize.aux)
(/usr/local/texlive/2023/texmf-dist/tex/latex/base/ts1cmr.fd)
The file robustExternalize/robExt-4AA2D1E911846CBEC186E314EDCA9560.tex already 
exists.

[robExt]We will start the compilationusing: cd robustExternalize/ && xelatex -h
alt-on-error "robExt-4AA2D1E911846CBEC186E314EDCA9560.tex".This is XeTeX, Version 3.141592653-2.6-0.999995 (TeX Live 2023) (preloaded format=xelatex)
 restricted \write18 enabled.
entering extended mode
(./robExt-4AA2D1E911846CBEC186E314EDCA9560.tex
LaTeX2e <2023-11-01>
L3 programming layer <2023-12-11>
(/usr/local/texlive/2023/texmf-dist/tex/latex/standalone/standalone.cls
Document Class: standalone 2022/10/10 v1.3b Class to compile TeX sub-files stan
dalone
(/usr/local/texlive/2023/texmf-dist/tex/latex/tools/shellesc.sty)
(/usr/local/texlive/2023/texmf-dist/tex/generic/iftex/ifluatex.sty
(/usr/local/texlive/2023/texmf-dist/tex/generic/iftex/iftex.sty))
(/usr/local/texlive/2023/texmf-dist/tex/latex/xkeyval/xkeyval.sty
(/usr/local/texlive/2023/texmf-dist/tex/generic/xkeyval/xkeyval.tex
(/usr/local/texlive/2023/texmf-dist/tex/generic/xkeyval/xkvutils.tex
(/usr/local/texlive/2023/texmf-dist/tex/generic/xkeyval/keyval.tex))))
(/usr/local/texlive/2023/texmf-dist/tex/latex/standalone/standalone.cfg)
(/usr/local/texlive/2023/texmf-dist/tex/latex/base/article.cls
Document Class: article 2023/05/17 v1.4n Standard LaTeX document class
(/usr/local/texlive/2023/texmf-dist/tex/latex/base/size10.clo)))
! Undefined control sequence.
<recently read> \setmainfont 

l.2  \setmainfont
                  {Libertinus Serif} \usepackage {tikz}
No pages of output.
Transcript written on robExt-4AA2D1E911846CBEC186E314EDCA9560.log.
system returned with code 256

! Package robExt Error: The pdf file
(robExt)               
robustExternalize/robExt-4AA2D1E911846CBEC186E314EDCA9560.pdf
(robExt)                is not present. The compilation command "cd
(robExt)                robustExternalize/ && xelatex -halt-on-error
(robExt)                "robExt-4AA2D1E911846CBEC186E314EDCA9560.tex"" used
(robExt)                to compile the environment on line 30 certainly
(robExt)                failed, see logs above or in
(robExt)               
robustExternalize/robExt-4AA2D1E911846CBEC186E314EDCA9560.log.

And the error in said file:

! Undefined control sequence.
<recently read> \setmainfont 

l.2  \setmainfont
                  {Libertinus Serif} \usepackage {tikz} 

dflvunoooooo commented 7 months ago

Interestingly, the behavior is different for a CacheMe tikz environment and a CacheMeCode gnuplot environment. The first sets the text with the settings added to the add to preset={tikz}{} option, as shown in the picture above. In a CacheMeCode gnuplot environment with the tikz terminal, the settings from the main document get applied. Even if I explicitly load a serif font in the add to preset={latex}{} option, the text is set with the sans serif font of the main document, as shown in the picture below.

Screenshot_20231216_090824

The last case is what I would expect.

This also means that any change in the add to preset={tikz}{} option will recompile all cached tikz figures, but a change in add to preset={latex}{} will not recompile the cached gnuplot figures.

Edit: I can further confirm this with this document, where the dsfont package is loaded in the main document for the \mathds{N} command, but nothing is added to any robustExternalize preamble:

\documentclass[]{scrreport}

\usepackage{amsmath}                                % Unverzichtbare Mathe-Befehle.
\usepackage{tikz}                                   % Für Zeichnungen direkt in LaTeX

\usepackage{gnuplottex}                             % Um Gnuplot direkt in Latex erzeugen zu lassen
\usepackage{gnuplot-lua-tikz}                       % Für die tikz-Umgebung in gnplottex. Ermöglicht Export von Variablen

\usepackage{blindtext}
\usepackage{dsfont}
\usepackage{robust-externalize}
\setPlaceholder*{__ROBEXT_LATEX_ENGINE__}{xelatex}  % Für das robustExternalize-Paket.

\begin{document}
\blindtext

\begin{CacheMeCode}{gnuplot, tikz terminal={}}
set xlabel '\(\mathds{N}\)'
plot sin(x) 
\end{CacheMeCode}

\end{document}

dflvunoooooo commented 7 months ago

The error with \setmathfont{} and \setdefaultlanguage in add to preset only applies to the latex preset, not the tikz preset. If they are added to the tikz preset, everything compiles perfectly.

Edit: And since the latex package seems to load all packages from the main document, this isn't relevant. Or is this behaviour not wanted?

tobiasBora commented 7 months ago

So you get an error only because you need \usepackage{fontspec}:

\documentclass{scrreport}

\usepackage{amsmath}                                % Unverzichtbare Mathe-Befehle.
\usepackage{tikz}                                   % Für Zeichnungen direkt in LaTeX

\usepackage{blindtext}

\usepackage{robust-externalize}
\robExtConfigure{
  add to preset={latex}{
    use xelatex,
    add to preamble={
      \usepackage{fontspec}
      \setmainfont{Libertinus Serif}
    },
  },
}

\begin{document}
\blindtext

\begin{figure}[h]
    \begin{CacheMe}{tikzpicture}[scale=1.2]
        % Achsen
        \draw[->] (-0.1,0) -- (8.5,0) node[below, align=center] {Anzahl der\\ Atome \(n\)};
        \draw[->] (0,-0.1) -- (0,3.1) node[left] {\(E\)};
        \node[below] at (0.5,0) {1};
        \node[below] at (1.5,0) {2};
        \node[below] at (2.5,0) {3};
        \node[below] at (4.5,0) {viele};
        \node[below] at (7,0) {\(>10^{23}\)};
    \end{CacheMe}
    \centering
    \caption[Aufspaltung der beliebigen Energieniveaus]{Aufspaltung der beliebigen Energieniveaus \(E_{n-1}\) und \(E_{n}\) bis sie in Bändern übergehen.}
    \label{tikz:niveauZuBändern}
\end{figure}

\end{document}

The tikz preset is exactly like latex (in the sense that it loads latex when you call it), except that it also loads \usepackage{tikz}, so I guess that in your case it works because tikz loads internally \usepackage{fontspec}.

Fully-automatically forwarding the current font by default is not so easy because there are so many ways to load a font in LaTeX (by loading a package like \usepackage{times}, by using xelatex or lualatex, by calling locally \rmfamily, by loading a different class… and here I'm not even mentioning math font where each symbol can use a different font), and this is even more complicated if the main file is compiled with a different engine (e.g. what should I do if the main document is compiled with lualatex but the inner cached images are compiled with pdflatex, that supports very few fonts by default?). For these reasons, it is simpler to simply let the user load fonts as they want.

But I can understand that you might prefer to avoid duplicating the code that sets the font, for instance. The simplest is certainly to load the very latest version I just pushed, where you can run:

\runHereAndInPreambleOfCachedFiles{
  \usepackage{fontspec}
  \setmainfont{Times New Roman}
}

to run some code both here and in the preamble of cached files. Full code:

\documentclass{scrreport}

\usepackage{amsmath}                                % Unverzichtbare Mathe-Befehle.
\usepackage{tikz}                                   % Für Zeichnungen direkt in LaTeX

\usepackage{blindtext}

\usepackage{robust-externalize}

\runHereAndInPreambleOfCachedFiles{
  \usepackage{fontspec}
  \setmainfont{Times New Roman}
}

\robExtConfigure{
  add to preset={latex}{
    use xelatex,
  },
}

\begin{document}
\blindtext

\begin{figure}[h]
    \begin{CacheMe}{tikzpicture}[scale=1.2]
        % Achsen
        \draw[->] (-0.1,0) -- (8.5,0) node[below, align=center] {Anzahl der\\ Atome \(n\)};
        \draw[->] (0,-0.1) -- (0,3.1) node[left] {\(E\)};
        \node[below] at (0.5,0) {1};
        \node[below] at (1.5,0) {2};
        \node[below] at (2.5,0) {3};
        \node[below] at (4.5,0) {viele};
        \node[below] at (7,0) {\(>10^{23}\)};
    \end{CacheMe}
    \centering
    \caption[Aufspaltung der beliebigen Energieniveaus]{Aufspaltung der beliebigen Energieniveaus \(E_{n-1}\) und \(E_{n}\) bis sie in Bändern übergehen.}
    \label{tikz:niveauZuBändern}
\end{figure}
\end{document}

If the font is likely to change often, you can also create a new placeholder with set placeholder eval={__MY_FONT__}{\myCurrentFont} and automatically change \myCurrentFont when you change your font, either by patching \setmainfont or by creating a new macro \mysetmainfont:

\documentclass{scrreport}

\usepackage{amsmath}                                % Unverzichtbare Mathe-Befehle.
\usepackage{tikz}                                   % Für Zeichnungen direkt in LaTeX

\usepackage{blindtext}
\usepackage{fontspec}

% Create a new command \mysetmainfont which forwards the font to the picture.
\NewDocumentCommand{\mysetmainfont}{m}{%
  \def\myCurrentFont{#1}% Stores the name of the current font in \myCurrentFont
  \setmainfont{#1}%
}

% Use mysetmainfont from now to change font:
\mysetmainfont{Times New Roman}

\usepackage{robust-externalize}
\robExtConfigure{
  add to preset={latex}{
    use xelatex,
    set placeholder eval={__MY_FONT__}{\myCurrentFont},
    add to preamble={
      \usepackage{fontspec}
      \setmainfont{__MY_FONT__}
    },
  },
}

\begin{document}
\blindtext

\begin{figure}[h]
    \begin{CacheMe}{tikzpicture}[scale=1.2]
        % Achsen
        \draw[->] (-0.1,0) -- (8.5,0) node[below, align=center] {Anzahl der\\ Atome \(n\)};
        \draw[->] (0,-0.1) -- (0,3.1) node[left] {\(E\)};
        \node[below] at (0.5,0) {1};
        \node[below] at (1.5,0) {2};
        \node[below] at (2.5,0) {3};
        \node[below] at (4.5,0) {viele};
        \node[below] at (7,0) {\(>10^{23}\)};
    \end{CacheMe}
    \centering
    \caption[Aufspaltung der beliebigen Energieniveaus]{Aufspaltung der beliebigen Energieniveaus \(E_{n-1}\) und \(E_{n}\) bis sie in Bändern übergehen.}
    \label{tikz:niveauZuBändern}
\end{figure}
\end{document}

I imagined, that the main document parses all cached files, since they are .tex files, and therefore applies all settings in from main.

Oh no, the whole point is that the time-consuming part is to turn the .tex into a pdf when compiling a tikz picture. So if we just load the .tex, there is no time benefit, really, that's why we need to compile it to a pdf first.

Interestingly is the behavior different for a CacheMe tikz environment or a CacheMeCode gnuplot environment

This is normal: for gnuplot with a tikz terminal, there are 2 steps:

  1. when running gnuplot it creates a .tex
  2. then we \input that .tex file to generate an image

so by default, the gnuplot preset only caches step 1. Gnuplot is therefore only run the first time, but the image is generated from the .tex file each time you compile the document: the \input is still made in the main pdf, and the image therefore inherits all font settings etc. If you want to cache step 2 as well, you either need to enable \cacheTikz or load the cache tikz style in the gnuplot style (cf. the documentation for an example)… but in that case you will of course need to load the font manually, as before.
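
A rough, untested sketch of what caching step 2 could look like, assuming \cacheTikz is called in the main preamble after loading robust-externalize. The dsfont package stands in for whatever the labels need; other packages used by the gnuplot output (for instance gnuplot-lua-tikz) would presumably have to be added to the tikz preset in the same way.

```latex
\usepackage{robust-externalize}

% Assumption: once enabled, tikzpicture environments are compiled separately
% and cached, including the ones \input from gnuplot's tikz terminal.
\cacheTikz

\robExtConfigure{
  add to preset={tikz}{
    add to preamble={
      % Cached pictures no longer inherit this from the main document:
      \usepackage{dsfont}
    },
  },
}
```

The trade-off is the one described above: the picture is no longer typeset in the main document, so fonts and packages must be forwarded explicitly.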

I hope it is clearer, I just created a new section "A note on font handling" in the documentation.

tobiasBora commented 7 months ago

(note that if you use lualatex, you don't even need to change setmainfont, cf documentation)

dflvunoooooo commented 7 months ago

So you get an error only because you need \usepackage{fontspec}:

Oh that was stupid of me, sorry. I somehow thought \setmainfont was a XeLatex function.

Fully-automatically forwarding the current font by default is not so easy because there are so many ways to load a font in LaTeX (by loading a package like \usepackage{times}, by using xelatex or lualatex …

Yeah, that seems reasonable; it is impossible to check every possibility.

But I can understand that you might prefer to avoid duplicating the code setting the font for instance. The simpler is certainly to load the very latest version I just pushed, where you can run: \runHereAndInPreambleOfCachedFiles{ …

Perfect, thank you for making my life easier! :) I will test the placeholder as well.

Oh no, the whole point is that the time-consuming part is to turn the .tex into a pdf when compiling a tikz picture …

That makes sense. Do I understand correctly that this is the difference between CacheMe and CacheMeCode? The first will cache the conversion from code to pdf and the latter will cache the execution of the code to tex?

This is normal: for gnuplot with a tikz terminal, there are 2 steps: …

I see. Thank you very much for the clarifications! This is already explained in the documentation, I am blown away! Great work!

tobiasBora commented 7 months ago

That makes sense. Do I understand correctly that this is the difference between CacheMe and CacheMeCode? The first will cache the conversion from code to pdf and the latter will cache the execution of the code to tex?

Not at all: CacheMe and CacheMeCode are basically doing exactly the same thing, except that CacheMeCode handles better some characters that you will not find in a normal LaTeX document, like %, # etc…

More precisely, in CacheMe for instance, anything following a % will be removed as it is a comment in LaTeX, two consecutive newlines will be removed and replaced with \par, a space will be added after most macros, etc. While this is not a problem for LaTeX code, it can significantly change Python code, for instance. On the other hand, CacheMeCode will preserve these characters (at least if you do not nest it inside other macros and some environments), but the actual code of CacheMeCode is not trivial compared to CacheMe, as it needs to write the content of the string to a file etc.

So that's why I recommend using CacheMeCode for gnuplot/python/…, since you certainly want to use % or #, while CacheMe is typically used for LaTeX. But you can perfectly well use one instead of the other if you do not have weird characters, blank lines, etc.
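
To illustrate why this matters, here is a small, untested sketch of Python code inside CacheMeCode. The preset name python, and the assumption that it includes the file written to __ROBEXT_OUTPUT_PDF__, are not confirmed in this thread; matplotlib is only an illustrative choice.

```latex
\begin{CacheMeCode}{python}
# A Python comment: '#' is a parameter character in ordinary LaTeX input.
import matplotlib.pyplot as plt

xs = list(range(10))
# '%' is the modulo operator here, not the start of a LaTeX comment:
ys = [x % 3 for x in xs]

plt.plot(xs, ys)
# Assumption: this placeholder expands to the path of the expected output file.
plt.savefig("__ROBEXT_OUTPUT_PDF__")
\end{CacheMeCode}
```

Inside CacheMe, the same body would lose everything after each % and have its blank lines turned into \par, so the Python code would no longer be valid.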

What dictates the action performed is the options/presets that you pass to them. These options are executed after setting the placeholder __ROBEXT_MAIN_CONTENT_ORIG__ to the text typed by the user in the CacheMe* environment, and by the end of the execution of the preset, basically 3 things should be set, as explained in the section "How it works" of the documentation: the template (the source to write to the file), the compilation command, and the include command.

So for instance, if you want to write a preset that does basically nothing (i.e. it creates a file and then inputs that file), you can do (not tested):

\robExtConfigure{
  new preset={my useless preset}{
    % Says that __ROBEXT_TEMPLATE__ should be replaced with __ROBEXT_MAIN_CONTENT_ORIG__, itself equal to the
    % text typed by the user.
    % (better to use the shortcut `set template={__ROBEXT_MAIN_CONTENT_ORIG__}`):
    set placeholder={__ROBEXT_TEMPLATE__}{__ROBEXT_MAIN_CONTENT_ORIG__},

    % We say that the compilation command just copies the source to the output file (must be present unless
    % a compilation error occurs). Better to use the shortcut `set compilation command={cp …}`.
    set placeholder={__ROBEXT_COMPILATION_COMMAND__}{cp "__ROBEXT_SOURCE_FILE__" "__ROBEXT_OUTPUT_PDF__"},

    % We say that we must include the __ROBEXT_OUTPUT_PDF__ file (whose value is equal to \robExtFinalHash.pdf,
    % even if it contains no pdf). We just need to add the path to the cache folder before:
    custom include command={\input{\robExtAddCachePathAndName{\robExtFinalHash.pdf}}},
  },
}

and then:

\begin{CacheMe}{my useless preset}
Foo
\end{CacheMe}

would be exactly like writing Foo directly.

Of course, this is fairly useless since compiling this way takes even longer. So it is helpful to choose better compilation commands, for instance to pre-build a PDF for an image, as tikz is really slow to run. The latex preset is basically (in its simplified form, check the doc for its exact form, which handles depth, overlay, etc.):

\robExtConfigure{
  new preset={my simple latex preset}{
    % Says that __ROBEXT_TEMPLATE__ should be replaced with __ROBEXT_MAIN_CONTENT_ORIG__ + the latex template
    % (better to use the shortcut `set template={…}`):
    set placeholder={__ROBEXT_TEMPLATE__}{\documentclass{standalone}\begin{document}__ROBEXT_MAIN_CONTENT_ORIG__\end{document}},

    % We say that the compilation command must run pdflatex:
    set placeholder={__ROBEXT_COMPILATION_COMMAND__}{pdflatex "__ROBEXT_SOURCE_FILE__"},

    % We say that we must include the __ROBEXT_OUTPUT_PDF__ file (whose value is equal to \robExtFinalHash.pdf,
    % we just need to add the path to the cache folder before):
    custom include command={\includegraphics{\robExtAddCachePathAndName{\robExtFinalHash.pdf}}},
  },
}

This way, typing:

\begin{CacheMe}{my simple latex preset}
Foo
\end{CacheMe}

will create a latex file containing:

\documentclass{standalone}\begin{document}Foo\end{document}

compile the document with:

pdflatex somesource.tex

to obtain a file somesource.pdf, and the output file will be included with:

\includegraphics{robustExternalize/somesource.pdf}

according to the 3 commands specified above.
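
The same simplified preset can presumably be written with the shortcuts named in the comments above, set template and set compilation command. This is an untested sketch; the preset name is made up for illustration.

```latex
\robExtConfigure{
  new preset={my simple latex preset with shortcuts}{
    % Shortcut for setting __ROBEXT_TEMPLATE__:
    set template={\documentclass{standalone}\begin{document}__ROBEXT_MAIN_CONTENT_ORIG__\end{document}},
    % Shortcut for setting __ROBEXT_COMPILATION_COMMAND__:
    set compilation command={pdflatex "__ROBEXT_SOURCE_FILE__"},
    % Unchanged: include the compiled PDF from the cache folder.
    custom include command={\includegraphics{\robExtAddCachePathAndName{\robExtFinalHash.pdf}}},
  },
}
```

It would be used exactly like the preset above, by passing its name to CacheMe.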

For gnuplot this is similar, except that the compilation command is like gnuplot -c thesourcefile. If the terminal is like pdf, the include command is like \includegraphics{someoutput.pdf}, but with the tikz terminal, we do \input{theoutput.tex} instead. You can see that if theoutput.tex does some heavy tikz picture drawing, this will not speed up that part. If you also want to speed up that part, you need to recursively call the caching procedure on the content of the theoutput.tex file. This is what cache tikz/\cacheTikz does, by simply calling the tikz preset on all tikz pictures.

Note that the cairolatex terminal is a bit different: even if we \input a file, it is really quick, as the file just includes a pre-built pdf and annotates it without using tikz pictures at all, so there is no need to recursively call another caching procedure.

I hope this clarifies a bit the story.

dflvunoooooo commented 7 months ago

Thank you again for your very detailed explanation. To summarise this for my understanding: whether a second caching is done depends on the definition of the environment. For gnuplot with tikz, for example, there is a second layer: gnuplot -> tikz -> latex. And the second layer only gets cached if said option is set. Gnuplot will need a second caching most of the time, except for cairolatex, right?