dionhaefner / pgfcache

LaTeX package for caching of PGF figures created with Matplotlib, just like tikz-externalize
MIT License

Slow line-by-line file copying #1

Closed Callidior closed 4 years ago

Callidior commented 4 years ago

Thank you a lot for this very useful package!

I don't know whether this affects only my system, but the line-by-line reading and writing of the PGF files takes so long here that it is much faster not to externalize my PGF figures at all. The copy cannot be skipped either, because it is also needed to create the tmp.tex file before the hashes are compared.

Since we already rely on shell escape anyway, I solved this problem by replacing the following part of the code:

\immediate\openin\@pgfin=#2/#3%
\begingroup\endlinechar=-1%
    \loop\unless\ifeof\@pgfin%
        \readline\@pgfin to \@fileline%
        \ifx\@fileline\@empty\else%
            \immediate\write\@pgfout{\@fileline}%
        \fi%
    \repeat%
\endgroup%
\immediate\closein\@pgfin%
\immediate\write\@pgfout{\string\end{document}}%
\immediate\closeout\@pgfout%

With this:

\immediate\closeout\@pgfout%
\immediate\write18{cat #2/#3 >> ##1}%
\immediate\write18{echo \string\\end{document} >> ##1}%

Using the shell to copy the file is much faster. Unfortunately, this solution only works on UNIX-like systems.

However, I am just leaving this approach here in case anyone else comes across the same problem and finds it useful.
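The UNIX-only limitation mentioned above could perhaps be worked around by choosing the shell command per platform. The following is a minimal sketch, not part of pgfcache: it assumes the `ifplatform` package (which provides `\ifwindows`), `\shellappend` is a hypothetical helper name, and `figure.pgf` and `tmp.tex` are placeholder file names. The Windows branch relies on the `type` command of `cmd.exe` and has not been tested here.

```latex
\documentclass{article}
\usepackage{ifplatform} % provides \ifwindows; compile with -shell-escape

\makeatletter
% Hypothetical helper: append the contents of file #1 to file #2
% using the platform's shell instead of a TeX read/write loop.
\newcommand{\shellappend}[2]{%
  \ifwindows
    % cmd.exe: `type` prints a file, `>>` appends to the target
    \immediate\write18{type "#1" >> "#2"}%
  \else
    % UNIX-like systems: same idea with `cat`
    \immediate\write18{cat "#1" >> "#2"}%
  \fi
}
\makeatother

\begin{document}
% Placeholder file names, for illustration only.
\shellappend{figure.pgf}{tmp.tex}
Done.
\end{document}
```

As in the replacement code above, the target file must be closed on the TeX side before the shell appends to it.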

dionhaefner commented 4 years ago

Interesting, thanks for the input. I haven't encountered this, so I'm trying to figure out what is different on your system. Which TeX distribution are you using? What kind of disk are the files on? And how big are your PGF files?

Callidior commented 4 years ago

I am using TeX Live from the official repositories of openSUSE 42.2. Writing the contents of a 900 KB PGF file to the temp file takes about 5 minutes (just for this single file).

Callidior commented 4 years ago

The aforementioned PGF file can be found in this ZIP archive: deep-metric-learning.zip

dionhaefner commented 4 years ago

Are the files on a network share or something like that?

Output for me:

$ time pdflatex -synctex=1 -shell-escape -interaction=nonstopmode test.tex
...
1.30 real         1.23 user         0.05 sys

(first compile)

$ time pdflatex -synctex=1 -shell-escape -interaction=nonstopmode test.tex
...
0.52 real         0.49 user         0.02 sys

(subsequent compile)

With this tex file:

\documentclass{article}
\usepackage{pgfcache}

\begin{document}
    \importpgf{.}{deep-metric-learning.pgf}
\end{document}

Callidior commented 4 years ago

The issue is weird. With your minimal example, it runs as fast as expected. In my much larger actual document, though, I still see the problem. I also noticed that while I am waiting for the temp file to be written, the main LaTeX process uses between 95% and 100% CPU, so apparently it is not just waiting for I/O.

It seems we can file this as a very special and rare issue...

dionhaefner commented 4 years ago

Strange indeed.

Let me know in case you manage to reproduce this with a minimal example, then we can do some more debugging.
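
One way to narrow this down might be to time the copy loop in isolation, both in the minimal example and with the preamble of the larger document. The sketch below is an assumption-laden illustration, not code from pgfcache: `\timedcopy`, `\timing@in`, `\timing@out`, `\timing@line` and the output file name `timing-copy.tmp` are hypothetical, and `\pdfresettimer` and `\pdfelapsedtime` are pdfTeX primitives, so it only applies when compiling with pdfLaTeX.

```latex
\documentclass{article}
\makeatletter
% Hypothetical stream and macro names, chosen so they do not clash with pgfcache.
\newread\timing@in
\newwrite\timing@out
% \timedcopy{<source file>}{<target file>}
% Copies the source line by line (same loop as in the issue) and logs the time.
\newcommand{\timedcopy}[2]{%
  \pdfresettimer % pdfTeX primitive: restart the timer
  \immediate\openout\timing@out=#2\relax
  \openin\timing@in=#1\relax
  \begingroup\endlinechar=-1
    \loop\unless\ifeof\timing@in
      \readline\timing@in to \timing@line
      \ifx\timing@line\@empty\else
        \immediate\write\timing@out{\timing@line}%
      \fi
    \repeat
  \endgroup
  \closein\timing@in
  \immediate\closeout\timing@out
  % \pdfelapsedtime counts in units of 1/65536 s; convert to milliseconds.
  \typeout{Copying #1 took
    \the\numexpr\pdfelapsedtime*1000/65536\relax\space ms}%
}
\makeatother

\begin{document}
\timedcopy{deep-metric-learning.pgf}{timing-copy.tmp}
Done.
\end{document}
```

If the reported time only grows once the packages of the larger document are loaded, that would point to something in that preamble rather than to the disk or the TeX distribution.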