mrpiggi / svg

Handling SVG pictures in LaTeX documents using Inkscape, ImageMagick and/or Ghostscript

Call inkscape --shell with batches of files for faster processing #52

Open MayeulC opened 1 year ago

MayeulC commented 1 year ago

I am not sure how doable it is, but I would prefer the package to assemble a list of all svg files and call inkscape only once in batchmode, to convert all the files in one invocation. This is typically much faster than one invocation per file.

The batch interface changed a bit recently. This is what I do to convert pdf to svg with the current (1.2.2) version:

(for file in *.pdf; do echo "file-open:$file; export-filename:$(echo "$file" | sed 's/\.pdf$/.svg/'); export-do;";done ) | inkscape --shell

(yes, I could have used bash parameter substitution, but I think calling sed is more portable?).

export-latex can be used as well.

This could roughly result in the following sequence:

$ inkscape --shell
file-open:file1.svg; export-filename:file1.pdf; export-latex; export-do;
file-open:file2.svg; export-filename:file2.pdf; export-latex:false; export-do;
file-open:file3.svg; export-filename:file3.pdf; export-latex:true; export-do;
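As a rough sketch of how such an action stream could be generated, here is a small shell function. The name `build_actions` is made up for illustration and is not part of the svg package; it assumes file names contain no `;` or newline, and it only uses the `file-open`, `export-filename` and `export-do` actions shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the svg package): print one line of
# Inkscape actions per SVG file passed as an argument.
# Assumption: file names contain no ';' or newline, because ';' separates actions.
build_actions() {
    for svg in "$@"; do
        pdf="${svg%.svg}.pdf"   # strip the trailing .svg, append .pdf
        printf 'file-open:%s; export-filename:%s; export-do;\n' "$svg" "$pdf"
    done
}

# Usage (needs Inkscape 1.x on PATH):
#   build_actions *.svg | inkscape --shell
```

The function only produces text, so the stream can be inspected before it is piped into a single `inkscape --shell` process.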

For reference, I used to do the following pdf->svg conversion in a Makefile with an older version of inkscape:

# Yes, the pattern makes no sense, it's just to tell make that these are built all at once
# https://stackoverflow.com/questions/2973445/gnu-makefile-rule-generating-a-few-targets-from-a-single-source-file
$(PDF_SVG_FILES:%-svg-converted-to.pdf=%-svg-converted-to.pd%): $(SVG_FILES)
        # converting svg files with inkscape...
        (for file in $?; do echo $$file --export-pdf=$$(echo $$file | sed s/.svg/-svg-converted-to.pdf/);done ) | inkscape --shell

But I'm not sure it's worth it to support older versions with a new feature.

If it's not possible to do so (if files need to be passed to the pdf engine right away), I suggest adding an option to allow skipping graph generation on the first pass. Ideally, placeholders of the right size should be created, but I am not sure this is possible either.

I don't mind compiling once more, personally (currently, I run a first --draft-mode compilation anyway, plus ~3 others), especially if it leads to much faster compile times overall.

mrpiggi commented 1 year ago

Well, maybe something to consider. However, I am not sure it is worth the effort, as this would only speed up the first compilation. The Inkscape export is only invoked if an SVG file is newer than the resulting PDF export or if the latter is missing altogether. Once the PDF files have been generated, they are exported again only if the corresponding SVG file has been updated in the meantime.
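The rule described here (export only when the PDF is missing or older than the SVG) can be sketched in shell as follows. This is not the package's actual implementation, which runs inside LaTeX; `needs_export` is a hypothetical name, and the `-nt` test is assumed to be available (it is in bash, zsh and dash).

```shell
#!/bin/bash
# Hypothetical sketch of the staleness rule, not the package's own code.
# Succeeds (exit status 0) when an export is needed:
# the PDF does not exist, or the SVG was modified more recently than the PDF.
needs_export() {
    local svg="$1" pdf="$2"
    [ ! -e "$pdf" ] || [ "$svg" -nt "$pdf" ]
}

# Usage: list only the SVG files whose PDF is missing or out of date.
#   for svg in *.svg; do
#       needs_export "$svg" "${svg%.svg}.pdf" && echo "$svg"
#   done
```

With such a check, a batched call would only ever receive the stale files, so runs after the first would stay as cheap as they are today.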

Andonome commented 1 year ago

On one of my larger projects, a standard compile takes me 54 seconds on the first run, and 37 on the next, so Inkscape slows things down by ~17/54 seconds. On the raspberry pi, the first compile is 6 minutes, and subsequent compiles are 5 minutes, 9 seconds.

So Inkscape's extra time is between roughly 1/3 and 1/7 of the total time (17 of 54 seconds on the first machine, 51 seconds of 6 minutes on the Raspberry Pi). It'd be nice to cut that in half, but an improvement of less than half doesn't seem like it would be noticeable.

mrpiggi commented 1 year ago

I will have a look at this for the next version once I find some time. Unfortunately, this is more of a long-term issue.

MayeulC commented 1 year ago

> On one of my larger projects, a standard compile takes me 54 seconds on the first run, and 37 on the next, so Inkscape slows things down by ~17/54 seconds.

I have to benchmark/profile this on my own. On my PhD thesis, a compile-from-scratch (3 passes, lualatex) takes about 26 minutes in GitLab CI. A lot of that is probably due to tikzexternalize taking longer, but every bit helps.

A few years back (~2018), when I was using a Makefile-based approach, converting a dozen or so figures individually with Inkscape took substantially longer than batch-converting everything: batching saved maybe 1 minute out of 3.

I am motivated to speed up initial compile time as it sometimes times out in Overleaf, and it appears that subsequent compilation runs restart from scratch if there was a compile error (or so it seems).

Here's a small benchmark with all the svg files of my manuscript (on an i5-4260U CPU @ 1.40GHz, Arch Linux):

% ls *.svg |wc -l
119
% time (for file in *.svg; do inkscape $file -o $file.pdf; done)
( for file in *.svg; do; inkscape $file -o $file.pdf; done; )  62,47s user 8,33s system 98% cpu 1:11,85 total
% time ((for file in *.svg; do echo "file-open:$file; export-filename:$file.pdf; export-do;";done ) | inkscape --shell)
( ( for file in *.svg; do; echo ; done; ) | inkscape --shell; )  11,87s user 0,63s system 91% cpu 13,670 total

It's only about one minute less during the first compilation, but that may be worth it, especially for more complex cases. In this case, it's much more than a 50% improvement: batching divided the time by ~5 (about one fifth of the time). So quite noticeable.
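For anyone who wants to redo the arithmetic on their own timings, here is a small sketch. The function name `speedup` is hypothetical; the two inputs in the usage line are the wall-clock totals from the benchmark above (1:11,85 = 71.85 s and 13,670 s).

```shell
#!/bin/sh
# Hypothetical helper: compare two wall-clock times given in seconds.
# $1 = time with one Inkscape invocation per file, $2 = time with one batched invocation.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "saved %.1f s, factor %.1f, reduction %.0f%%\n",
               before - after, before / after, 100 * (1 - after / before)
    }'
}

# Usage with the totals measured above:
#   speedup 71.85 13.67
#   prints: saved 58.2 s, factor 5.3, reduction 81%
```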