mrpiggi / svg

Handling SVG pictures in LaTeX documents using Inkscape, ImageMagick and/or Ghostscript

Including multiple svg files with equal file names but stored in different folders #11

Open baloe opened 4 years ago

baloe commented 4 years ago

The following snippet

\includesvg[]{folder1/graphics.svg}
\includesvg[]{folder2/graphics.svg}

produces one single file pair

svg-inkscape/graphics_svg-tex.pdf(_tex)

which is then embedded twice. Is there a way of preventing this behavior?

baloe commented 4 years ago

Ah, I guess I could use the option inkscapepath=svgsubdir.
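For reference, a minimal sketch of that option applied to the two includes from the first comment. It assumes that inkscapepath=svgsubdir writes the converted files into a subfolder next to each source SVG instead of one shared folder; the exact output locations should be checked against the package documentation.

```latex
\documentclass{article}
\usepackage{svg}
\begin{document}
% With svgsubdir, each conversion result is assumed to be written
% next to its own source file, so the two graphics.svg files
% no longer share a single output file pair.
\includesvg[inkscapepath=svgsubdir]{folder1/graphics.svg}
\includesvg[inkscapepath=svgsubdir]{folder2/graphics.svg}
\end{document}
```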

baloe commented 4 years ago

Although... having the generated pdf(_tex) files in a single directory is quite elegant, since this directory can later be deleted to save storage. Wouldn't it make sense to compute a hash string from the path to the svg file and then use that hash as a unique file name in the default subfolder svg-inkscape?

mrpiggi commented 4 years ago

That's a very charming idea. I will try to implement that in the next version.

michael-markl commented 2 years ago

In my workflow, I often replace svg files (that are auto generated with my own tools). ~Currently, I always have to manually delete the generated files in the "svg-inkscape" folder to get a new variant of the svg file compiled.~ Hence, I think, rather than hashing the file path, it would probably make more sense to compute a hash based on the svg content? (Then in an unlikely event of having duplicate svgs at different locations, it would even be enough to generate the PDFs once.)

Edit: I just noticed that it is unnecessary to delete the svg-inkscape folder because of updated svgs: the detection is already done via the modification date 😍. (However, I'll leave my comment here; maybe it is still a good idea to hash based on the contents rather than the file path?)

baloe commented 2 years ago

> Hence, I think, rather than hashing the file path, it would probably make more sense to compute a hash based on the svg content? (Then in an unlikely event of having duplicate svgs at different locations, it would even be enough to generate the PDFs once.)

The thing is that one can have identical svg files in different locations each referencing external image files via relative paths. In this case, the svg files can produce different-looking images even though their hashes are identical.

So, one would need to preprocess the svg files by replacing relative paths with absolute ones before computing the hashes, or compute hashes of these linked files as well.

I would say it's not worth the extra trouble.

michael-markl commented 2 years ago

You're right, I was not thinking about referencing external files! Maybe it is still easier to use hash(fileName + fileContents) instead of monitoring the modifiedDate when it comes to updating svg files. However, I guess that's up to the implementer / maintainer :)

schtandard commented 1 year ago

> Maybe it is still easier to use hash(fileName + fileContents) instead of monitoring the modifiedDate when it comes to updating svg files. However, I guess that's up to the implementer / maintainer :)

This would mean that old versions in svg-inkscape are never overwritten, so the svg-inkscape directory would get bloated over time if you edit your images a lot. One could of course devise some garbage collection routine, but this comes with its own disadvantages (e.g. it would discard data when only partially compiling the document) and is an unnecessary complication, I think. I see no disadvantage in just hashing the file path and using the date to detect changes, as is done now.


One workflow that has not been mentioned here before is editing the pdf_tex file after it has been created. While not very common, this can sometimes be useful and is explicitly supported by the package: if the pdf_tex file has been edited, i.e. it is newer than the pdf file, those files are never overwritten, even if the original svg file is newer still. (Instead, the user is warned about this.) This possibility should be preserved, which is another reason against hashing the file contents.

However, just using the file path hash as the file name in svg-inkscape would also complicate this workflow a lot, since one would first have to determine the correct hash to find the right pdf_tex file to edit. That's why I would propose preserving the original filename in svg-inkscape and only appending the path hash. (So for example \includesvg{img/test} would lead to test_⟨hash of the absolute path of "img/test"⟩.pdf being created.)
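A rough sketch of that naming scheme, for illustration only. The helper \computesvghashedname and the macro \svghashedname are hypothetical names and not part of the svg package; \pdf@mdfivesum is taken from the pdftexcmds package and \filename@parse from the LaTeX kernel. Note that this hashes the path as written in the document, not the absolute path, since the latter is not readily available inside TeX.

```latex
\documentclass{article}
\usepackage{pdftexcmds} % provides the expandable \pdf@mdfivesum

\makeatletter
% Hypothetical helper (not part of the svg package):
% stores "<base name>_<MD5 of the path as written>" in \svghashedname.
\newcommand*{\computesvghashedname}[1]{%
  \filename@parse{#1}% sets \filename@area, \filename@base, \filename@ext
  \edef\svghashedname{\filename@base\string_\pdf@mdfivesum{#1}}%
}
\makeatother

\begin{document}
% Same base name, different folders: the two printed names differ.
\computesvghashedname{folder1/graphics}
\texttt{\svghashedname}\par
\computesvghashedname{folder2/graphics}
\texttt{\svghashedname}
\end{document}
```

The base name stays readable at the front, so the matching pdf_tex file can still be found by eye, while the hash suffix keeps same-named files from different folders apart.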

mrpiggi commented 1 year ago

That's exactly how I would implement it, appending the path hash to the file name. As I said before, however, I unfortunately lack the time to work on it right now.

HeinrichAD commented 1 year ago

As a temporary workaround (not that flexible), I am using the passed image file path.

\makeatletter
\newcommand{\includesvggraphics}[2][\textwidth]{%
  \filename@parse{#2}%
  \includesvg[inkscapepath=svg-inkscape/\filename@area,width=#1]{#2}%
}
\makeatother

Usage:

\includesvggraphics{experiments/datasets/mnist.svg}
% or
\includesvggraphics[.5\textwidth]{experiments/datasets/mnist.svg}

Output:

svg-inkscape
└── experiments
    └── datasets
        └── mnist_svg-raw.pdf

Maybe it will help someone until the hash solution is implemented.

mrpiggi commented 1 year ago

A useful workaround, though I won't implement this variant, since it wouldn't work with absolute or relative paths pointing to folders outside the project directory.
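One possible variant of the workaround above that might sidestep this limitation: instead of mirroring the source path, use the MD5 sum of the path string as the subfolder name, so that paths pointing outside the project would still map to a folder inside svg-inkscape. This is an untested sketch. \includesvghashed and \mysvg@hashdir are hypothetical names, \pdf@mdfivesum comes from the pdftexcmds package, and it assumes that inkscapepath accepts a macro holding the path, as it does in the workaround above.

```latex
% Preamble fragment
\usepackage{svg}
\usepackage{pdftexcmds} % provides the expandable \pdf@mdfivesum

\makeatletter
% Hypothetical command: one output subfolder per distinct path string.
\newcommand{\includesvghashed}[2][\textwidth]{%
  % Expand the MD5 sum of the path (as written) into a plain string.
  \edef\mysvg@hashdir{svg-inkscape/\pdf@mdfivesum{#2}}%
  \includesvg[inkscapepath=\mysvg@hashdir,width=#1]{#2}%
}
\makeatother
```

Since the hash is taken over the path exactly as typed, two different spellings of the same location (say, with and without a leading ./) would end up in different subfolders.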

MartinX3 commented 3 weeks ago

How about just appending a number to the end of the file name, assigned per distinct path, since the paths are unique for different images?