mgieseki / dvisvgm

A fast DVI, EPS, and PDF to SVG converter
https://dvisvgm.de
GNU General Public License v3.0

Deterministic SVGs with --font-format other than svg #120

Open shreevatsa opened 4 years ago

shreevatsa commented 4 years ago

When I run dvisvgm multiple times on the same DVI file, with --font-format set to anything other than svg, the resulting SVG files sometimes differ (not always, but roughly 10% of the time). This makes it inconvenient to keep the SVG files in version control, or to run regression tests on the steps leading up to their creation.

The command is something like:

dvisvgm --page=1- --font-format=$format foo.dvi

where $format is one of ttf, woff, or woff2. The diffs are all in the @font-face lines, i.e. the actual base-64 encoding of the font.

Is it known where this non-determinism/randomness comes from? Is there a way to avoid it (set a random seed or something)? It possibly has something to do with the time, as using faketime helps a bit.
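For reference, a rough way to measure how often the output changes is to convert the same file several times, keep a copy of each result, and count the distinct checksums. The sketch below assumes a POSIX shell with `cksum`; `count_distinct` is a hypothetical helper name, not part of dvisvgm.

```shell
#!/bin/sh
# count_distinct (hypothetical helper): print how many different contents
# occur among the given files, judged by POSIX cksum (CRC plus byte count).
count_distinct() {
    for f in "$@"; do
        # Reading from stdin keeps the file name out of the cksum output.
        cksum < "$f"
    done | sort -u | wc -l | tr -d ' '
}
```

After each run, copy the generated SVG to `run-1.svg`, `run-2.svg`, and so on, then call `count_distinct run-*.svg`. A result of 1 means every run produced identical output.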

mgieseki commented 4 years ago

I have to look more closely into the sources, but I guess you're right. It's probably the fontforge library that writes the current date/time into the TTF header, which also affects WOFF and WOFF2 as they are basically compressed TTF files.
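One way to test this guess for --font-format=ttf: the `head` table of a TrueType font stores `created` and `modified` timestamps as 64-bit counts of seconds since 1904-01-01, at byte offsets 20 and 28 of the table. The sketch below assumes the font has already been base64-decoded from the @font-face rule into a standalone file. `be_uint` and `ttf_modified_epoch` are hypothetical helper names, and only POSIX `od` is used. It does not apply directly to WOFF2, where the tables are compressed.

```shell
#!/bin/sh
# be_uint FILE OFFSET NBYTES: print a big-endian unsigned integer read from FILE.
be_uint() {
    hex=$(od -An -v -tx1 -j "$2" -N "$3" "$1" | tr -d '[:space:]')
    echo $((0x$hex))
}

# ttf_modified_epoch FILE: print head.modified as seconds since 1970-01-01 UTC.
ttf_modified_epoch() {
    # numTables is a 16-bit field at byte 4 of the font's offset table.
    num_tables=$(be_uint "$1" 4 2)
    i=0
    while [ "$i" -lt "$num_tables" ]; do
        # Table records start at byte 12 and are 16 bytes each:
        # tag (4), checksum (4), offset (4), length (4).
        rec=$((12 + 16 * i))
        # Tags are four ASCII bytes; "head" is 68 65 61 64 in hex.
        tag=$(od -An -v -tx1 -j "$rec" -N 4 "$1" | tr -d '[:space:]')
        if [ "$tag" = "68656164" ]; then
            off=$(be_uint "$1" $((rec + 8)) 4)
            modified=$(be_uint "$1" $((off + 28)) 8)
            # 2082844800 seconds separate the 1904 epoch from the Unix epoch.
            echo $((modified - 2082844800))
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}
```

If two runs give different values from `ttf_modified_epoch`, the timestamp is at least one source of the differences. The checksum fields that cover the `head` table would then differ as well.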

shreevatsa commented 4 years ago

Thanks, that is probably it. Meanwhile, after reading the libfaketime documentation further, I discovered a way to use it that seems to help:

  1. The first time, run with FAKETIME_SAVE_FILE set and an arbitrary starting timestamp, e.g.:

    FAKETIME_SAVE_FILE='./saved-fake-times' faketime '2008-12-24 08:15:42' dvisvgm --page=1- --font-format=woff2 foo.dvi

  2. On subsequent runs, use this same file, with FAKETIME_LOAD_FILE:

    FAKETIME_LOAD_FILE='./saved-fake-times' faketime '2008-12-24 08:15:42' dvisvgm --page=1- --font-format=woff2 foo.dvi

This seems to produce the same SVG files across runs, though I have yet to try it with a larger file. The same font still varies across the SVGs corresponding to different pages, but across runs it seems to be consistent. Anyway, this is probably not the ideal solution.
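The two steps could be folded into a small wrapper that saves timestamps on the first run and replays them afterwards. This is only a sketch: `faketime_var` is a hypothetical helper name, and it assumes that a non-empty save file means an earlier run completed.

```shell
#!/bin/sh
# faketime_var FILE (hypothetical helper): print the libfaketime variable to set.
# A missing or empty FILE is treated as the first run, so timestamps get saved;
# otherwise the previously saved timestamps are replayed.
faketime_var() {
    if [ -s "$1" ]; then
        echo FAKETIME_LOAD_FILE
    else
        echo FAKETIME_SAVE_FILE
    fi
}

# Usage, mirroring the commands above:
#   times=./saved-fake-times
#   env "$(faketime_var "$times")=$times" \
#       faketime '2008-12-24 08:15:42' dvisvgm --page=1- --font-format=woff2 foo.dvi
```

Deleting the save file forces a fresh recording on the next run.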

felixlen commented 4 days ago

I am also running into this problem and, due to system restrictions, am unable to use the faketime solution. If I understand correctly, dvisvgm no longer uses fontforge, so that should not be the source of the problem? I have also tried with a PDF source and mutool instead of DVI and gs, but I am facing the same issue. The diff is again in the base64-encoded embedded WOFF2 font. Any help is much appreciated.