Closed jbosboom closed 6 years ago
Since v1.4.2, I haven't encountered this. Have you? (I also looked for timestamps or similar in the PDF, but couldn't find any.)
I am still seeing this behavior with svgtiler 1.5.0 and Inkscape 0.92.2 5c3e80d, 2017-08-06.
I'm busy with the deadline right now but I'll try to reduce a test case for you after.
Or at least let me know an example on our repo where this occurs. (I've tried some xlsx's but not all.) I assume you're running on Linux?
I generated vertical_dominoes_Literal Unset.pdf twice and diffed them. They differ in only four bytes. I used qpdf's "qdf mode" to get a text representation of the PDFs and diffed those. They differ only in the timestamp shown below.
--- old-text.pdf 2018-02-22 20:53:12.399383687 -0500
+++ new-text.pdf 2018-02-22 20:53:57.618647915 -0500
@@ -13,7 +13,7 @@
%% Original object ID: 6 0
2 0 obj
<<
- /CreationDate (D:20180222204225-05'00)
+ /CreationDate (D:20180222204251-05'00)
/Producer (cairo 1.15.10 \(http://cairographics.org\))
>>
endobj
(They also differ in /ID, but this seems to be automatically generated by qpdf because it changes if I add --deterministic-id to the qpdf command line. The /ID is a 16-byte value, but there were only four bytes of difference in the PDF files.)
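For anyone who wants to repeat the byte-level comparison without qpdf, here is a minimal Python sketch. The function name differing_offsets is made up for illustration; it just reports the offsets at which two byte strings disagree, and counts a length mismatch as a difference.

```python
from itertools import zip_longest
from typing import List


def differing_offsets(a: bytes, b: bytes) -> List[int]:
    """Return the byte offsets at which a and b differ.

    If one input is longer, every offset past the end of the
    shorter one is reported as a difference.
    """
    return [i for i, (x, y) in enumerate(zip_longest(a, b)) if x != y]


# Example with in-memory data; for real files, read both PDFs in
# binary mode (open(path, "rb").read()) and pass the bytes in.
old = b"/CreationDate (D:20180101000000Z)"
new = b"/CreationDate (D:20180101000059Z)"
print(differing_offsets(old, new))
```

Printing the bytes around the reported offsets is usually enough to see whether the difference sits inside a metadata string such as /CreationDate.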
Yes, this is on Arch Linux. Maybe Inkscape uses a different backend (not cairo) on other platforms.
Due to the impending deadline I'm going to just commit the differing files anyway, but now you've something to go on.
I'm using Inkscape 0.91 r13725 on Ubuntu which uses Cairo 1.14.6.
You seem to be using a different (later?) version of Inkscape which uses Cairo 1.15.10. So I'm guessing that newer Inkscapes inject the creation date like this. Now that I have an example, I should be able to blank out any such commands.
Of course, this won't help when you and I are recompiling with different Inkscape versions, so we generate different metadata (e.g. different /Producer values). But it's better than nothing...
There are no /IDs in the files, so far as I can tell. You should just look at the PDF in a text editor or less (go near the bottom), not qpdf.
I tried installing Inkscape 0.92.2 via https://launchpad.net/~inkscape.dev/+archive/ubuntu/stable and it made no difference (but still used Cairo 1.14.6). So probably it's the difference in Cairo versions...
Anyway, I should have fixed this in 1.5.1 by blanking out /CreationDate if detected in the PDF. Can you test?
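As a rough illustration of what "blanking out" could look like, here is a Python sketch. This is not svgtiler's actual code; blank_creation_date is a hypothetical helper, and it assumes the /CreationDate entry appears uncompressed in the file, as in the diff above. It overwrites the date with spaces of the same length so the byte offsets recorded elsewhere in the PDF stay valid.

```python
import re

# Matches "/CreationDate (....)" and captures the text inside the parentheses.
# Assumption: the date string contains no ")" and is stored uncompressed.
_CREATION_DATE = re.compile(rb"(/CreationDate\s*\()([^)]*)(\))")


def blank_creation_date(pdf_bytes: bytes) -> bytes:
    """Replace the contents of every /CreationDate string with spaces."""

    def _blank(match):
        # Keep the key and parentheses; pad with spaces to preserve length.
        return match.group(1) + b" " * len(match.group(2)) + match.group(3)

    return _CREATION_DATE.sub(_blank, pdf_bytes)


first = b"<<\n/CreationDate (D:20180222204225-05'00)\n/Producer (cairo)\n>>"
second = b"<<\n/CreationDate (D:20180222204251-05'00)\n/Producer (cairo)\n>>"
print(blank_creation_date(first) == blank_creation_date(second))
```

Keeping the length unchanged is a design choice: removing the entry outright would shift every later byte and could invalidate the cross-reference table.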
With this change, I do get the same hashes after rebuilding PDFs I've built.
I suspect someone's already dealt with the problem of putting PDFs in a canonical form for digital signature purposes, but as you note, this is better than nothing.
https://github.com/matplotlib/matplotlib/pull/6597 seems to be one example of dealing with this, by injecting a CreationDate of SOURCE_DATE_EPOCH if set instead of the current date. I'm guessing Inkscape doesn't support this feature, though.
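For reference, the SOURCE_DATE_EPOCH idea can be sketched in a few lines of Python. This is only an illustration of the convention, not the code from the matplotlib pull request; pdf_creation_date is a made-up name, and the output uses the UTC form of a PDF date string.

```python
import os
from datetime import datetime, timezone


def pdf_creation_date(environ=os.environ) -> str:
    """Build a PDF date string, honoring SOURCE_DATE_EPOCH when it is set.

    With the variable set, repeated builds get the same timestamp;
    otherwise the current time is used, as a PDF writer normally would.
    """
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    else:
        when = datetime.now(timezone.utc)
    return when.strftime("D:%Y%m%d%H%M%SZ")


# Passing a plain dict stands in for the process environment here.
print(pdf_creation_date({"SOURCE_DATE_EPOCH": "0"}))
```

A build script would export SOURCE_DATE_EPOCH (for example, to the timestamp of the last commit) before generating the PDFs, but that only helps if the tool writing the PDF reads the variable.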
Repeatedly invoking svgtiler -p results in PDFs with different hashes. Given that we like to commit these build products in version control for the benefit of those without svgtiler installed, this results in committing files that didn't actually (visibly) change. It also makes it hard for humans to tell which sheet(s) of a workbook changed (so the changes can be reviewed).
This is probably Inkscape's fault for storing a creation timestamp or similar in the compiled PDF, but it would be great to find a workaround.