cabo / kramdown-rfc

An XML2RFC (RFC799x) backend for Thomas Leitner's kramdown markdown parser
MIT License

Make gzipped markdown source reproducible #166

Closed by dkg 2 years ago

dkg commented 2 years ago

By avoiding setting the embedded timestamp in the gzip-encoded markdown source, we avoid a variation based on time of build.
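To illustrate the idea outside kramdown-rfc (which is written in Ruby), here is a minimal Python sketch of reproducible gzip encoding. It is not the code from this PR; the function name `embed_source` and the choice of `0` as the fixed timestamp are assumptions made for the example. It relies on the `mtime` keyword of `gzip.compress`, available since Python 3.8.

```python
import base64
import gzip


def embed_source(markdown: bytes, mtime: int = 0) -> str:
    """Gzip the markdown with a fixed header timestamp, then base64-encode it.

    `embed_source` is a hypothetical helper, not a kramdown-rfc API.
    With a constant mtime, the same input gives the same output on every run.
    """
    compressed = gzip.compress(markdown, mtime=mtime)
    return base64.b64encode(compressed).decode("ascii")


if __name__ == "__main__":
    source = b"---\ntitle: Example Draft\n---\n\n# Introduction\n"
    # Two builds of identical source produce byte-identical blobs.
    print(embed_source(source) == embed_source(source))
    # Changing only the timestamp changes the blob, although the source is the same.
    print(embed_source(source, mtime=1) == embed_source(source, mtime=2))
```

Leaving `mtime` unset would make `gzip.compress` record the current time, which is the build-time variation described above.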

cabo commented 2 years ago

I find it very useful to have that mtime. What are the specific circumstances where it gets in the way?

dkg commented 2 years ago

Aside from my general preference for reproducible software, i find it pretty important to be able to compare versions of different drafts.

When comparing two copies of XML, the embedded markdown is either different or it is not different. If the only difference is in the embedded mtime, it shows up as a difference in the b64-encoded gzipped source. To determine that it's only the difference in the embedded mtime, and not, say, a difference in the metadata block at the start of the markdown, i have to either know the exact placement of the mtime in a gzip header and be able to visually identify it in b64 (a weird and very unfriendly skill!) or i have to un-b64 and decompress the embedded markdown and diff it.
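The manual check described here can be sketched in Python. This is an illustration only, under the assumption that the embedded blob is a base64-encoded gzip stream; the helper names `gzip_mtime` and `same_source` are hypothetical. In the gzip format (RFC 1952), the mtime is a 4-byte little-endian integer at byte offset 4 of the header.

```python
import base64
import gzip
import struct


def gzip_mtime(b64_blob: str) -> int:
    """Return the mtime stored in the gzip header of a base64-encoded blob."""
    raw = base64.b64decode(b64_blob)
    if raw[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    # Header layout: magic (2 bytes), method (1), flags (1), mtime (4, little-endian).
    return struct.unpack("<I", raw[4:8])[0]


def same_source(blob_a: str, blob_b: str) -> bool:
    """Compare the decompressed payloads, ignoring gzip header differences."""
    text_a = gzip.decompress(base64.b64decode(blob_a))
    text_b = gzip.decompress(base64.b64decode(blob_b))
    return text_a == text_b


if __name__ == "__main__":
    markdown = b"# Example\n"
    # Simulate two builds of the same source at different times.
    a = base64.b64encode(gzip.compress(markdown, mtime=1000)).decode("ascii")
    b = base64.b64encode(gzip.compress(markdown, mtime=2000)).decode("ascii")
    print(a == b)                        # the encoded blobs differ
    print(gzip_mtime(a), gzip_mtime(b))  # only the header timestamps differ
    print(same_source(a, b))             # the embedded markdown is identical
```

With a fixed timestamp in the header, none of this decoding is needed: a plain text diff of the two XML files already answers the question.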

Let me try asking the question in a different direction: where and when do you use that mtime? what is the advantage of having it? if the only thing it encodes is "now" (meaning, when kramdown-rfc is run) then surely that information is better recorded (or not recorded, if not needed) explicitly in the xml source, not in this obscure and difficult-to-read location. If it's supposed to encode the filesystem timestamp of the original markdown file, why is that useful? if i'm working from a new git checkout, the timestamp will record when i did the clone, not when the file was modified. why would that information be something someone else needs to use?

cabo commented 2 years ago

I agree that deterministic builds are highly desirable. Having the time in the generated XML is occasionally useful to find out what happened, but maybe not that useful. So I'll cherry-pick this PR and make a fix.