pseudomuto / protoc-gen-doc

Documentation generator plugin for Google Protocol Buffers
MIT License

Deterministic sorting of directories and files #507

Closed S1artie closed 1 year ago

S1artie commented 1 year ago

We use protoc-gen-doc on multiple platforms (Windows, Linux and macOS) to generate Markdown documentation that is then checked into version control. Deterministic generation is therefore essential: the same proto source files must produce the same doc file on every platform.

The current implementation does not sort proto files deterministically. It seems to simply use the order in which the OS provides the files, which appears to differ between Windows and macOS/Linux.

This PR implements an explicit ordering of the files to process before the documentation is rendered. It splits each file's absolute path into path elements and then orders each "level" individually. The result is a "directory tree" style order, which seems, at least to me, to be the natural order for the docs:

  1. /top1/a.proto
  2. /top1/sub1/a.proto
  3. /top1/sub1/b.proto
  4. /top1/sub1/bot1/d.proto
  5. /top1/sub1/bot2/c.proto
  6. /top1/sub2/x.proto
  7. /top2/z.proto
  8. /top2/sub1/y.proto

This patch was tested successfully and produces the exact same doc files on Windows and Linux (they even have the same MD5 hash).

S1artie commented 1 year ago

ping @pseudomuto as requested

S1artie commented 1 year ago

Okay, forget about this PR. It was born of a misunderstanding on my side. The file generation order is already deterministic: it is the order in which the files are listed explicitly on the protoc command line. Whatever builds that command line (in my case, a Maven plugin) is what I had to look at to get a deterministic order across multiple OSes.

It's thus counterproductive to implement any sorting code in protoc-gen-doc.