Open martinetd opened 1 year ago
Hmm, perhaps we can add an extra pass that creates just the CUs, sorted by name, and then make the BTF encoding follow CU name order somehow. That would probably cause some performance penalty, since a BTF encoder thread would sometimes have to wait for the next (sorted by name) CU to have its DWARF processed, so it would require a command-line option to enable it, maybe --reproducible-output.
I agree sorting is probably the most straightforward solution. It seems a bit of a shame to sort before the parallel BTF processing, as that'll require threads to wait for each other as you pointed out. Would sorting the final output be more difficult? It also doesn't have to be a costly sort like CU name; it could be pure input order, e.g. something like adding a counter:
dwarf_cus__nextcu: remember the input CU number and increment it there under lock.
That should be possible without too much of a slowdown, just holding the memory associated with the output in a temporary list until it's ready to be copied off.
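To make the counter idea a bit more concrete, here is a minimal sketch of an "ordered commit" buffer. It is not pahole code: struct ordered_out, its fields, the ordered_out__* helpers, MAX_CUS and the int used as a stand-in for a CU's encoded output are all hypothetical names invented for illustration. Workers take an input index under the lock, and finished results are held back until every earlier CU has been committed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_CUS 64 /* fixed bound, only to keep the sketch short */

/* Hypothetical shared state; not pahole's real data structures. */
struct ordered_out {
	pthread_mutex_t lock;
	size_t nr_cus;          /* total number of input CUs */
	size_t next_input;      /* next CU index to hand to a worker */
	size_t next_commit;     /* next CU index allowed into the output */
	bool done[MAX_CUS];     /* CU finished but possibly not committed */
	int pending[MAX_CUS];   /* placeholder for a CU's encoded output */
	int committed[MAX_CUS]; /* final output, always in input order */
	size_t nr_committed;
};

void ordered_out__init(struct ordered_out *o, size_t nr_cus)
{
	assert(nr_cus <= MAX_CUS);
	memset(o, 0, sizeof(*o));
	pthread_mutex_init(&o->lock, NULL);
	o->nr_cus = nr_cus;
}

/* Hand out the next input index under the lock; false when exhausted. */
bool ordered_out__next_cu(struct ordered_out *o, size_t *idx)
{
	bool ok;

	pthread_mutex_lock(&o->lock);
	ok = o->next_input < o->nr_cus;
	if (ok)
		*idx = o->next_input++;
	pthread_mutex_unlock(&o->lock);
	return ok;
}

/* Record a finished CU, then flush every result that is now in sequence. */
void ordered_out__finish_cu(struct ordered_out *o, size_t idx, int result)
{
	pthread_mutex_lock(&o->lock);
	assert(idx < o->nr_cus && !o->done[idx]);
	o->pending[idx] = result;
	o->done[idx] = true;
	while (o->next_commit < o->nr_cus && o->done[o->next_commit]) {
		o->committed[o->nr_committed++] = o->pending[o->next_commit];
		o->next_commit++;
	}
	pthread_mutex_unlock(&o->lock);
}
```

Each worker thread would loop on ordered_out__next_cu(), do its processing without holding the lock, and call ordered_out__finish_cu() when done. No thread waits on another; the cost is the memory held for results that finished early.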
Regarding an extra command-line switch (if the slowdown requires it), it's not trivial to add options to all users of pahole (e.g. the Linux build), so basing the decision on SOURCE_DATE_EPOCH or another environment variable might be easier to use, but I guess we can figure that out later.
Hi,
Coming from https://github.com/NixOS/nixpkgs/pull/231768
The multithreaded DWARF -> BTF conversion just spawns threads, each of which consumes the next DWARF CU whenever it is ready and writes its output whenever it is done. This leads to non-reproducible output, because the processing time of each CU isn't guaranteed.
I don't see an obvious solution with the current code (there's some reordering for Rust; would that work without too big of a slowdown?), but I figured I'd bring it up here first for ideas. The workaround that'll likely be used for NixOS is disabling threads if SOURCE_DATE_EPOCH is set (as that most likely means a reproducible build was intended), but we'll be happy to try something else.
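For illustration, the workaround of disabling threads when SOURCE_DATE_EPOCH is set could be sketched as below. This is only an assumption about how such a check might look, not the actual NixOS patch or pahole code; reproducible_build_requested() and effective_nr_jobs() are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true when SOURCE_DATE_EPOCH is set and non-empty. */
bool reproducible_build_requested(void)
{
	const char *epoch = getenv("SOURCE_DATE_EPOCH");

	return epoch != NULL && epoch[0] != '\0';
}

/*
 * Hypothetical policy: fall back to a single thread when a reproducible
 * build appears to be intended, otherwise honour the requested job count.
 */
int effective_nr_jobs(int requested_jobs)
{
	assert(requested_jobs >= 1);
	if (reproducible_build_requested())
		return 1;
	return requested_jobs;
}
```

With a single thread, CUs are processed and written strictly in input order, so the output no longer depends on thread timing; the trade-off is losing the parallel speedup.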