CouleeApps / git-power

git is a blockchain. Start your commit hashes with 00000000 like a real blockchain should.
MIT License

Performance tuning suggestions #1

Open not-an-aardvark opened 3 years ago

not-an-aardvark commented 3 years ago

Since it seems like more people have joined the "customize git commit hashes because why not" bandwagon (welcome!), I figured I'd share some insights from doing a lot of performance tuning on my own implementation from a few years ago. Obviously, none of this should be taken to discourage new implementations or to ruin any fun.

In decreasing order of effectiveness, the main things that helped were:

(Also mentioning @mkrasnitski from https://github.com/mkrasnitski/git-power-rs)

CouleeApps commented 3 years ago

Thanks for the optimization tricks! Glad to see other people have had similar ideas; it seemed pretty obvious when I wrote it.

I'll have to look into caching SHA1 state and brute forcing from the end. Currently, the nonces are stored about halfway through the data (example here) so that would likely be a significant speedup.
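For anyone following along, here is a minimal Python sketch of the "cache the SHA-1 state and brute force from the end" idea. It hashes plain bytes rather than a real git commit object, and `find_suffix_nonce`, the 8-character hex nonce, and the `max_tries` limit are all assumptions made up for illustration, not code from either project.

```python
import hashlib


def find_suffix_nonce(prefix: bytes, target: str, max_tries: int = 1_000_000):
    """Look for a fixed-width trailing nonce such that
    sha1(prefix + nonce) starts with the hex string `target`."""
    # Hash the unchanging prefix once and keep the intermediate state.
    base = hashlib.sha1(prefix)
    for i in range(max_tries):
        # Fixed width keeps the total length constant across attempts.
        nonce = b"%08x" % i
        attempt = base.copy()   # reuse the cached state instead of rehashing prefix
        attempt.update(nonce)   # only the trailing bytes are hashed per attempt
        digest = attempt.hexdigest()
        if digest.startswith(target):
            return nonce, digest
    return None
```

With the nonce halfway through the data, everything after it has to be rehashed on every attempt, which is why moving it to the end should help.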

As for a GPU implementation, that would be significantly faster. I haven't had time to implement that yet and wasn't sure if it was worth the effort. Seeing that you've done it for basically the same problem is certainly good news though.

mkrasnitski commented 3 years ago

If the goal is to hide our changes to the commit, wouldn't this make it unviable to append to the end of a commit? The nonce will appear in the commit message this way, and even extra whitespace is technically detectable (if non-standard whitespace chars are used). This would also not be compatible with signed commits, where we are basically forced to put the nonce in the middle of the commit data.

EDIT: Thinking a bit further, it may still be worth it to cache the state of the SHA-1 buffer up to the location of the nonce, so basically just the commit header.

not-an-aardvark commented 3 years ago

> If the goal is to hide our changes to the commit, wouldn't this make it unviable to append to the end of a commit? The nonce will appear in the commit message this way, and even extra whitespace is technically detectable (if non-standard whitespace chars are used).

The way I implemented this was to append a combination of spaces and tabs to the end of the commit message. This is technically user-visible, but I haven't found it to be a noticeable problem in practice (e.g. GitHub automatically trims commit messages when displaying them in the web UI).
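As a rough illustration of a spaces-and-tabs nonce, the sketch below maps each bit of a counter to one whitespace byte. The function name `whitespace_nonce` and the bit order are assumptions for this example; the default width of 48 matches the nonce length mentioned later in this thread, but the real encoding may differ.

```python
def whitespace_nonce(value: int, width: int = 48) -> bytes:
    """Encode `value` as `width` bytes, one bit per byte:
    a space (0x20) for a 0 bit and a tab (0x09) for a 1 bit,
    least significant bit first."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the requested width")
    return bytes(0x09 if (value >> bit) & 1 else 0x20 for bit in range(width))
```

Each byte carries only one bit, so a 48-byte nonce gives 2^48 candidates.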

> This would also not be compatible with signed commits, where we are basically forced to put the nonce in the middle of the commit data.

You're right about signed commits. For that case, I put the whitespace at the end of the signature (and cache up to that point). This reduces performance compared to the no-signature case, but is still much faster than not caching at all, or putting the whitespace at the beginning of the signature.
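A sketch of the mid-commit variant, under the same caveats: the prefix state is cached once, and every attempt has to hash the nonce plus whatever follows it. `find_mid_nonce`, the 24-byte nonce width, and the byte strings in the usage line are hypothetical, and this works on raw bytes, not on a real signed commit.

```python
import hashlib


def find_mid_nonce(prefix: bytes, suffix: bytes, target: str,
                   width: int = 24, max_tries: int = 1_000_000):
    """Search for a spaces-and-tabs nonce placed between `prefix` and
    `suffix` so that sha1(prefix + nonce + suffix) starts with `target`."""
    base = hashlib.sha1(prefix)  # cached state for everything before the nonce
    for i in range(min(max_tries, 1 << width)):
        # One bit per byte: tab for 1, space for 0.
        nonce = bytes(0x09 if (i >> b) & 1 else 0x20 for b in range(width))
        attempt = base.copy()
        attempt.update(nonce)
        attempt.update(suffix)   # the tail is rehashed on every attempt
        digest = attempt.hexdigest()
        if digest.startswith(target):
            return nonce, digest
    return None
```

For example, `find_mid_nonce(b"header and signature", b"\n\ncommit message\n", "00")` rehashes only the nonce and the message on each attempt. The longer the suffix, the more work per attempt, which is the slowdown relative to the no-signature case.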

> Thinking a bit further, it may still be worth it to cache the state of the SHA-1 buffer up to the location of the nonce

Yes, this is a good idea. To minimize the number of 64-byte blocks that need to be rehashed on each attempt, I also add padding before the nonce so that the nonce starts at an offset that is a multiple of 64 bytes from the start of the commit header (the full padding format is described here). My nonces are 48 bytes long (they contain only spaces and tabs, so that length is needed to get enough entropy), so this alignment might matter less when using shorter nonces.
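To make the alignment arithmetic concrete, here is a small sketch that computes how many padding bytes would put the nonce on a 64-byte boundary. It assumes the nonce is the last thing in the commit and relies on git hashing a commit as `commit <size>\0` followed by the content, so the header length itself depends on the padding. The helper name `padding_before_nonce` is made up for this example, and the actual padding format linked above is not reproduced here.

```python
def padding_before_nonce(content_len: int, nonce_len: int = 48,
                         block: int = 64) -> int:
    """Return the number of padding bytes to insert after `content_len`
    bytes of commit content so that the nonce begins on a `block`-byte
    boundary of the hashed object ("commit <size>\\0" + content)."""
    # The header contains the total size in decimal, so its length can
    # change as padding is added; try each candidate rather than solving
    # in closed form.
    for pad in range(2 * block):
        total = content_len + pad + nonce_len
        header_len = len(b"commit %d\x00" % total)
        if (header_len + content_len + pad) % block == 0:
            return pad
    raise RuntimeError("no suitable padding found")
```

With the nonce aligned this way, the cached state covers only whole blocks and nothing before the nonce has to be rehashed per attempt.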