CodeSandwich opened this issue 7 months ago
The idea for the library was to provide an alternative to the hashing option, which depends on the hash implementation, versus having the computation done on each of the bytes directly. I don't think I'd ever mask a simple hash comparison behind a lib, either.
In addition, have you tested and made sure it is more gas-efficient?
> The idea for the library was to provide an alternative to the hashing option, which depends on the hash implementation, versus having the computation done on each of the bytes directly. I don't think I'd ever mask a simple hash comparison behind a lib, either.
That's interesting, what for and why? Is hashing insecure?
Regarding gas usage, hashing costs 30 gas plus 6 per 32-byte word; there's really nothing to benchmark, hashing is probably the cheapest thing you can do with an array of bytes in memory.
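For concreteness (my arithmetic, using that price): a 256-byte array is 8 words, so hashing it costs 30 + 6 × 8 = 78 gas, and a hash-based comparison of two such arrays is on the order of 160 gas in opcode costs.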
> That's interesting, what for and why? Is hashing insecure?
Not really, but we're just giving optionality, I guess. I'm not afraid the keccak opcode is badly implemented in Geth.
But that's just the gas usage of the opcode. What about updating the memory pointer, filling the scratch space, etc.? My code does everything in place; we don't even touch the 0x0 - 0x40 scratch space the way the Solidity compiler does when it hashes things together. And what if you want to hash something bigger than 32 bytes? You need to copy multiple times. I'd say the bigger the array, the more efficient my function is compared to hashing stuff together.
This being said, I never benchmarked it, so you may be right. I'd love to see some data to believe the statement, though.
Keccak works in place, on an arbitrary memory range; see https://www.evm.codes. The scratch space is mostly used for calculating storage slot indexes, because that involves repeated hashing of pairs of 32-byte words taken from the stack.
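To illustrate (a minimal sketch of my own, not code from this thread): the KECCAK256 opcode takes an arbitrary offset and size, so a `bytes memory` value can be hashed exactly where it already sits, with no copying into scratch space; the high-level `keccak256(data)` compiles down to essentially this for a single `bytes memory` argument.

```solidity
// Sketch: hashing a memory array in place with the KECCAK256 opcode.
function hashInPlace(bytes memory data) internal pure returns (bytes32 digest) {
    assembly {
        // `data` points at the 32-byte length slot; the payload follows it.
        // KECCAK256 takes an arbitrary (offset, size) pair, so the payload
        // is hashed exactly where it already lives, nothing is copied.
        digest := keccak256(add(data, 0x20), mload(data))
    }
}
```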
It's your project, so I can't tell you what to do; you'll do whatever you choose. I don't even use your code as a dependency, so I won't see the benefits of any optimization. All I'm saying is that you should check out hashing, because it's a fairly popular strategy for reducing gas usage. Plus, you'll drop a lot of complex Yul code.
I just benchmarked it, and you're totally right. Leaving the data here for posterity.
This is for a 4-byte-long array comparison with Keccak:
This is for the same comparison with my function:
Trying longer lengths shows that my function actually scales just a smidge worse than the Keccak comparison.
With this being said, if the lengths do not match, my function is O(1) and performs better than the Keccak counterpart in any case.
To be honest, I did this such a long time ago that I don't even remember what I was thinking regarding the memory manipulation of the Keccak opcode. It probably had something to do with the storage slot calculations, like you mentioned. 🤷‍♂️
Although this is all true, I still don't think I should change this function. First, because if using the Keccak version is what people want to do, using it through a library makes no sense, since it is a one-liner. Second, because I made this lib so long ago (and it's still consumed by some major projects) that I'm sincerely afraid to change any implementations at this point (e.g., that's why I merged the PR for the non-aligned comparison as a new function instead of adding it to the previous one). Third, because I like how this still serves as a well-commented teaching example of the inner workings of the EVM. Fourth, because it's another option. 😄
But I sincerely thank you for bringing this up! 🙏 At least now we have some comparisons and options. Also happy to keep this issue open so people notice it when arriving at the repo. 👍
> With this being said, if the lengths do not match, my function is O(1) and performs better than the Keccak counterpart in any case.
From the very first version proposed in this issue, hashing is done only if the lengths are equal; the `&&` short-circuits, so the mismatched-lengths case is O(1) there too, and benchmarking a variant that omits that path wouldn't yield meaningful results.
I couldn't believe it, so I wrote a benchmark too; here are the results. Hashing has a predictable cost, which is a useful property on blockchains, and it is usually faster, often by a lot. The only edge case is arrays of over 256 bytes where the very first byte differs: there the loop-based implementation exits early at O(1) cost, so at some point it gets better than hashing.
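The benchmark code itself was shared separately; purely as an illustration of the kind of harness involved (my own sketch, not the code from this issue; the `BytesLib` import path and the `equal_nonAligned` signature are assumptions), a Foundry test could look roughly like this:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.13;

import {Test} from "forge-std/Test.sol";
import {console} from "forge-std/console.sol";
// Import path is an assumption; adjust to wherever BytesLib lives in your project.
import {BytesLib} from "solidity-bytes-utils/contracts/BytesLib.sol";

contract EqualGasTest is Test {
    // Hash-based comparison proposed in this issue.
    function _equalViaHash(bytes memory a, bytes memory b) internal pure returns (bool) {
        return a.length == b.length && keccak256(a) == keccak256(b);
    }

    function test_compareImplementations() public {
        bytes memory a = new bytes(64);
        bytes memory b = new bytes(64);
        b[63] = 0x01; // same length, only the last byte differs

        uint256 g = gasleft();
        bool aligned = BytesLib.equal(a, b);
        console.log("equal gas:", g - gasleft());

        g = gasleft();
        bool nonAligned = BytesLib.equal_nonAligned(a, b); // signature assumed to mirror equal
        console.log("equal_nonAligned gas:", g - gasleft());

        g = gasleft();
        bool hashed = _equalViaHash(a, b);
        console.log("keccak gas:", g - gasleft());

        if (aligned != nonAligned) {
            console.log("DIFFERENT ALIGNED AND NON ALIGNED!");
        }
        assertFalse(hashed);
    }
}
```

Running it with `forge test -vv` prints the per-strategy gas numbers and flags any disagreement between the two library functions.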
The runs marked as `DIFFERENT ALIGNED AND NON ALIGNED!` have `equal` and `equal_nonAligned` yielding different results. I think that, regardless of switching to hashing, you should completely replace the `equal` function with `equal_nonAligned`; the algorithm it uses just returns invalid values. It's not even a good teaching example, it's a bug; maybe auditors will learn a common pitfall from it, but that's all. Keeping it as it is doesn't help the downstream: it leaves buggy code unpatched and makes the obvious choice wrong, and being correct should not be the special case.
I mean, the normal function works as long as you use vanilla Solidity and don't mess with the memory pointer manually in Yul. That was always its intended use case; in fact, its first version was reviewed together with the compiler team. I'm guessing you only got different values in situations where the array is not aligned to 32-byte slots? If not, that actually signals a bug in one of the functions, and I'd love it if you could share the test code with me so I could debug.
I could consider changing the non-aligned function to become the main one, though, given we don't care about the gas savings.
Was I right about the above or should I be worried about a bug in the functions?
Also, please note this library is 7 years old and, again, I am terribly afraid of breaking shit downstream at this point. 🙏
> share the test code with me so I could debug
It's the code I shared above; just run it with `forge test -vv` and see which cases print `DIFFERENT ALIGNED AND NON ALIGNED!`. It seems to happen when the arrays' lengths are a multiple of 32 and the only difference between them is the last byte. Unless `via-ir` is used, in which case sometimes even a different first byte triggers the error.
Sorry! 😅🙈 Completely ignored the “Source” tab. I’ll take a look! 🙏
I found the bug. In `equal_nonAligned`, `numTailBytes` should never be 0; for the cases where it is, it should be set to 32. This is because `endMinusWord` always leaves 1-32 unchecked bytes, never 0.
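To illustrate the fix (a sketch of mine that only shows the described 1-32 mapping, using the variable name from the library but not its surrounding code; it assumes a non-empty array):

```solidity
// The tail comparison must always cover 1-32 bytes, never 0:
// a length remainder of 0 means the last chunk is a full 32-byte word.
function tailByteCount(uint256 length) internal pure returns (uint256 numTailBytes) {
    numTailBytes = length % 32;
    if (numTailBytes == 0) {
        numTailBytes = 32; // e.g. length 64 leaves one full final word to check
    }
}
```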
IMO this function is extremely overcomplicated, even forgetting about hashing for a moment. Instead of having a special routine for comparing the partial last word, you could just replace the initial `let mc := add(_preBytes, 0x20)` with `let mc := add(_preBytes, and(mload(_preBytes), 0x1F))` and replace `endMinusWord` with the regular `let end := add(_preBytes, add(length, 32))`. This way you will check part of the length word twice, but it won't affect the result. The `cb` variable can also be dropped; it's always the same as `success`.
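Spelled out, the simplification described above might look roughly like this (my own sketch under the assumptions stated in the comments, not a drop-in patch for the library):

```solidity
// Sketch of the simplified comparison: no separate tail routine, no `cb`.
function equalSimplified(bytes memory _preBytes, bytes memory _postBytes)
    internal
    pure
    returns (bool success)
{
    assembly {
        let length := mload(_preBytes)
        // Lengths must match; this also guarantees that the overlapping
        // length bytes compared below are identical in both arrays.
        success := eq(length, mload(_postBytes))
        if success {
            // Start inside the length word so that 32-byte chunks end exactly
            // at the last data byte; part of the length word gets re-compared,
            // which is harmless because the lengths are already known equal.
            let mc := add(_preBytes, and(length, 0x1F))
            let cc := add(_postBytes, and(length, 0x1F))
            let end := add(_preBytes, add(length, 0x20))
            for {} lt(mc, end) {
                mc := add(mc, 0x20)
                cc := add(cc, 0x20)
            } {
                if iszero(eq(mload(mc), mload(cc))) {
                    success := 0
                    mc := end // force loop exit on the first mismatch
                }
            }
        }
    }
}
```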
My attention span for this being so thin is exactly the reason why I merged that PR as a different function. I am actually happy I didn't merge it into, or replace, the normal `equal` function like the contributor asked me to.
I need some proper time to review the bug and your proposal. But this is great, and I can't thank you enough for taking such a close look! 🙏
`BytesLib.equal` and `BytesLib.equal_nonAligned` can be replaced with something much simpler, more gas-efficient, and written in pure Solidity (Yul probably won't bring any benefits here): `return _preBytes.length == _postBytes.length && keccak256(_preBytes) == keccak256(_postBytes);`
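As a full function (a sketch that mirrors the signature of the library's existing `equal`):

```solidity
// Proposed replacement: pure Solidity; the length check short-circuits,
// so nothing is hashed when the lengths already differ.
function equal(bytes memory _preBytes, bytes memory _postBytes) internal pure returns (bool) {
    return _preBytes.length == _postBytes.length
        && keccak256(_preBytes) == keccak256(_postBytes);
}
```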