Closed laverdet closed 7 years ago
@mscdex these look like some solid improvements. Any reservations on merging this so npmland can take advantage?
@pxwise I missed this. However, after briefly looking over the changes, I am a bit concerned about the change in output. @laverdet Can we keep the old format or is the change inevitable?
Also I'm not sure why ws2_32.lib is needed, but if we absolutely need it, it should go in binding.gyp instead of in a pragma.
The change to output is necessary. As mentioned in the original comment, the seed input is always big-endian but your output is platform-specific (little-endian on most systems).
Consider the following:
let payload = new Buffer('payload');
let result32 = xxhash.hash(payload, 1);
let resultBuffer = new Buffer(4);
xxhash.hash(payload, 1, resultBuffer);
console.log(xxhash.hash(payload, result32) == xxhash.hash(payload, resultBuffer));
This demonstrates the endian inconsistencies. This logs "false" currently, but after my fix it returns "true".
Winsock wasn't necessary I was just mad that htonl() was making the build fail and patched it in a stupid way. I rewrote that stuff to remove the silly endian checking and reinterpret_cast's entirely. Pull request should be updating now.
I went about this in a slightly different way in b9980c2 (by taking advantage of some xxhash APIs after an upstream xxhash upgrade). I also opted to use little endian as the common endianness since little endian is much more common and thus would affect less users. I tested these changes on a big endian system via qemu and all tests pass as-is now.
Thanks again for the PR and responses. Sorry it took so long.
Attached are 2 changes, one is pretty straight forward, the other is a breaking change but one I think needs to happen.
First is 6a183008e941a58e6810c4e5c52a59754d37dbb7: convert_seed in the 64-bit version currently uses uint32_t. This was a no brainer.
Additionally included are changes to thrown exceptions. The
seed_val.Clear()
trick and later check toinfo[0].IsEmpty()
never worked. Since you're passing a copy of the v8::Local<> you only clear the handle in the local scope. In the case thatconvert_seed
throws, you still end up calculating a hash and then it just so happens that the correct exception is thrown anyway. I fixed that to correctly detect the exception and return early.The second change is 21b13985ab4856e8a59dda99cb4185c27064ed29: Right now when you pass a buffer in as a seed value you are converting it to an uint32_t or uint64_t using big-endian format (the loop where you do seed <<= 8). But then when you return the hash you cast it to a char* and return that as-is. On x86 that will be little-endian, on other platforms it may be big-endian.
I updated the hash functions to always return the hash in big-endian format (which currently matches what convert_seed expects). Since it is not uncommon to calculate a hash, and then to use that hash as a seed for another hash the endianness should at least be internally consistent.
So with these changes we should expect the results of node-xxhash to be the same on all platforms, and using the result of hash as a seed for another hash to work as expected.