Bramzor closed this issue 8 years ago
When you decode binary data (what hash64() returns by default) to a UTF-8 string (via .toString()), you run the risk of encountering byte sequences that are not valid in UTF-8. That is evidently what is happening in this case, as you can see from the series of 0xef, 0xbf, 0xbd bytes. Those bytes are the UTF-8 encoding of the replacement character (\uFFFD), which is substituted for each invalid byte sequence found during the decoding process.
Instead, you can either pass a different output encoding as the third parameter, or pass an encoding to .toString(). The encoding you choose should be one that keeps binary data intact (e.g. 'hex' or 'base64').
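To make the difference concrete, here is a minimal sketch that uses only Node's built-in Buffer. The eight bytes are the ones quoted later in this thread and simply stand in for a raw hash; the variable names are illustrative and nothing here calls the hashing module itself.

```javascript
// Example bytes standing in for a raw 8-byte hash (taken from this thread).
const raw = Buffer.from([227, 107, 211, 121, 232, 51, 217, 85]);

// Default decoding is UTF-8: every invalid byte sequence becomes U+FFFD.
const asUtf8 = raw.toString();
// Re-encoding the damaged string does not give the original bytes back.
const roundTripped = Buffer.from(asUtf8, 'utf8');

// 'hex' and 'base64' map every byte to printable characters without loss.
const asHex = raw.toString('hex');
const asBase64 = raw.toString('base64');

console.log(asUtf8.includes('\uFFFD'));                   // true
console.log(roundTripped.equals(raw));                    // false
console.log(asHex);                                       // e36bd379e833d955
console.log(Buffer.from(asBase64, 'base64').equals(raw)); // true
```

The UTF-8 round trip is lossy because each replacement character is re-encoded as 0xef, 0xbf, 0xbd, whereas the hex and base64 strings decode back to the same eight bytes.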
What I actually want to do is just hash a string, not binary data. But I'm forced to provide a buffer because it doesn't support a string.
If I remove the toString() and serialize the output to JSON after hashing, I get: {"type":"Buffer","data":[227,107,211,121,232,51,217,85]} What I'm looking for is something like: 2844c4aa8ad49a19
As I said, you can specify a better (output) encoding two different ways:
XHash.hash64(new Buffer(data.toLowerCase()), seed).toString('hex');
or:
XHash.hash64(new Buffer(data.toLowerCase()), seed, 'hex');
The former returns a Buffer containing the hash and then calls buffer.toString() to convert the binary hash contents to something printable.
The latter converts the contents internally and directly returns a string of the passed encoding.
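If the goal is simply "string in, hex string out", the second form can be wrapped in a small helper. The sketch below is hypothetical and not part of the module: hashStringHex is an invented name, and the hash64 argument is assumed to accept (buffer, seed, encoding) the way XHash.hash64 is called above. It also uses Buffer.from(), which replaces the now-deprecated new Buffer(string).

```javascript
// Hypothetical helper (not provided by the module): hash a string, return hex.
// `hash64` is any function taking (buffer, seed, encoding), which is assumed
// to match how XHash.hash64 is called earlier in this thread.
function hashStringHex(hash64, text, seed) {
  // Encode the string as UTF-8 bytes; Buffer.from() replaces new Buffer(string).
  const input = Buffer.from(text, 'utf8');
  // Request 'hex' so the result is a printable string, not a Buffer.
  return hash64(input, seed, 'hex');
}

// Assumed usage, with XHash loaded as in the thread:
//   const hex = hashStringHex(
//     (buf, seed, enc) => XHash.hash64(buf, seed, enc),
//     data.toLowerCase(),
//     seed
//   );
```

Passing the hash function in as an argument keeps the helper independent of how the module is loaded, and the arrow function in the usage comment avoids relying on hash64 working when detached from XHash.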
That did it. Wouldn't have been able to figure this out myself for some reason. Thanks a lot! Maybe an idea to add this to the readme :)
It is already in the readme :-)
Actually the issue was not fixed at all. I'm using https://asecuritysite.com/encryption/xxHash to validate the result but because I'm providing the information to a buffer, the hash result is something completely different. So I assume there is no proper way to hash a string with this implementation?
The values I tried actually match those coming from that site, despite being a different version of xxHash. The difference is that the website explicitly converts at least the 64-bit values to big endian format. If you reverse the byte order of the hex result (two hex digits per byte), you will see it matches what this module returns. I suppose I could add a third parameter that converts to big endian if the host CPU is little endian...
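For comparing against a site that prints big-endian values, the byte reversal can be done with Buffer alone. This is only a sketch: reverseHexBytes is an invented name, the sample hex string is just the bytes quoted earlier in this thread, and it assumes the module's hex output is in little-endian byte order as described above.

```javascript
// Hypothetical helper: reverse the byte order of a hex string.
function reverseHexBytes(hex) {
  // Buffer.from(hex, 'hex') creates a new Buffer; reverse() flips it in place
  // byte by byte, so each pair of hex digits stays together.
  return Buffer.from(hex, 'hex').reverse().toString('hex');
}

// Sample value: the eight bytes quoted earlier in this thread, as hex.
const littleEndianHex = 'e36bd379e833d955';
const bigEndianHex = reverseHexBytes(littleEndianHex);

console.log(bigEndianHex); // 55d933e879d36be3
```

Note that this reverses bytes, not characters: reversing the hex string character by character would also swap the two digits inside each byte and give a different value.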
I'm trying to use your module to hash a string. I tried the following, but I can't get it to work correctly because it requires a buffer instead of a string: XHash.hash64(new Buffer(data.toLowerCase()), seed).toString(); But this returns something like: \xef\xbf\xbd\xef\xbf\xbd....