cppalliance / NuDB

NuDB: A fast key/value insert-only database for SSD drives in C++11
Boost Software License 1.0

How to calculate the pepper (and other hash values)? #1

Closed MarkusTeufelberger closed 8 years ago

MarkusTeufelberger commented 8 years ago

rippled seems to use xxhash64 as its hash function.

I checked the key file from my installation, but I can't reproduce the stored pepper in Python using https://github.com/ifduyue/python-xxhash.

import xxhash

salt = int("1234abc", 16)  # <-- of course I enter the actual salt here
realpepper = int("deadbeef", 16)  # <-- same for the pepper here

pepper = xxhash.xxh64(salt.to_bytes(8, byteorder="big"), salt).intdigest()
print(xxhash.xxh64(salt.to_bytes(8, byteorder="big"), salt).hexdigest())

assert (pepper == realpepper)

As far as I understand C++ templates, https://github.com/vinniefalco/nudb/blob/master/include/nudb/detail/format.hpp#L148 should seed the hasher on line 152 with the salt and then calculate the hash of the salt on line 153 (though I'm not so sure why you need the address of the salt?!).

vinniefalco commented 8 years ago

rippled uses a hash function that serializes integers in their native format rather than always converting to big or little endian: https://github.com/ripple/rippled/blob/44e33121c7c798daf3edfcaddbe8648a34f8a735/src/ripple/beast/hash/xxhasher.h#L47
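To illustrate why the byte order matters here, the following minimal sketch (standard library only, with a placeholder salt rather than a real one) prints the 8 bytes that would be fed to the hasher under each byte order. On a typical x86 host the native form matches the little-endian one, which is consistent with the big-endian attempt above not matching.

```python
import sys

# Placeholder salt for illustration only, not taken from a real key file.
salt = 0x1234ABC

# The same 64-bit integer serialized three ways.
big = salt.to_bytes(8, byteorder="big")
little = salt.to_bytes(8, byteorder="little")
native = salt.to_bytes(8, byteorder=sys.byteorder)  # what a native-format hasher sees

print("big:   ", big.hex())
print("little:", little.hex())
print("native:", native.hex(), f"({sys.byteorder}-endian host)")
```

Because the hash is computed over these raw bytes, the two encodings generally produce unrelated digests.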

MarkusTeufelberger commented 8 years ago

Alright, endianness was the issue.

Solution:

import xxhash

salt = int("1234", 16)  # <-- of course I enter the actual salt here
realpepper = int("abcd", 16)  # <-- same for the pepper here

pepper = xxhash.xxh64(salt.to_bytes(8, byteorder="little"), salt).intdigest()

print(xxhash.xxh64(salt.to_bytes(8, byteorder="little"), salt).hexdigest())
print("abcd")

assert (pepper == realpepper)

vinniefalco commented 8 years ago

@MarkusTeufelberger Unfortunately, this behavior is a bug. NuDB should produce database files that can be opened on any platform, and the code that computes these numbers breaks that assumption by hashing the salt in the host's native byte order. Unless we change the file format, the pepper has to be calculated from a little-endian salt. I will open an issue.
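A platform-independent version of the check from earlier in this thread could look roughly like the sketch below. The names `compute_pepper` and `xxh64_seeded` are hypothetical helpers and not part of NuDB or python-xxhash. It assumes, as described above, that the pepper is the XXH64 digest of the salt's 8-byte little-endian encoding, seeded with the salt itself.

```python
def xxh64_seeded(data, seed):
    """Seeded 64-bit xxHash via the third-party python-xxhash package."""
    import xxhash  # imported lazily so the module loads without the package
    return xxhash.xxh64(data, seed=seed).intdigest()


def compute_pepper(salt, hash64=xxh64_seeded):
    """Hypothetical helper: derive the pepper from a 64-bit salt.

    The salt is always serialized as little-endian, regardless of the
    host's byte order, so the result is the same on every platform.
    """
    buf = salt.to_bytes(8, byteorder="little")
    return hash64(buf, salt)
```

With python-xxhash installed, `compute_pepper(salt) == realpepper` should then hold for the salt and pepper read from a key file, on little-endian and big-endian hosts alike. The `hash64` parameter exists only so that the serialization step can be exercised without the third-party dependency.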

vinniefalco commented 8 years ago

Part of #13