Closed foolnotion closed 5 years ago
Can't reproduce.
Providing the entire relevant part of the file (as a gist for example) could help a lot in troubleshooting, as well as what compiler / platform you're using (since I'm getting different hashes than both of your results and that's enough of a "uh oh" thing in itself).
For now, please report what hash results do you get when defining
#define XXH_FORCE_MEMORY_ACCESS 0
before including xxhash.hpp
.
#define XXH_FORCE_MEMORY_ACCESS 0
does not change anything:
hash(123 456 0) = 14523173615704738576
array hash = 13763445824703203362
I am using arch linux in the windows WSL. Tested with gcc-8.2 and clang-7.0.0, same results.
I've made a gist with my files: https://gist.github.com/foolnotion/46b2d9d54f2939487c4805b7830f13bc#file-sentence-cpp-L114
Basically, sentence.cpp
line 114 is where the hashing happens. Then in main.cpp
I just test again for the same values.
Thanks.
Alright, the bug you encountered is now solved. Copying a "solution" from StackOverflow instead of actually investigating the rules of lifetimes of static objects in C++ was bound to bite me in the ass one day anyway.
The bigger bug, however is that your hash values and mine don't match. Which is a very bad, very big issue for a hash algorithm to have.
I will investigate into it tomorrow, set up Travis CI and see whether the discrepancy is compiler dependent or caused by something else.
I'm sorry for any difficulties this mess has caused.
Thanks for taking the time to investigate this. Indeed, not getting the same hash value for identical input is a bit worrisome.
I've tested on 2 machines that have identical WSL environments but different hardware.
In all cases I get the same hash value xxhash({123, 456, 0}) = 13763445824703203362
. I will be happy to help you test different things.
Alright, I believe I found the result of the inconsistency. And, unsurprisingly, it turned out to be just a typo.
#if defined(_MSC_VER)
static inline uint32_t rotl32(uint32_t x, int32_t r) { return _rotl(x, r); }
static inline uint64_t rotl64(uint64_t x, int32_t r) { return _rotl64(x, r); }
#else
static inline uint32_t rotl32(uint32_t x, int32_t r) { return ((x << r) | (x >> (32 - r))); }
static inline uint32_t rotl64(uint64_t x, int32_t r) { return ((x << r) | (x >> (64 - r))); }
#endif
64-bit rotl on non-msvc compilers accidentally returned only a 32 bit value. After fixing this, the results now seem to be identical.
The upside is that I now have a CI set up, so it'll be easier to catch these sorts of things in the future.
That's great. Just for the record the new hash value is 7271121650880935067
. If that's identical with what you have I think you can close this bug.
Bug fixed, closing.
Hi,
Apologies in advance for the vague description. I am not really sure what's going on so I'll just do my best to describe my usage scenario:
Somewhere in my program I am building a
std::vector<uint64_t>
:This outputs:
I wanted to debug this result so in
main.cpp
I did:This outputs:
As you can see there's an inconsistency even though the values are the same.
The next thing I added another test in my
main.cpp
:And now the weird part. After adding the above two lines, I get:
Now the hashes are consistent as they should be. Could this be a bug?
EDIT: The problem seems to go away if I manually define the endianess before including
xxhash.hpp
:Best, Bogdan