sebelga / nsql-cache

Advanced cache layer for NoSQL databases
MIT License
9 stars 4 forks source link

Incorrect results due to hash key conflicts #7

Open gpeddle-teal opened 3 years ago

gpeddle-teal commented 3 years ago

The hash keys are only 31 bit positive integers. From someone else's analysis I see that:

"Assuming a 32-bit hash and k=10,000 items, a collision will occur with a probablility of 1.2%. For 77,163 samples the probability becomes 50%!"

It does not appear that there's any code to compare the original keys and detect these conflicts. Even with a moderately sized cache errors go undetected.

I would suggest either a 64-bit hash or a mechanism for detecting the errors.

ablewalmsley commented 3 years ago

one existing workaround for this would be to set hashCacheKeys to false in your config

sebelga commented 3 years ago

Hi @gpeddle-teal Thanks for pointing this out! I don't have the bandwidth for the moment to make the change. If you want to open a PR and replace the hash function here https://github.com/sebelga/nsql-cache/blob/master/lib/utils.js#L28 that'd be awesome. Cheers!

Hardsix commented 2 years ago

You guys could at least make the option hashCacheKeys false as default, or refer to this in examples. One example has hashCacheKeys: true and the other example doesn't mention it. This leads people to think it is false by default.

This bit my company hard and caused a major prod issue due to cache key conflicts. I should've reviewed this dependency better but still.