Closed DHowett-MSFT closed 6 years ago
We encountered this in a situation where weak references were under high demand (turnover of >~100/second); the application would abort due to a zeroing weak reference not zeroing.
Ouch. I thought I'd fixed that about 7 years ago.
At some point, I plan on migrating a load of the code to C++ and using a better off-the-shelf hashtable implementation. I've been investigating the Tessil ones for this purpose. I want to get the new ABI stuff finished first though, because attempting to merge that and a bunch of file renames is going to be painful.
Do you have a reduced test case for this that you can commit?
This commit fixes a data loss bug in our hopscotch table implementation. Removing values from the table can result in other values becoming disconnected and lost.
Let A, B, and C be values that all hash to cell 0. Assume the hopscotch distance factor H = 2.
After adding A
After adding B
After adding C
If we then remove B,
If we then remove A,
The bug manifests if [X] the placeholder value passes the null check set out in MAP_TABLE_VALUE_NULL; that is, the placeholder is "effectively null".
Looking up the key that matches C will first evaluate its base cell, the one that collided with the key in the first place. Since that is placeholder [X], and [X] is "effectively null", the lookup stops.
C is never retrieved from the hash table.
The expedient solution to this bug is to update cell 0's neighbors when B is first removed, effectively skipping the hole:
If we remove B as above,
but clear the neighbor bit for cell 1, the promotion that happens when A is later removed places C in cell 0.