gnustep / libobjc2

Objective-C runtime library intended for use with Clang.
http://www.gnustep.org/
MIT License
434 stars 118 forks source link

fix a mismanagement of the hash table that could lead to data loss #53

Closed DHowett-MSFT closed 6 years ago

DHowett-MSFT commented 6 years ago

This commit fixes a data loss bug in our hopscotch table implementation. Removing values from the table can result in other values becoming disconnected and lost.

Let A, B, and C be values that all hash to cell 0. Assume the hopscotch distance factor H = 2.

   0     1     2
+-----+-----+-----+
|     |     |     |
+-----+-----+-----+

After adding A

   0     1     2
+-----+-----+-----+
|  A  |     |     |
+-----+-----+-----+
   |
   +-Neighbors =

After adding B

   0     1     2
+-----+-----+-----+
|  A  |  B  |     |
+-----+-----+-----+
   |
   +-Neighbors = 1

After adding C

   0     1     2
+-----+-----+-----+
|  A  |  B  |  C  |
+-----+-----+-----+
   |
   +-Neighbors = 1, 2

If we then remove B,

   0     1     2
+-----+-----+-----+
|  A  | [X] |  C  |
+-----+-----+-----+
   |
   +-Neighbors = 1, 2

If we then remove A,

   0     1     2
+-----+-----+-----+
| [X] | [X] | [C] |
+-----+-----+-----+
   |
   +-Neighbors = 2

The bug manifests if [X] the placeholder value passes the null check set out in MAP_TABLE_VALUE_NULL; that is, the placeholder is "effectively null".

Looking up the key that matches C will first evaluate its base cell, the one that collided with the key in the first place. Since that is placeholder [X], and [X] is "effectively null", the lookup stops.

C is never retrieved from the hash table.


The expedient solution to this bug is to update cell 0's neighbors when B is first removed, effectively skipping the hole:

If we remove B as above,

   0     1     2
+-----+-----+-----+
|  A  | [X] |  C  |
+-----+-----+-----+
   |
   +-Neighbors = 2 <<< HERE

but clear the neighbor bit for cell 1, the promotion that happens when A is later removed places C in cell 0.

   0     1     2
+-----+-----+-----+
|  C  | [X] | [X] |
+-----+-----+-----+
   |
   +-Neighbors =
DHowett-MSFT commented 6 years ago

We encountered this in a situation where weak references were under high demand (turnover of >~100/second); the application would abort due to a zeroing weak reference not zeroing.

davidchisnall commented 6 years ago

Ouch. I thought I'd fixed that about 7 years ago.

At some point, I plan on migrating a load of the code to C++ and using a better off-the-shelf hashtable implementation. I've been investigating the Tessil ones for this purpose. I want to get the new ABI stuff finished first though, because attempting to merge that and a bunch of file renames is going to be painful.

davidchisnall commented 6 years ago

Do you have a reduced test case for this that you can commit?