Closed 97jaz closed 7 years ago
The logic you referred to is correct; it's an optimization that avoids recalculating a fraction of the hash-code, because its already known. I answered the same question in issue #9 as well:
The lines you mentioned create a (temporary) node that either gets unboxed when returning from recursion, or it becomes the new root node. I'm passing 0 as shift to the updated method, thus I'm only interested in the first 5 bits of the hash code. Because all elements in that node share the same common prefix, it is not necessary recalculate the hash code, I can simply take keyHash.
Here's a visualization, if we assume that we found the element to be removed at shift
15 :
... - mask(keyHashOfRemoved, 15)
\
- ... - mask(_, 10) - mask(_, 5) - mask(_, 0)
/
... - mask(keyHashOfRemaining, 15)
You can use keyHashOfRemoved
or keyHashOfRemaining
for substituting _
, because all mask
values for shifts lower than 15 (and thus also for the case of shift
0) equal.
In case you're going to reimplement the logic, you could either explicitly recover keyHashOfRemaining
to make the intent clear (at some performance cost), or properly document this optimization in the code (something I should do as well).
Maybe add a comment -- like the above text -- in the code to clarify this?
Because all elements in that node share the same common prefix [...]
Ah, of course. Thank you, @msteindorfer -- I should have realized that.
Is this logic for selecting the
newDataMap
for a new singleton node correct?Specifically, in the case where
shift
is not0
,newDataMap
is set tobitpos(mask(keyHash, 0))
. ButkeyHash
is the hash of the key that's being removed, right? (And the data map should reflect the key that remains.)Or am I misunderstanding what's going on in that code?