GlenKPeterson / Paguro

Generic, Null-safe, Immutable Collections and Functional Transformations for the JVM
Other
312 stars 24 forks source link

Confusing comment in RrbTree implementation #34

Closed jafingerhut closed 2 years ago

jafingerhut commented 4 years ago

This comment in the source file RrbTree.java:

    // (MIN_NODE_LENGTH + MAX_NODE_LENGTH) / 2 should equal
    // STRICT_NODE_LENGTH so that they have the same average node size
    // to make the index interpolation easier.

confuses me. The values I see for these constants are:

 NODE_LENGTH_POW_2 5 
 STRICT_NODE_LENGTH 32 
 HALF_STRICT_NODE_LENGTH 16 
 MIN_NODE_LENGTH 22 
 MAX_NODE_LENGTH 44

Thus (MIN_NODE_LENGTH + MAX_NODE_LENGTH) / 2 is equal to (22 + 44) / 2 = 33, 1 more than STRICT_NODE_LENGTH. Perhaps this is just an out of date comment? i.e. maybe off by one is acceptable for the implementation?

GlenKPeterson commented 4 years ago

Should say, "so that they have ROUGHLY the same average node size." I didn't do a study or read about this in a paper. I just got a gut feeling that this was a good way to pick min and max node sizes to average about the same as the strict-node-length and the MAX needed to be exactly double the MIN (I think so that splitting a MAX could always yield two valid MIN and joining two min would always yield a valid MAX). I think I calculated this as MIN_NODE_LENGTH ~= STRICT_NODE_LENGTH 2/3 and MAX ~= STRICT_NODE_LENGTH 4/3. Figuring they would average to about 3/3 or STRICT_NODE_LENGTH.

GlenKPeterson commented 2 years ago

I added the word, "Roughly" and am closing this bug. Thanks for reporting it!