correcly fixup tree when a removal occurs at the chunk boundery

pascalkuthe commented 2 years ago

While fuzzing #66 I found a case where the tree ends up in an invalid state after a call to remove (see test case with this PR).

fix_tree_seam incorrectly handled the case of a removal happens exactly at the chunk boundary. While the implementation did in fact consider that edge case, the detection was incorrect: It checked whether the current character was equal to the end of the child. However, in that case search_char_idx would return the next child (the end is exclusive). To fix the problem I changed fix_tree_seam to check for start_info.chart == char_idx instead (and to check the previous child instead of the next one).

cessen commented 2 years ago

Thanks for catching this!

Unless I'm misunderstanding something, it looks like there's also a refactor of the code here, in addition to the relevant change. Can you keep the change to just the minimal change to fix the issue? That would also make this easier to review.

pascalkuthe commented 2 years ago

Thanks for catching this!

Unless I'm misunderstanding something, it looks like there's also a refactor of the code here, in addition to the relevant change. Can you keep the change to just the minimal change to fix the issue? That would also make this easier to review.

I didn't really refactor anything (intentionally at least). All changes I made were strictly intended to fix the bug. It mostly looks that way because some checks now take place at different places. I can try to rework the PR to make it look more similar to the original code but I am not sure if that would work well (but its also possible I am just currently not seeing how to easily change the original code to match the behaviour in this PR). I think the diff makes the changes look larger then they actually are. If you look at the two version side by side it should be a small change really.

cessen commented 1 year ago

I didn't really refactor anything (intentionally at least). All changes I made were strictly intended to fix the bug.

Ah, got it. Your description of the bug made it sound trivial (like changing a < to a <= in a conditional check, or something like that). But I guess it's more involved.

pascalkuthe commented 1 year ago

I didn't really refactor anything (intentionally at least). All changes I made were strictly intended to fix the bug.

Ah, got it. Your description of the bug made it sound trivial (like changing a < to a <= in a conditional check, or something like that). But I guess it's more involved.

No its a bit more involved then that. I will try again at distilling the change to a minimum explanation:

The edge-case this PR covers is if a change occurs at a chunk boundary, in that case we need to call merge_distribute with the neighboring chunk. The previous implementation already covered this edge-case but the check was incorrect. The check occurs twice in this function so I just choose one example here. The old implementation performed the following check:

let (child_i, start_info) = children.search_char_idx(char_idx);
let end_info = start_info + children.info()[child_i];
if end_info.chars as usize == char_idx && (child_i + 1) < children.len() { /* handle special case */ }

This check triggers if following two conditions are true:

end_info.chars == char_idx: this means char_idx is one past the end of child_i, however if that is the case search_char_idx would already return the next child unless you are looking a the last child
check that we are not looking at the last child (which negates the case above), as merging across node boundaries is handled further up the recursion stack

This means that this condition can actually never be triggered (verified by fuzzing this function with an unreachable inside the special case).

Instead I implemented the following check:

let (child_i, start_info) = children.search_char_idx(char_idx);
if start_info.chars as usize == char_idx && child_i != 0 {

Now we check whether we are exactly at the start of the child because search_char_idx already returns the next child in the case we wanted to check above and in that case we merge with the previous child.

At the end of the function this chang really is that simple. However if you look at the start of the function there was perviously three special cases:

if we are at a chunk boundary (incorrectly checked as described above) and the current or the next child is invalid => merge_distribute with the next child
otherwise merge_distribute with the previous child if the current child or the previous child is invalid
except if we are at the first child (child_i == 0) in that case distribute with the next child

However with the fix applied we want to check the following conditions:

child_i == 0 (we can never distribute with the previous child) => distribute with the next child if the current child is invalid
if the current child is invalid distribute with the previous child
if the previous child is invalid distribute with the previous child

So the code needed to change a bit more here. Hope that clear that up :)

Btw if I squint at this a ton it might be possible to rearrange the start of the function to look more like the original code but that makes my head spin a little (and would definitely involve quite a bit of duplicate code) so maybe I am just thinking about it weirdly. So it might have been a small accidental refactor at the start of the function but it was definitely not intentional

cessen commented 1 year ago

Thanks for the explanation! Yeah, looking at the old code again, it's kind of a mess. Your new code is both easier to follow and (of course) fixes the bug.

Thanks so much!

cessen / ropey

correcly fixup tree when a removal occurs at the chunk boundery #67