ra1028 / DifferenceKit

💻 A fast and flexible O(n) difference algorithm framework for Swift collection.
https://ra1028.github.io/DifferenceKit
Apache License 2.0
3.54k stars 239 forks source link

Fix batch moving of rows in NSTableView #144

Closed tobiasjordan closed 2 years ago

tobiasjordan commented 2 years ago

Fixes #139.

A while ago I was working on a simple one-level animatable NSOutlineView (similar to the Finder's sidebar) powered by DifferenceKit, and I remember I stumbled upon issues trying to get the row move animation to work reliably for arbitrary move/delete/insert operations.

After lots of experimentation and debugging I finally found and fixed the problem which was affecting NSTableView in the same way so I decided now to take the time and share this story with you, mainly because at the time I was working on this, I didn't know there was an AppKit extension in the project here which suffers from the same problems I had.

As it turns out, NSTableView/NSOutlineView's beginUpdates() and endUpdates() methods do not really do batch updates. It rather seems like they just offer animation optimizations, as described in the method's discussion part of the Apple docs:

The main reason for doing a batch update of changes to a table view is to avoid having the table animate unnecessarily.

What that means is, in contrast to NSCollectionViews performBatchUpdates(), it inserts, deletes and moves elements immediately, updating any row index after every operation which is why the current implementation is unable to move elements to the correct positions and under certain scenarios even unable to correctly insert new elements either.

With that in mind, a potential solution was relatively simple to implement. Provided that move changes are always sorted ascendingly by their target index which is always the case for changesets generated by DifferenceKit (and potentially other diffing frameworks), here's a hopefully complete list of index issues and how to solve them:


1. Moves

Problem

As mentioned, NSTableView doesn't do batch updating so one change will immediately change element indexes, resulting in past moves messing with the source index of future moves.

Solution

Whenever an element is moved, its source index must be offset by the total number of previous moves' source indexes that were greater than (visually speaking below) itself.

In the code, that means having to collect the moved source indexes and checking how many elements with a source index greater than the current move's source index are contained. This number represents the offset to be added to the source index of each move.

Example

Source: ["D", "C", "B", "A"] Target: ["A", "B", "C", "D"] Moves: (3, 0), (2, 1), (1, 2)

NSTableView will incorrectly produce: ["A", "D", "C", "B"] because after it has moved A -> 0, subsequent moves with a source index greater than 0 (which in this example are all elements), will be off by the number of moves made above it. Here's the list of incorrect moves NSTableView does for this changeset:

1st Move: (3, 0) -> ["A", "D", "C", "B"] - OK 2nd Move: (2, 1) -> ["A", "C", "D", "B"] - source index off by one, should be (3, 1) -> ["A", "B", "D", "C"] 3rd Move: (1, 2) -> ["A", "D", "C", "B"] - source index off by two, should be (3, 2) -> ["A", "B", "C", "D"]


2. Moves with inserts

Problem

As part of DifferenceKit's 3rd stage changeset, not only are elements moved but also new ones inserted which then messes with the target index of the moves.

Solution

Applying moves after inserts is the first mistake because not only does a move to an index above an inserted element consequently move down all elements below, creating many additional unwanted moves (from which recovery would not be easily possible I guess) which change the actual target index of a move that is below the inserted element but it also changes the source index of every element to-be-moved that is below an inserted element.

So to be able to properly move elements, inserts must be deferred until after all moves were made and then processed ascendingly by the element's index to avoid scrambling with the indexes even further, which however, seems to be always the case for DifferenceKit changesets, so no extra sorting needed here.

Now that elements are inserted after the moves, the second issue must be taken care of which is that moves provided by DifferenceKit's diffing algorithm already account for the inserted indexes but since the items haven't been inserted yet and table views can't have "holes" in it, every target index of a move that is below an inserted element will be offset by the number of inserted elements above it.

To fix this, a list of all insert indexes can be created in advance which is then used during a move to determine how many elements are going to be inserted above the move's target index. This number represents the offset and must be subtracted from the target index of a move to allow for performing all move operations according to the changeset.

Example

For this example, it is already assumed that inserts are made after moves because otherwise, the correct order cannot be restored, not even by simple index adjustments.

Source: ["D", "C", "B"] Target: ["A", "B", "C", "D"] Moves: (2, 1), (1, 2) Inserts: ("A", 0)

NSTableView will incorrectly produce ["A", "D", "C", "B"] because due to the insert of A -> 0 every move's target index greater than 0 (in our case every element) will be off by one, in addition to the first move causing an offset by one for the second move, due to the mentioned first problem when moving items. Again, here's the list of incorrect moves NSTableView does for this changeset:

1st move: (2, 1) -> ["D", "B", "C"] - source index off by one, should be (3, 1) -> ["B", "D", "C"] 2nd move: (1, 2) -> ["D", "C", "B"] - source & target index off by one, should be (2, 1) -> ["B", "C", "D"] Insert: ("A", 0) -> ["A", "D", "C", "B"] - OK (will then be ["A", "B", "C", "D"] for the corrected moves)

Note that by ensuring inserts come after moves, inserted elements will already be at the correct position but without applying both offsets to every move, the order will be incorrect.


Screenshot

Here's a screenshot for comparison with the fix applied on the right. The source is a scrambled alphanumeric character set that is missing the letters B, C, F, K, U, W, numbers 1, 4, 7 and includes the emojis 😀, 😁, 😂. 😃, 😄, 😅, 😆, 😇: ["J", "😁", "E", "T", "6", "😆", "😂", "😃", "M", "I", "2", "😇", "😅", "O", "G", "H", "0", "😄", "5", "V", "Z", "D", "R", "9", "8", "3", "Q", "S", "L", "Y", "A", "X", "P", "😀", "N"]

The target is the alphanumeric character set: ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"]

Demo-Comparison

Some notes

By using Swift's IndexSet I was able to reduce the performance overhead a bit so that in the end, this extra logic should be tolerable, considering that the biggest portion of computation time is probably being consumed by the AppKit anyway, but I haven't specifically tested that yet.

During my tests, I've frequently shuffled the emoji dataset in the sample project while also trying out datasets with random inserts, moves and deletions and I could not find any mistakes in the resulting table view. Real tests comparing the data set with the actual NSTableView cells however, would be more appropriate, of course. Note that while trying to break the table view's animations by quickly shuffling many times in a row, I was able to partially scramble the NSCollectionView's order, the table view's animations though seemed very robust to me so it looks like it's just NSCollectionView's performBatchUpdates not being able to handle too many move/insert/delete operations in a row very well.

Before I forget it, awesome project by the way, really appreciate it and thanks for reading this long story, of course! 🤓


Table


TL;DR

NSTableView doesn't support batch updating so every change (insert, delete, move) may alter its row indexes, making it difficult to apply moves generated from a StagedChangeset unless indexes are corrected accordingly and it is ensured that inserts are always performed after any moves.

paxos commented 2 years ago

This explanation is spot on 👍

I verified the fix here (✅ for 50K random cases) and this also fixes https://github.com/ra1028/DifferenceKit/issues/139

Thank you for this, this is a great fix!

martindufort commented 2 years ago

Wow @tobiasjordan 👏 👏 👏.

This explanation is the best one I've read so far on a bunch of submitted fixes. It also raises a lot of questions about why this faulty behavior (macOS - NSTableView) was not raised earlier.

Perhaps everyone has transitioned to NSCollectionView on macOS. Will await intergration into main and check this on our side.

Excellent work. 🎉

paxos commented 2 years ago

Wow @tobiasjordan 👏 👏 👏.

This explanation is the best one I've read so far on a bunch of submitted fixes. It also raises a lot of questions about why this faulty behavior (macOS - NSTableView) was not raised earlier.

Perhaps everyone has transitioned to NSCollectionView on macOS. Will await intergration into main and check this on our side.

Excellent work. 🎉

For me (and probably others) the issue was: The code works most of the time.

I had this shipped to a couple of users and only noticed it because of some edge case crashes. The only way to really nail this down was by finding (and debugging) more of those edge cases.

Super excited to get this fix in peoples hand. I am sure this will magically fix a lot of issues in a lot of macOS apps.

martindufort commented 2 years ago

FYI:

I've merged this PR into my forked repository and will be conducting another set of tests. Will let you know of the result.

Thanks

tobiasjordan commented 2 years ago

Great - thank you, @martindufort! 👍

ra1028 commented 2 years ago

https://github.com/ra1028/DifferenceKit/releases/tag/1.3.0