brianmhunt / knockout-fast-foreach

Foreach. Faster. For Knockout.
MIT License
59 stars 18 forks source link

Implement the "moved" array change type #33

Closed cervengoc closed 8 years ago

cervengoc commented 9 years ago

I opened this issue to start a discussion about a possible implementation of the scenario when one or more items got moved in the unerlying array.

In the array change objects there is a corresponding "moved" property which cross-references a "deleted" and an "added" event. The binding could possibly make use of it and instead of deleting and recreating the item, it could move the related DOM nodes simply to its new place and maintain it's internal state accordingly.

Problems what I see now:

  1. How is knockout sending the "moved" flag? I mean if there is an addition of a new item in the change list, and some "moved" changes are after it, will the "moved" index mean the new index set after the earlier addition, or the original index set.. if you know what I mean. The behaviour is not trivial when it comes to a bit more compley situation, and would need some deeper investigation, or someone's thoughts who knows knockout's array change handling algorithm more deeply (I'm surely not one of them currently).
  2. The implementation seems more simple when only one move is detected. But what about when it's more move? I mean there can be "move chains", etc. which can be tricky to track down and handle properly.
brianmhunt commented 9 years ago

Thanks @cervengoc

There is no "moved" property as far as I can see in observableArray.changeTracking.js. Where would a "moved" flag come from?

One would have to reconstruct "moves" by keeping track of items deleted that are succeeded by an add.

I'm mulling it. :)

cervengoc commented 9 years ago

@brianmhunt Sorry for not being explicit enough. You are right, there is no explicit "moved" flag, but knockout gives information about a move in an "added/removed" change-pair. Each has a "moved" property, pointing to each other's index. This way we can easily mimic the "moved" flag by simply finding these pairs and form a new "moved" change item with "sourceIndex" and "targetIndex" properties for example.

I made a quick CTRL-F and found that the logic is in ko.utils.findMovesInArrayComparison

But still, many things are not clear, at least for me. When I have time, I will prepare some test cases for myself to find the answers, but I would prefer if someone had some thoughts who already knows everything about knockout :)

brianmhunt commented 9 years ago

Thanks @cervengoc – Oh yes, right, I see what you mean now. Sorry for being so slow. :)

My initial thinking is that because we are batching our processing we may have several equivalents to moves, that may involve actual moves (i.e. resorting) and "fake" moves being actual removals followed by adds before the list is updated.

In this situation, the correct solution (whether practicable, or not) strikes me as:

  1. Insert deleted items into a WeakSet called deletedItems;
  2. For every add that occurs after its delete, remove it from deletedItems and add it to a WeakSet called movedItems;
  3. delete the items in deletedItems;
  4. reposition the DOM elements for movedItems.

This is computationally probably O(n) for the number of deletes, so it'll slow things down for big deletes. In some scenarios, for very large list deletes this might be slower than the DOM cost of removing + adding.

I am not sure offhand how best to handle the case where a single item is deleted then inserted multiple time.

I'm still mulling this, but I hope the above is helpful food for thought.

cervengoc commented 9 years ago

I'm thinking about something like this, based on your ideas:

On each queue flush:

  1. When deleting, put the deleted data item and its nodes into a temporary set.
  2. When processing additions, check the added item in the temporary store, and if we find it, reuse its nodes instead of rebinding it to new ones.
  3. After the queue flushed, clear the remaining temporary set by removing the nodes.

It would always work if the changes are coming from one array change report. For example when one uses splice, sort, or directly feeding a new array to the observableArray. These are all quite common use cases I think. Furthermore, it could still catch a lot of scenarios where changes are coming right after each other, like arr.splice(4, 0, arr.splice(10, 3)) (movind 3 items forward to index 4 from index 10).

If we could somehow hash the data items, it would be fast to check if we have it in the temporary deleted set.

I think even if it's O(n) (without hashing), it would be a huge optimization in lot of cases, especially if one has complex template with heavy widgets, etc. (like me currently).

If we can find out the basics of the algorithm I can try to implement it in the near future.

cervengoc commented 9 years ago

@brianmhunt Actually I had a few hours today and I prepared the first version of the implementation based on my idea. The simplified algorithm looks like this.

Preface. Only support move if bound array items are objects or functions, ie. not simple types like string, number or boolean. Also, make it possible to enable/disable this behaviour.

  1. On deleting, delay the node deletion for the removed items. Store the items and their nodes in a list (pending deletes) for this purpose as a record like { item, nodes, hasMoved } where hasMoved will indicate if a pending deletion was reused by an addition (a "move").
  2. When storing a the { item, nodes } record in the list, add a mark to the item with its index in this list to speed up searching for a particular item in the future. For this, use an attached technical property on the item with defineProperty and enumerable: false if possible.
  3. At addition, check if the added item has the marker, which means that it's in the pending deletes list. If yes, fetch the nodes, and simply insert them. If not, do it the standard way. If an item was found as a pending deletion, set its hasMoved to true..
  4. After every deletion and addition is enqueued, enqueue a special method for flushing those pending deletions. For each pending deletion, if the hasMoved flag is false, remove the nodes; clean the item from the marker. Finally clear the list.

I hope I didn't miss anything. What are your thoughts at first glance?

By the way, in my progress of thinking I've found a more simple, and completely robust way to handle batch additions. I'll create a separate pull request for that anyways.

brianmhunt commented 9 years ago

Thanks @cervengoc ; at first look I think it's the right idea.

One point (which you probably thought of): If a node is, in the same change-set, marked added, deleted, added, deleted, it should end up deleted. We just need a unit test for this (if there isn't one already), which will check that hasMoved is false for a node whose last change was deleted.

For a property, the arguably preferable strategy is to use a Symbol (you can create one in recent KO with ko.utils.createSymbolOrString). Symbol properties are not enumerable by default (but for legacy browsers, where a string is returned, it'll be enumerable unless we re-define the property).

Adding a property is probably better than the WeakSet method that I had proposed.

brianmhunt commented 9 years ago

p.s.

By the way, in my progress of thinking I've found a more simple, and completely robust way to handle batch additions. I'll create a separate pull request for that anyways.

:+1: ;)

cervengoc commented 9 years ago

@brianmhunt I don't think that added, deleted, added, deleted scenario is even possible with current ko array change notifications within one change set for the exact same node set. Even if I use valueWillMutate/valueHasMutated and do this mess within them, ko will figure out the actual change. Or am I wrong?

While I was planning the test cases I realized that we also should handle properly if the same data has duplicates in the array before or after the changeset. So my earlier post already has some refinements in this direction, but still the code didn't grow too much.

I'll check out the Symbol thing, to tell the truth I've never heard about it :)

cervengoc commented 9 years ago

Just to end my day in a good fashion, here is a first performance test of the implementation. The test created a not-too complex bound DOM structure (with 1 DIV and 10 children for each item). The arraz contains 1000 items with a random property, and the sort of this array was measured in Chrome.

perf_test

Only 3% of time with utilizing possible DOM moves. I know it's not always true, and there are varying scenarios, but still, I think no more hype should be needed for this :)

IanYates commented 9 years ago

Interesting discussion. I've just been lurking and reading the updates - looking forward to trying some of this stuff out soon :)

brianmhunt commented 9 years ago

Wicked. Great performance buff.

@cervengoc – you're right that the add/del/add/del scenario will not happen in one change-set. However it can happen when multiple changesets are queued and processed at once (which could be regular).

cervengoc commented 9 years ago

@brianmhunt Yes, but currently for the first round I planned to stay in one changeset scope. But as I'm thinking I don't see it very difficult to extend. My very very first idea needed the changeset scope limitation, but this doesn't. I'll see. You can expect a PR with this implementation and some new tests for it in the next days.

cervengoc commented 9 years ago

@brianmhunt I've committed my results, pleas check it out. Also, I'm still a beginner with GitHub, I have no idea how I could separate these commits to a new PR. Now it looks like it appends it to the previous PR about the "jQuery sortable" issue fix.

brianmhunt commented 9 years ago

lol, yeah, it can be a pain that way. You want to create a new branch for a new PR. --- some hints: https://help.github.com/articles/creating-a-pull-request/ –– basically for a new PR you need a new git branch.

Will look at code soon, I hope!

cervengoc commented 8 years ago

@brianmhunt Any news? Will you have time to check this out in the near future?

brianmhunt commented 8 years ago

Thanks for the ping, @cervengoc ! Sorry I missed this. Let me have a look at #34