brianmhunt / knockout-fast-foreach

Foreach. Faster. For Knockout.
MIT License

Asynchronous, but still janking on large (>2k) lists #35

Open krnlde opened 8 years ago

krnlde commented 8 years ago

As mentioned here knockout/knockout/issues/1954, your fastForEach implementation shows noticeable improvements over the original ko foreach. But I still experience big janks of 1–2 s when iterating over a list with 5k entries.

See the Chrome timeline capture here: (screenshot of the timeline)

The biggest part of the 2876 ms frame is the purple "Recalculate Style" block. I didn't dig too deep into the "Animation Frame Fired" JS part, but it doesn't seem to be what's janking.

brianmhunt commented 8 years ago

Interesting, thanks @krnlde. I'd really like to fix that – and it's precisely the kind of scenario I'd like FFE to be comfortable with.

Can you give any more info on what's happening in the recalculating part? Or perhaps share a jsFiddle/CodePen/etc?

brianmhunt commented 8 years ago

Incidentally, if you're using a <li> tag does li {display:block; } help? Per http://stackoverflow.com/questions/19796057

brianmhunt commented 8 years ago

Incidentally, at one point I was adding/removing 100,000 items from lists, without much janking. I wonder if there's been a regression.

krnlde commented 8 years ago

It is exactly your example: http://output.jsbin.com/dakihezega/2

Regarding your hint with display: block: I would very much like to use fastForEach in SVG. SVG does have display: block, but setting it makes no difference.

krnlde commented 8 years ago

btw Firefox behaves exactly the same. Same jank time.

brianmhunt commented 8 years ago

Oh yes, I see. Thanks for clarifying. :)

So the good news is that the time is linear: inserting 500 items takes 150 ms, 5000 items takes 1500 ms. The problem, then, is not the insertion of items but the rendering of off-screen items (whose rendering needn't be immediate).

I'm mulling solutions along the lines of http://stackoverflow.com/a/12613687/19212 – basically delaying the rendering of items that are off-screen. Roughly, right after insertion into the DOM:

// Per http://stackoverflow.com/a/7557433/19212
function isVisible(node) {
    var rect = node.getBoundingClientRect();
    return (
        rect.top >= 0 &&
        rect.left >= 0 &&
        rect.bottom <= window.innerHeight &&
        rect.right <= window.innerWidth
    );
}

childNodes.forEach(function (node) {
   if (isVisible(node)) {
     // leave it alone
   } else {
     // insert a vertical placeholder the same height as the node,
     // then delay insertion of the node for a few milliseconds
     // so we don't get jank.
   }
})

Does that strategy sound like it'd hit the mark?

krnlde commented 8 years ago

The idea of deferring the actual draw is exactly what I need. But how would you tell the browser not to render the node? By deferring the Node.appendChild()? My guess is that if you defer node insertions into time frames of ~25 ms each, it would kill the jank entirely – and as a result the screen would build up part by part. This, however, could lead to unnecessary repaints of surrounding (dependent) objects, but only if you are in a page flow and not positioning everything absolutely, which I won't be in an SVG environment.

brianmhunt commented 8 years ago

Hmm. A simpler option might be to batch childNodes by chunks of e.g. 50.

In that case we're adding an O(n) splicing operation to the node list, but the splicing is probably way less overhead than the also-linear-time DOM node insertion.
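The chunk-batching idea could be sketched like this (a sketch only, not the library's actual code; insertInChunks, appendChunk, and schedule are invented names, with the scheduling function injected so that in the browser it could be window.requestAnimationFrame):

```javascript
// Append pending nodes in fixed-size chunks, one chunk per scheduled
// callback, so no single frame does all the insertion work.
function insertInChunks(nodes, appendChunk, schedule, chunkSize) {
  chunkSize = chunkSize || 50;
  var index = 0;
  function step() {
    // Append the next slice, then schedule the following one if any remain.
    appendChunk(nodes.slice(index, index + chunkSize));
    index += chunkSize;
    if (index < nodes.length) schedule(step);
  }
  if (nodes.length) step();
}
```

In the browser this would be called with something like schedule = window.requestAnimationFrame.bind(window), so each chunk lands in its own frame.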

krnlde commented 8 years ago

:+1: SGTM

brianmhunt commented 8 years ago

Thanks for reporting this @krnlde – It's an interesting use case.

I'll mull for a bit on the best way to do this. @cervengoc Do you have any thoughts on this?

krnlde commented 8 years ago

My only objection: rather than using a fixed chunk size (like the 50 you suggested), I would listen for the heartbeat of the browser. Take var start = Date.now() before the operation, then check if (Date.now() - start < 25) deferFollowingChunksToNextLoop() in every iteration. Faster CPUs can then get more done per frame, while slower ones still don't jank.

brianmhunt commented 8 years ago

@krnlde I agree.

cervengoc commented 8 years ago

@brianmhunt I don't completely understand the use case and the problem on first read, and unfortunately I currently don't have time to get deeper into it.

One note: chunking up the batch insertions shouldn't be so hard to implement. But I think for the sake of simplicity I would stay with a built-in constant chunk size instead of measuring the timing. The reason is that the code part which batches insertions runs far before the nodes are actually inserted. So I would probably limit the batch insertions to, say, 300; that would cover almost every usual scenario, I think.

Hope I didn't misunderstand something or write something completely silly :)

brianmhunt commented 8 years ago

Thanks @cervengoc – appreciate the insight. :)

krnlde commented 8 years ago

I'll sign off on the simplicity argument. But what if you have a costly DOM insertion which itself needs 300 ms? Then you would have 300 × 300 ms of blocking iterations. With the time measurement, the browser would insert one node per loop and, yes, still jank – but not for a solid 90 seconds :-P

Don't get me wrong, I'm exaggerating the consequences of course. But the time measurement is just a few characters and one line more than checking whether the chunk size is reached. Plus, if your CPU is capable of rendering your full (say 10k) array in one 25 ms time frame, you would prevent it from doing that by applying your chunk size regardless, and thus slow down the computation.

cervengoc commented 8 years ago

The cost of DOM insertion can widely vary, even data item by data item based on data structure, bindings, etc. It's really hard to measure it I think.

In my personal opinion it should be very rare that someone has a template where insertion costs 300 ms. Actually I wouldn't add the complexity to support it dynamically. If someone has such a DOM structure, they could set an option for chunk size explicitly.

krnlde commented 8 years ago

Not complicated at all, it's that easy:

var start = Date.now();
while (queue.length) {
  if (Date.now() - start < 25) appendNode();
  else return deferToNextEventLoop();
}

(pseudocode)
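A runnable version of that time-budget loop could look like the following (a sketch under assumed names: now and defer are injected so the loop is testable; in the browser they would be Date.now and a setTimeout/requestAnimationFrame wrapper):

```javascript
// Append nodes until the frame budget (default 25 ms) is spent, then hand
// the rest of the queue to the next event-loop turn.
function appendWithinBudget(queue, appendNode, now, defer, budgetMs) {
  budgetMs = budgetMs || 25;
  var start = now();
  while (queue.length) {
    if (now() - start < budgetMs) {
      appendNode(queue.shift());
    } else {
      // Budget exhausted: continue with the remaining nodes later.
      return defer(function () {
        appendWithinBudget(queue, appendNode, now, defer, budgetMs);
      });
    }
  }
}
```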

cervengoc commented 8 years ago

Okay, but with the batch insertion the inserted node is actually a documentFragment with several hundred list items. It's possible I'm missing something obvious, but it doesn't seem trivial to me to implement.

If we didn't have the documentFragment utilization it would be simpler, but that optimization is key in far more common scenarios IMHO.

brianmhunt commented 8 years ago

@krnlde It's not quite that simple, as @cervengoc points out.

I think it's doable, but it'll take some rather cerebral thinking. :)

krnlde commented 8 years ago

Touché. For the ko world it might be true. That's why I'm requesting help here :)

Thank you guys for your time and your help so far.

cervengoc commented 8 years ago

By the way @brianmhunt how can it be that even when using documentFragment the browser still janks? Isn't inserting a documentFragment kind of an atomic thing?

brianmhunt commented 8 years ago

@cervengoc It totally is atomic, but as you can see from the "recalculating style" in the timeline capture @krnlde posted above, it can be a very long atomic thing. :)

cervengoc commented 8 years ago

Still, I think the best way is to give the ability to set a chunk size with an option. I think it's rare that a data collection is so heterogeneous that one item takes 5 ms and another takes 100.

The default could be to not use any chunking for keeping the performance, and developers could pass a desired value based on their knowledge about their data structure.

What do you think?
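That default-off, explicit-option idea could be sketched as follows (hypothetical: chunkSize is an invented option name, not the library's actual API):

```javascript
// Read an explicit chunk size from the binding's options object,
// e.g. data-bind="fastForEach: { data: items, chunkSize: 300 }".
function readChunkSize(spec) {
  if (spec && typeof spec.chunkSize === "number" && spec.chunkSize > 0) {
    return spec.chunkSize;
  }
  return Infinity; // default: no chunking, insert everything in one go
}
```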

krnlde commented 8 years ago

it can be a very long atomic thing

lol. Which consequently can't be spread across multiple event-loop turns. WebWorkers won't work here either, because they don't have access to the DOM afaik.

krnlde commented 8 years ago

The default could be to not use any chunking for keeping the performance, and developers could pass a desired value based on their knowledge about their data structure.

I can definitely commit to that.

cervengoc commented 8 years ago

Okay, so I'll start to think about it in the background, but due to time constraints don't expect a solution this year, at least not from me :)

krnlde commented 8 years ago

No probs. Contributions are always welcome, regardless of when :) Thank you!

krnlde commented 8 years ago

In the meantime, maybe I can have a look at that stuff. Can you provide some details on how you would solve this around the whole insertion? Promises sound like a good bet here. After a short look it appears to me that the only function depending on the node insertion is the afterAdd method, which may or may not be provided through a spec. Am I right so far? That would mean we could insert the nodes completely asynchronously: first insert the ko.virtualElements.insertAfter(containerNode, frag, insertAfterNode); (#L309) and then, step by step, run frag.appendChild(nodeOrNodeArrayToInsert[i]); on top of it asynchronously (#L307). After that we call the afterAdd method of the spec, either through a Promise or directly callback-style.

Also, from what I can see, the afterQueueFlushed method is called after each cycle and not at the end of the whole queue, which would inform listeners about a partially ready list rather than the whole list – maybe that was your intention, so you can react to the nodes immediately. But then not only is the name "queue flushed" misleading, the latter use case is also missing.

cervengoc commented 8 years ago

@krnlde Thank you for your interest and efforts. However, before looking into this I would really like to see #34 merged, because I've made major changes in internal logic which surely affects a proposed implementation of this as well. @brianmhunt will probably merge it soon :)

krnlde commented 8 years ago

I'll stay tuned and subscribe to that issue :)

krnlde commented 8 years ago

This sounds awfully familiar: https://github.com/knockout/knockout/issues/1965

vamp commented 8 years ago

@krnlde, do not use large lists (a better solution would be clusterize.js, or a clusterize binding for Knockout)

krnlde commented 8 years ago

Hey @vamp! Thank you for your answer, but why shouldn't I use large lists and let "the" library decide how to handle them? As the title suggests, a chunked, asynchronous approach could help. Also, clusterize is no solution when drawing a lot of dots in one visible SVG.

vamp commented 8 years ago

@krnlde, sample for you: https://jsfiddle.net/312j519d/

krnlde commented 8 years ago

Draw these as dots in an SVG, without scrolling and we have a deal...

krnlde commented 8 years ago

Since #34 (and #33 respectively) are closed, can we get back to this issue @brianmhunt ? I linked a knockout issue a few days back which describes a similar behaviour. Any insights?

brianmhunt commented 8 years ago

I think if we were to tackle this, the best option might be to mimic clusterize – at least the part that renders and binds the elements only when they are in the viewport.

That would address the rendering side, though it means the item bindings will be lazy. This breaks the assumption that everything in the list has bindings applied immediately, but for what'll be a massive performance boost, fair enough.

Mass-removes will still be a problem, as in #37 ...

krnlde commented 8 years ago

Clusterize, at least to me, is only a workaround for a yet-to-be-solved problem. It doesn't nip things in the bud. We rather need a way to spread the browser painting over a few loops, so the browser never janks. That would resolve every problem – including, by the way, the one clusterize tackles. The only thing left to figure out is how to tell the browser to draw just some nodes, and how to join the asynchronous events into one synchronized one, the afterQueueFlush; Promise is a good bet here.
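The "join the asynchronous events into one" idea can be sketched with promises (a sketch under assumed names: insertChunksJoined, appendChunk, and schedule are invented for illustration, not the library's API):

```javascript
// Chain chunks so each runs on its own scheduler tick, and return a promise
// that resolves after the last chunk – the moment afterQueueFlush should fire.
function insertChunksJoined(chunks, appendChunk, schedule) {
  return chunks.reduce(function (prev, chunk) {
    return prev.then(function () {
      return new Promise(function (resolve) {
        schedule(function () {
          appendChunk(chunk); // one chunk of DOM work per tick
          resolve();
        });
      });
    });
  }, Promise.resolve());
}
```

A caller would then chain the single "everything is flushed" callback on the returned promise, e.g. insertChunksJoined(chunks, append, requestAnimationFrame).then(afterQueueFlush).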

brianmhunt commented 8 years ago

Thanks @krnlde – Solid input.

We have access to a couple of async facilities, namely Promise, the ko.tasks microscheduler, and window.requestAnimationFrame (which has some side benefits). We will have to experiment to see what works best.
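Those facilities could be hidden behind one interface so experiments can swap them freely (a sketch only; the preference order is an assumption, and ko.tasks.schedule is the microtask scheduler that ships with Knockout 3.4+):

```javascript
// Pick an async scheduling primitive, preferring paint-aligned callbacks.
function makeScheduler() {
  if (typeof window !== "undefined" && window.requestAnimationFrame) {
    // Fires just before the next paint; good for visual batching.
    return function (fn) { window.requestAnimationFrame(fn); };
  }
  if (typeof ko !== "undefined" && ko.tasks && ko.tasks.schedule) {
    return function (fn) { ko.tasks.schedule(fn); }; // Knockout's microtask queue
  }
  if (typeof Promise !== "undefined") {
    return function (fn) { Promise.resolve().then(fn); }; // plain microtask
  }
  return function (fn) { setTimeout(fn, 0); }; // macrotask fallback
}
```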

krnlde commented 8 years ago

I need to get deeper into the knockout and FFE code before I can provide some experimental code. :-/ Right now the mechanics of knockout are a little hard to grasp, especially all the bindingHandler parameters. Too confusing.

cervengoc commented 8 years ago

Hello guys and first of all Happy new year :)

I had some time to work on an experimental implementation. Please check it out at my fork: https://github.com/cervengoc/knockout-fast-foreach

I'm chunking the node insertions at the "last" moment to keep the internal state flow untouched. With this delayed insertion I also had to add a check when removing nodes to ensure that the node is attached to the DOM at that moment.

I modified the afterAdd logic so that it runs for every chunk. I've prepared only one unit test so far because this behavior is somewhat hard to unit test.

Looking forward to hearing the test results and your suggestions.
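The attached-check described above could be as simple as this (a sketch, not the fork's exact code): with deferred insertion, a node queued for removal may never have reached the DOM, so guard on parentNode before detaching.

```javascript
// Remove a node only if it was actually attached to the DOM.
function safeRemoveNode(node) {
  if (node.parentNode) {
    node.parentNode.removeChild(node);
    return true;  // node was attached and has been removed
  }
  return false;   // node was never inserted; nothing to detach
}
```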

brianmhunt commented 8 years ago

Thanks @krnlde – yes, the mechanics are not clear. Hopefully it's something we'll improve in the not-too-distant future.

@cervengoc Your approach is what I was thinking, too.

The test you have is good. The two other scenarios we probably ought to test for are:

  1. Adding has started, and new items are added before all the existing items are inserted into the DOM; and
  2. Adding has started, but the items being added are deleted before being inserted into the DOM.

It's not trivial to test this stuff.

Feel free to start a PR anytime (or just create a branch off this repo, which you should now be able to do as a collaborator).

Cheers

cervengoc commented 8 years ago

Yes, I was thinking about your second test idea, but need some time to figure out how to cover these in the testing environment.

In the meantime @krnlde could you please test my version? Now it has a chunk size of 100 built in, in the final version it will be configurable.

krnlde commented 8 years ago

At a first glance your implementation looks exactly like what I need. I'll drop it in now and test with my SVG what happens.

krnlde commented 8 years ago

Superfast! Just for the record: We are now drawing 12k dots in an SVG map without even a single jank in Firefox and Chrome.

In IE11 and Edge you can watch the browser drawing, but at least you can scroll and the browser responds during that process (with a little jank).

(screenshot of the rendered SVG map) Wheeeee...

cervengoc commented 8 years ago

I'm glad to hear the news :) So now I'll continue the tests and hope nothing has broken.

krnlde commented 8 years ago

I've created a profile for Edge during the drawing of the dots, this is what it looks like:

(screenshot of the Edge profiler; sorry for the German labeling)

The anonymous function callback I focused on is exactly this line: FastForEach.animateFrame.call(window, function () { self.processQueue(); });

I'm not sure why Edge still janks. Maybe it is simply too slow to even draw 100 dots in a loop. But it's funny to see it being built up.

cervengoc commented 8 years ago

The processQueue method shouldn't be called so many times. Actually it should be called only once if you add the items all together (like using the valueWillMutate + ko.utils.arrayPushAll technique). Are you sure it's this anonymous method and not the other one which calls the rendering of the next chunk?

krnlde commented 8 years ago

I'm sure the first anonymous function is FastForEach.animateFrame.call(window, function () { self.processQueue(); });. I'm not 100% sure about the following functions, but I'll test that again as soon as I have my environment up and running again.

cervengoc commented 8 years ago

Yes, the first should be that, and the subsequents should be the chunked rendering. I'll make the chunk size configurable later today, so you can test IE with a smaller number if you wish.