CreateJS / EaselJS

The Easel Javascript library provides a full, hierarchical display list, a core interaction model, and helper classes to make working with the HTML5 Canvas element much easier.
http://createjs.com/
MIT License

Improvement proposal for StageGL: avoid copying whole array on bufferSubData #984

Open Nestorferrando opened 5 years ago

Nestorferrando commented 5 years ago

Hi, I noticed that when many small batches are rendered (for example, when I apply a lot of additive or shader effects), performance drops sharply because bufferSubData copies the whole arrays.

By creating shallow subarrays with the batch size, the performance improves for scenes with lots of graphics and effects such as additive blending.

This is how I implemented it in my fork:

https://github.com/Nestorferrando/EaselJS/pull/7/files
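
A minimal sketch of the subarray idea, not the code from the fork linked above. `Float32Array.prototype.subarray` returns a view onto the same memory, so handing that view to `bufferSubData` uploads only the used part of the batch and copies nothing on the JavaScript side. The helper name `uploadUsedPortion` and its parameters are assumptions made for illustration; `gl` stands for any WebGL rendering context.

```javascript
// Hypothetical helper: upload only the first `usedFloats` entries of a batch array.
function uploadUsedPortion(gl, glBuffer, batchData, usedFloats) {
  // subarray() creates a view that shares memory with batchData; no copy is made here.
  const view = batchData.subarray(0, usedFloats);

  gl.bindBuffer(gl.ARRAY_BUFFER, glBuffer);
  // Uploads exactly view.length floats, starting at byte offset 0 of the GL buffer.
  gl.bufferSubData(gl.ARRAY_BUFFER, 0, view);

  return view;
}
```

With a batch array sized for thousands of quads, a draw that only fills a few dozen floats would then upload just those floats instead of the full array.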

DavidHGillen commented 5 years ago

Creating shallower batches for high churn situations is definitely one of the secret performance boosts you can do. I've had thoughts about writing up blogs about advanced tips like that!

In terms of properly integrating a permanent acceleration into the library, though, the problem is avoiding accidentally creating massive amounts of memory churn when adding a system like that. It was the reason I defaulted to a large, flat, static buffer and had hoped to make it relatively easy to resize per project (but I haven't actually made or tested any APIs to do so yet).

With the addition of the new immediate drawing modes an idea is occurring to me, to have an immediate drawing buffer stored and sized separately. That way the immediate draws (most of the blend modes are 'immediate draws') can be updating and uploading a micro sized buffer rather than borrowing their full sized brother. That should allow for tiny buffer updates in the most common scenario.
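
A rough sketch of what a separately sized immediate buffer could look like, purely to illustrate the idea above. Nothing here is EaselJS code: `createVertexStores`, `pickStore`, and the `FLOATS_PER_QUAD` value are all hypothetical.

```javascript
// Illustrative value only; the real per-quad layout in StageGL may differ.
const FLOATS_PER_QUAD = 12;

// Allocate both arrays once, up front, so drawing never allocates (no memory churn).
function createVertexStores(maxBatchQuads, immediateQuads) {
  return {
    batch: new Float32Array(maxBatchQuads * FLOATS_PER_QUAD),
    immediate: new Float32Array(immediateQuads * FLOATS_PER_QUAD),
  };
}

// Immediate draws write to and upload from the small array; everything else
// keeps using the full-sized batch array.
function pickStore(stores, isImmediateDraw) {
  return isImmediateDraw ? stores.immediate : stores.batch;
}
```

Because both arrays are created once and reused, this keeps the static-buffer property described above while letting the common immediate case upload a much smaller array.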

I don't have a timeline for making those changes (easier buffer max adjustment / micro buffer) but thanks for making me re-evaluate my initial assumption now the library has grown :) I'll leave this open as a form of TODO.

Nestorferrando commented 5 years ago

Thanks for thinking about it. Just to picture how relevant the performance change is, let me show you our test case:

image

This is a slot game using StageGL. We have some self-made scissor masking and a lot of additive blending for all the glows.

With the default DEFAULT_MAX_BATCH_SIZE, the processing time is about 50 ms per frame.

image

With DEFAULT_MAX_BATCH_SIZE reduced to 1024, the processing time drops to about 20 ms.

image

By using subarrays we improve it a little bit more.

image

DavidHGillen commented 5 years ago

@Nestorferrando The latest push https://github.com/CreateJS/EaselJS/commit/10fefbdc8b7cb434e6a392a3b3f59e2ee1f14b66 contains a way to lower the maximum buffer size for a StageGL instance: new createjs.StageGL(canvas, {bufferSize: 200}).

To figure out what value to use, you can either guess and check or try out the WebGLInspector utility functions: createjs.WebGLInspector.replaceRenderBatchCall(stage, createjs.WebGLInspector.trackMaxBatchDraw); Use your app for a bit, being sure to test your high-use situations, then open up the console and type createjs.WebGLInspector.__lastHighest.

The batch size currently defaults to the max of 10920, but you can set it as low as 170 so try it out and let me know how that helps your performance.
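
To make the guess-and-check approach concrete, here is a small standalone sketch. It does not call EaselJS: the constants reuse the 170 and 10920 limits quoted above, but the names `clampBatchSize` and `createBatchTracker` are hypothetical, and the clamping of out-of-range values is an assumption rather than confirmed library behavior.

```javascript
// Limits quoted in this thread; the constant names are made up for this sketch.
const MIN_BATCH_SIZE = 170;
const MAX_BATCH_SIZE = 10920;

// Assumed behavior: out-of-range or invalid requests are pulled back into range.
function clampBatchSize(requested) {
  const n = Number(requested);
  if (!Number.isFinite(n)) return MAX_BATCH_SIZE; // fall back to the default (the max)
  return Math.min(MAX_BATCH_SIZE, Math.max(MIN_BATCH_SIZE, Math.floor(n)));
}

// Remembers the largest batch recorded so far, in the spirit of tracking a
// "last highest" value while exercising the app.
function createBatchTracker() {
  let highest = 0;
  return {
    record(batchLength) {
      if (batchLength > highest) highest = batchLength;
    },
    get highest() {
      return highest;
    },
  };
}
```

After exercising the heaviest scenes, the tracker's `highest` value can be passed through `clampBatchSize` to get a candidate size to try.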

I still need to make the separate buffer for immediate draws though so this isn't all done yet.

Nestorferrando commented 5 years ago

Hi! This is a great first step. The performance is much better now, not only because this avoids copying the whole array but also because it prevents another odd behavior we just found on some Android devices:

We just discovered that on some mobile devices, rendering a single big batch performs worse than rendering several smaller ones.

Some devices that show this behavior:

Samsung Galaxy J3, Samsung Galaxy S6.

I found a post on Stack Overflow from another person who also experienced this behavior:

https://stackoverflow.com/questions/16639931/why-is-webgl-render-speed-so-inconsistent

Lowering the default batch size to 170 also makes this problem disappear.

At this point, I think that the performance we are getting is quite good: we have about 30 fps on low-mid range phones.