Closed parasyte closed 11 years ago
Even with the current patch, using me.sys.preRender
is still way faster than me.sys.dirtyRegion
. Kind of disappointing.
I believe it has to do with the number of rects + number of objects that are marked "dirty" though -- it's way too high, and drawManager.draw()
has a time complexity of O(n) where n = dirtyRects.length + dirtyObjects.length, and dirtyObjects contains every object within the viewport.
A significant improvement will come from grouping rects, and then associating objects with each rect that intersects. no intersection means no object in dirtyObjects. This will reduce n quite a bit on average. Worst case will be same as current.
Currently the engine just merge the rect corresponding to the previous position, with the rect corresponding to the new position. I've been thinking as well about merging other rect, but though it would bring performances down (though I never gave it a try).
Else, for scrolling level there is I believe two way to implement it :
Second option ("good one") won't work for parallax scrolling layers. :( That's why I'm thinking about layer clipping (culling). I imagine the general case will have about 25-50% of a parallax layer completely obscured by some layer that draws over top of it. In this case, it's not worth drawing the hidden parts at all. And with multiple parallax layers, the time savings from culling should really add up nicely.
Following my previous post on the topic, I was actually wondering if clipping could not just be simply achieved using the context global composing option. This would need to reverse the draw order, but with an option like "destination-over", it shoud work? I would think that the browser then actually do some clipping? This deservers some benchmarks :)
Else did u see the clip() function as well? I never tried it myself.
I experimented with the compositing options in my game, actually, to do the drop-shadow text in the coin counter. I found that browsers don't always agree about how certain operations should be performed. So a lot of my tests would be broken in Chrome, but working in Firefox.
However, it's still worth exploring because there is a standard that specifies exactly how each operation works, and what the results will look like.
Using the composite operation for many tasks will require creating mask shapes, so you'll need at least one hidden canvas context to work with for creating masked shapes.
clip() seems only useable with paths (vector shapes) so I don't know how well it would work on pixels. It might form visible gaps between layers as the colors at the edges get antialiased and mixed separately.
yep, except from source-over, destination-over, and a couple of other ones, the rest of them are not really supported, specially for the shadow stuff :)
else, what do you have in mind to implement clipping ? because to be honest, I also had this as a potential improvement, but I never managed to come out with a nice idea to implement it.
The only idea I had was to do some post-processing once the map is loaded to create additional layer with clipping information, but I always found this idea quite "heavy".
Other issue I found with this was when a layer was using opacity, and in theory this is even dynamically modifiable with the new renderer.
The only info needed is true or false; whether lower layers can be seen behind the tile or not. When opacity < 1, lower layers will always be seen behind the tile (so disable clipping).
I envision it will require n-1 arrays, where n is the number of layers. And each lower layer "inherits" the 'true' elements in all higher layers. In other words, each clipping map acts as a large binary number, and te binary numbers are OR'd together as the z-level lowers.
Trying to come up with a sane method of making this "clipping" work, I discovered that the current canvas spec on WHATWG supports "Path primitives", e.g. it's possible to create path objects that can be saved into a variable and reused later! According to this post, it was added around May of this year: http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2012-March/035239.html so user agent support is probably very sparse at the moment.
This could be the perfect use case: a Path object is created for each layer. Rectangles are added to this Path for every tile that can be "seen through". This path is then given to the layer below as its clipPath
. When it's time to draw the layer, context.clip(layer.clipPath)
is called to set the clipping region to the layer's clipping path. The only parts of the layer that will be drawn are the parts with "transparent pixels" in the layer above.
I think it can also be "emulated" by generating a dynamic function that builds the path using the normal context.rect
method. That function is then used in place of the Path primitive. This idea will need some more work.
Also of note is the "Hit Regions" support, which may be useful as shortcuts for mouse and touch events.
Clipping is still secondary to fixing the rectangle merging. Here are some notes I wrote while I was initially working on the patch branch. I'll try to clarify the shorthand below.
_____ _____ _____
| | | | | |
| 1 _|_|_ 2 | | 3 |
|___|X| |_|___| |_____|
| 4 |
|_____|
_____________ _____
| |X| | | |
| |X|2 | | 3 |
| 1 |X|___| |_____|
| |
|_________|
_____________ _____
| | | |
| | | 3 |
| 1 | |_____|
| |
|_____________|
I believe this is the most efficient way to merge an assortment of rectangles with unique sizes and positions. The three graphs represent three different stages of merging four small rectangles (numbered 1 - 4). The X
indicates where an overlap is detected in two rectangles. This is because only two rectangles are compared at a time. The comparison steps taken are:
You might say the worst case time complexity on this is O(n**2)
. Practically, it will be a lot lower, because each intersection removes one rectangle from the remaining set. Also, any rectangles which have been processed once should never be checked again. In this example, Rect 1 is fully processed at the end of step 5: If there were other rectangles (assume they do not overlap, for simplicity) the next iteration would start with a comparison between Rect 3 and Rect 5, then Rect 3 and Rect 6. The final iteration would compare just Rect 5 to Rect 6.
So in reality the absolute worst case is in fact O((n**2-n)/2 )
! Our example with 4 rectangles gives a worst case of 6 operations, and with 6 rectangles: 15 operations. But because of the reductions early on, we actually only perform 5 and 8 operations, respectively.
After this rectangle merging stage, we are left with only two rectangles that need to be updated, and no overlaps (this is exactly what we want).
The next phase is adjusting how dirty rectangles are associated with dirty objects. The current method can be outlined as follows:
dirtyRects = [
{32, 32, 64, 64},
{140, 233, 16, 16},
{24, 455, 24, 16}
]
dirtyObjects = [
background,
level,
player,
coin_1,
coin_2,
coin_3,
coin_4,
coin_5,
coin_6,
coin_7,
foreground,
debug
]
O(n*m) .. n = dirtyRects.length, m = dirtyObjects.length = 3 * 12 = 36
The dirtyRects shorthand is a list of {x, y, w, h} numbers. The dirtyObjects only list an object's name as shorthand. In this case, "background" is a background image, and "level" and "foreground" are TMX tile layers, but all three can be considered the same thing: very large static objects. There's also a player character, 7 animated coins, and a debug object, which can be considered a HUD.
The game engine will iterate through all dirtyObjects for each dirtyRect. Which is how we end up with 36 operations down at the bottom with our O(n*m)
notation. In practice, there is only one dirtyRect: the viewport rectangle (full screen). But we don't want to draw large portions of the static background/foreground/HUD/everything on every frame. So we use smaller rectangles, and only update the portions of the screen that change (objects move position/animate, etc.) Then we merge all of the overlapping rectangles (see above) until we are left with a minimal set of rectangles to iterate. Now we are left in the state shown above in the dirtyRects and dirtyObjects arrays; Three rectangles (a minimal/non-overlapping set) and 12 objects.
The real killer here is that n*m
. We can fix that by associating our rectangles and objects directly with one another.
If we combine the two arrays into a sort of hash table, we might end up with a structure like this:
dirtyObjects = {
background : [
{32, 32, 64, 64},
{140, 233, 16, 16},
{24, 455, 24, 16}
],
level : [
{32, 32, 64, 64},
{140, 233, 16, 16},
{24, 455, 24, 16}
],
player : [
{24, 455, 24, 16}
],
coin_3 : [
{140, 233, 16, 16}
],
foreground : [
{32, 32, 64, 64},
{140, 233, 16, 16},
{24, 455, 24, 16}
],
debug : [
{32, 32, 64, 64}
]
}
O(n) .. n = sum(each dirtyObject) = 12
O(n)
is MUCH better!
Actually, I cheated just a bit. This dirtyObjects would also have coin_1, coin_2, ... coin_7. So add 6 more operations for a total of 18. It's still a lot better than 36!
This structure gives us each object, plus the rectangles that they fall within. The three "level" layers will intersect all dirty rectangles in the typical case. Which leads right in to the next phase:
The lowest (first drawn) of the three level layers "background" can very likely be covered by parts of the upper layers, especially where the platforms and interesting parts of the level are drawn. Likewise, the highest of the layers can very likely have large empty spaces where other parts of the background (and objects) can be drawn. For each of these cases, we do not want to waste time drawing: a) A piece of the level which will be covered up by a different piece of the level. b) A completely empty (transparent) piece of the level.
The rectangles at this phase will be fairly big, so we can't effectively do anything like breaking them into smaller rectangles (each the size of a map tile) and throw away the ones we don't want. That would probably negate any potential savings. Instead, we can use the canvas clipping magic to specify areas of each layer that we KNOW will not be covered. We will KNOW about these areas because they will be pre-computed when the tiles are added to the layer(?). Or after the stage has fully loaded, and recomputed when the tile maps are changed at runtime.
I have not worked out the best way of defining these clipping regions, and I don't know for certain if they will improve performance. The best way to test that is just with plain ol' benchmarks. If the benchmarks look promising, then it will be worth exploring the idea (clipping) more thoroughly.
This sums up the current line of thinking for the dirtyRegion updates:
I just ran a simple benchmark on Chrome that used context.drawImage on a 640x640 image in a tight loop with 500,000 iterations. I used three loops:
For each loop, I recorded the time (in ms) that each takes to complete. the first two loops were always fairly constant, around 640ms. The third loop was also fairly constant, but took around 630ms. So the performance gain from using just a clipping region, even a small one, is negligible in Chrome.
It then crashed my browser, which was spending a great deal of time trying to update the window server with all of those draws.
This is sad news, but kind of expected; even when only 1/100th of the image is visible through the clipping region, it doesn't have any measurable performance impact. Canvas context clipping is out!
The other way to do clipping will only be used when the layer/global preRender
flag is false
: Each tile in the layer needs to know whether it will be covered by tiles in any upper layer, and if so, skip the tile drawing. This will be easiest to implement at first with non-parallax layers, where the tile grids are always aligned. For parallax scrolling layers, some rectangle maths might need to be employed.
Merged master into the ticket-101 branch. I'm going to scale the requirements for this ticket down quite a bit to get it ready for release by the end of November. I'll tackle the scrolling issue here, and then break off the tough algorithmic changes into separate tickets.
@parasyte
same for this one : 0.9.6 ?
Totally broken now by #176. :)
lol :)
@parasyte though it's fixed level example, I tried the dirtyRect example provided with melonJS, and it's still working :)
however there is bug linked to the image Layer, but since I have not tested this one since all the changes made in the current release, i"m not sure it's actually linked to the today changes.
@obiot I'll move this one to Future. It's going to be more work than I can put into it for 0.9.7. Will focus on #103 instead (and take out #16, #50, and #169 in the process!)
@parasyte that's perfectly fine for me ! :) I will however see on my side to at least fix it for non scrolling level, so that we do not add any regression in the process.
@parasyte my god, today I had in mind to get rid of all this code where we add the viewport position to the give rect, as it would be a very nice optimization (since we do this everytime, everywhere), like here for example : https://github.com/obiot/melonJS/blob/master/src/level/TMXRenderer.js#L77
but changing it is like opening a pandora box, and between the different things that can change the current viewport, and the broken direct rect implementation, nothing is working... I basically just spent 3 hours on this thing, with no concrete results, I'm frustrated :) :) :)
@obiot Ooooh yes. It is quite a fragile nightmare. :(
@parasyte shall we actually kill the corresponding branch ? but keep the ticket of course
Deleted!
awesome, I love to clean stuff ;)
This will be really nice to have! Also I'd like to do a quick audit to ensure the dirty rects are being grouped properly. Specifically, any overlapping rectangles should be merged into a single rectangle. So that worst case will have about the same performance characteristics as
dirtyRegion = false
.I'll also experiment with layer clipping (culling pieces that are hidden behind higher-priority layers) ... I'm unsure if it will be faster to detect where to clip (it is faster in theory), or faster to just draw everything in order (as implemented now). Clipping detection should be done on a per-tile basis; tile id "0" should never clip anything below it. Also tiles with transparent pixels should never clip. It will be interesting to see how this will all work out with parallax layers.
A few additional thoughts, while in mind:
me.debug.renderDirty
flag which will be helpful. Also consider rendering tile grid lines and highlighting tiles that intersect during clipping detections.