Open giuppe opened 11 months ago
First thing I notice: the bunnies in the first image are rotated slightly, whereas they are all upright in the second. This might be preventing batch drawing.
Also, how are you rendering the OpenFL standalone test? Edit: Oh, I see. I forgot OpenFL had a BunnyMark demo.
Edit 2: The other difference I see is the UI overlay in Flixel. Does hiding that improve performance? I doubt it will.
the bunnies in the first image are rotated slightly
Yes, it's the initial angle; I forgot to disable it. Disabling it seems to make a small difference (+1000 bunnies):
Oh, i see. i forgot openfl had a bunnymark demo
Exactly, I just ran openfl create BunnyMark and then openfl test windows.
Without the UI we gain another 1000 bunnies:
(I also used the openfl.display.FPS object to avoid using FlxText.)
Another 1000 bunnies by disabling the background:
In any case, even with these changes, the fps counter goes under 60 at 25000 bunnies.
yeah the difference is still massive, and it's worth doing a deep dive into this, thanks for checking those loose ends though!
Is the OpenFL test using Bitmap instances? Are they both ending up with some kind of GL batch draw? I think Flixel is rendering sprites to a Graphics buffer; I've always wondered if that could be omitted.
OpenFL's BunnyMark does the following (unnecessary code removed):
public function new () {
    // ... other initializations ...
    var bitmapData = Assets.getBitmapData ("assets/wabbit_alpha.png");
    tileset = new Tileset (bitmapData);
    tileset.addRect (bitmapData.rect);
    // ...
    indices = new Vector<Int> ();
    transforms = new Vector<Float> ();
}

private function addBunny ():Void {
    var bunny = new Bunny ();
    bunny.x = 0;
    bunny.y = 0;
    bunny.speedX = Math.random () * 5;
    bunny.speedY = (Math.random () * 5) - 2.5;
    bunnies.push (bunny);
    indices.push (bunny.id);
    // Reserve one (x, y) pair per bunny; it is filled in every frame.
    transforms.push (0);
    transforms.push (0);
}

private function stage_onEnterFrame (event:Event):Void {
    for (i in 0...bunnies.length) {
        // ... recalculates bunnies position ...
        transforms[i * 2] = bunny.x;
        transforms[i * 2 + 1] = bunny.y;
    }
    graphics.clear ();
    // White background.
    graphics.beginFill (0xFFFFFF);
    graphics.drawRect (0, 0, stage.stageWidth, stage.stageHeight);
    // One drawQuads call for all the bunnies.
    graphics.beginBitmapFill (tileset.bitmapData, null, false);
    graphics.drawQuads (tileset.rectData, indices, transforms);
}
So it's doing a single drawQuads call per frame, with all the bunnies. It can do this because all the bunnies share the same bitmapData.
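To illustrate why sharing one bitmapData matters, here is a minimal sketch in plain Haxe (no OpenFL involved). It assumes a simplified batching rule in which only consecutive sprites using the same texture can be merged into one draw call. The class name BatchCount, the function countDrawCalls, and the integer texture ids are hypothetical stand-ins, not part of OpenFL or Flixel.

```haxe
// Hypothetical model: each sprite is reduced to the id of the texture it uses.
class BatchCount {
    // Counts the draw calls needed when only consecutive sprites that share
    // a texture can be merged into a single call (an assumed batching rule).
    public static function countDrawCalls(textureIds:Array<Int>):Int {
        var calls = 0;
        var current = -1; // -1 means "no batch open yet"; ids are assumed >= 0
        for (id in textureIds) {
            if (id != current) {
                // Texture changed: the running batch ends and a new call starts.
                calls++;
                current = id;
            }
        }
        return calls;
    }

    static function main() {
        var sameTexture = [for (i in 0...1000) 0];      // every sprite uses texture 0
        var alternating = [for (i in 0...1000) i % 2];  // textures 0, 1, 0, 1, ...
        trace(countDrawCalls(sameTexture));
        trace(countDrawCalls(alternating));
    }
}
```

Under this rule, 1000 sprites with one shared texture collapse into a single call, while 1000 sprites alternating between two textures need 1000 calls.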
I wondered whether this is what gives it those insane numbers, so I moved the drawQuads call inside the bunnies loop (closer to having a different bitmapData for each sprite):
private function stage_onEnterFrame (event:Event):Void {
    graphics.clear ();
    graphics.beginFill (0xFFFFFF);
    for (i in 0...bunnies.length) {
        // Fresh single-quad vectors for every bunny.
        var transforms = new Vector<Float> ();
        var indices = new Vector<Int> ();
        // ... recalculates bunnies position ...
        transforms.push (bunny.x);
        transforms.push (bunny.y);
        indices.push (0);
        graphics.drawRect (bunny.x, bunny.y, tileset.rectData[2], tileset.rectData[3]);
        // One fill and one drawQuads call per bunny.
        graphics.beginBitmapFill (tileset.bitmapData, null, false);
        graphics.drawQuads (tileset.rectData, indices, transforms);
    }
}
AFAIK this is closer to how Flixel's FlxCamera.render() works: each item issues its own drawQuads call.
and the result is:
If I'm understanding this correctly (and barring errors in my implementation), this would mean that drawQuads is itself relatively slow and Flixel is already optimizing it a lot...
OR shaderFill is faster than bitmapFill. I'll try.
I changed bitmapFill to shaderFill in OpenFL's BunnyMark and it almost doubles the number of bunnies:
But that is still far from FlxBunnyMark's 25k bunnies.
Seems related to https://github.com/HaxeFlixel/flixel/issues/3005
I ran both OpenFL's BunnyMark and Flixel's FlxBunnyMark on the same machine (Windows, HXCPP, release build), adding bunnies until the fps dropped under 60. For FlxBunnyMark, the options were: No Collisions / No shaders / Step: Variable / On-Screen. I also disabled the angularVelocity because the bunnies in the OpenFL version are not rotating.
Result: 27k bunnies for Flixel+OpenFL (at 52fps) vs. 240k bunnies for OpenFL (at 54fps).
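To put those two results on a common scale, here is a rough back-of-the-envelope sketch in plain Haxe. It assumes the whole frame time is spent on the bunnies, which overstates the per-bunny cost somewhat. The class name Throughput and its two helper functions are hypothetical; only the bunny counts and fps values come from the measurements above.

```haxe
// Hypothetical helper for comparing the two benchmark results.
class Throughput {
    // Sprites processed per second = sprites on screen * frames per second.
    public static function spritesPerSecond(sprites:Int, fps:Float):Float {
        return sprites * fps;
    }

    // Average frame time per sprite in microseconds, assuming the entire
    // frame is spent on sprites (an upper bound on the real per-sprite cost).
    public static function microsPerSprite(sprites:Int, fps:Float):Float {
        return 1000000.0 / (sprites * fps);
    }

    static function main() {
        var flixel = spritesPerSecond(27000, 52);   // Flixel+OpenFL result
        var openfl = spritesPerSecond(240000, 54);  // plain OpenFL result
        trace(flixel);
        trace(openfl);
        trace(openfl / flixel);                     // throughput ratio
        trace(microsPerSprite(27000, 52));
        trace(microsPerSprite(240000, 54));
    }
}
```

That works out to about 1.4 million bunnies per second for Flixel+OpenFL against about 13 million for plain OpenFL, a ratio of roughly 9x, or about 0.71 µs versus 0.08 µs per bunny.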
Please forgive me if there is something obvious that I'm missing: both benchmarks are "official", I'm using the default configuration for each project, so I'm guessing they are doing the best they can to show the correct numbers. OpenFL's BunnyMark is using drawQuads, too.
Using the Visual Studio profiler, there seem to be no discernible bottlenecks (as in: no obvious inefficiencies in Flixel that are eating CPU time); the end result is just that it takes a lot more time to render.
Is Flixel unwittingly creating more work than it should for the rendering pipeline? Or is OpenFL missing some optimizations that overwhelmingly affect Flixel?
I know that there is more than raw performance to appreciate, but still: it's the same rendering engine, the numbers shouldn't be so different.
How can we approach the issue to gather more data, maybe find the root cause? Does anyone have any pointers?