HaxeFlixel / flixel

Free, cross-platform 2D game engine powered by Haxe and OpenFL
https://haxeflixel.com/
MIT License

Optimizations!!! #61

Closed Beeblerox closed 10 years ago

Beeblerox commented 12 years ago

I need to try to optimize this port.

sonygod commented 12 years ago

will you port this game to Flixel? http://wonderfl.net/c/y5Cx/ haha

Beeblerox commented 12 years ago

Are you kidding me?

Beeblerox commented 12 years ago

@elsassph recommended me: "Don't set tiles RGB if it's not really necessary for tinting. I don't think you're using .splice efficiently: you should reuse arrays and .splice out the extra items (if any). The way you say you're using it means you're reallocating the array on each iteration. Try removing scrollRects too."
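A minimal sketch of the array-reuse advice quoted above, in TypeScript for illustration rather than Haxe. The names `drawData` and `fillDrawData` are hypothetical and this is not Flixel's rendering code; it only shows one array being refilled each frame, with `.splice` used to drop leftover items.

```typescript
// Hypothetical per-frame draw list; not Flixel's actual rendering code.
// One array is kept alive across frames instead of allocating a new one each time.
const drawData: number[] = [];

function fillDrawData(points: Array<{ x: number; y: number }>): number[] {
  let n = 0;
  for (const p of points) {
    // Overwrite existing slots in place; the array only grows when it has to.
    drawData[n++] = p.x;
    drawData[n++] = p.y;
  }
  // Splice out leftovers from a longer previous frame.
  if (drawData.length > n) {
    drawData.splice(n, drawData.length - n);
  }
  return drawData;
}
```

Calling `fillDrawData` with three points and then with one point returns the same array instance both times, trimmed to two values on the second call.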

Beeblerox commented 12 years ago

The recommendations from the previous comment didn't help, but they will be in the next commits.

elsassph commented 12 years ago

Finally found the reason: when compiling from Xcode you get a "debug", considerably slower, version of your haxe-compiled code.

The Flixel BunnyMark actually runs at 60fps when you select "Archive" compilation in Xcode (ie. IPA).

Still, overall Flixel is over-architected and has many slow paths - profiling with Instruments' "Time Profiler" lets you see the most expensive methods.

Beeblerox commented 12 years ago

@elsassph thank you for this info. Will try it, but I'm using Windows.

elsassph commented 12 years ago

There are profilers on Windows too.

Either use Visual Studio Ultimate, or Google brings up: http://www.codersnotes.com/sleepy and http://www.softwareverify.com/

Beeblerox commented 12 years ago

@elsassph Oh, THANK YOU!!! Didn't use any profiler earlier. Will try that

elsassph commented 12 years ago

I confirm that Sleepy works nicely for timing the CPU - however, it doesn't give much insight into how wasteful HaxeFlixel is regarding garbage collection.

Beeblerox commented 11 years ago

Need to work with the hxcpp built-in debugger: http://gamehaxe.com/2012/09/14/hxcpp-built-in-debugging/

Beeblerox commented 11 years ago

crazysam on NME forum wrote: "With that said, I really think Flixel is a very bloated engine, and as it stands, not very good for mobile. Optimizing the rendering to use drawTiles() was a huge step in the right direction, but the underlying update and Camera structures are very slow (calling preUpdate, update and postUpdate for every FlxBasic is wasteful since they're mostly used to update animation, and not every object needs to be animated). I'm interested in this engine (I love the FlxGroup and the recycle paradigm), so I will continue to use it and improve it, and I hope in some months it will be the best option for the small developer that wants to hit the ground running."

gamedevsam commented 11 years ago

I realize my statement on the forums wasn't very helpful. I'm very interested in getting into the HTML5 market, so I'll be running perf analysis and trying to identify bottlenecks in performance. It's likely I will be heavily modifying the core of Flixel, so I don't know how much Zaphod will want to keep and integrate into the main branch. I'll post updates when I achieve something notable.

Beeblerox commented 11 years ago

@crazysam I think that html5 could be fast, and this target really needs "heavy modifying": html5 needs a whole new renderer based on sprites instead of blitting (I believe so), but that would make multicamera support almost impossible.

gamedevsam commented 11 years ago

Multicamera support is likely one of the first things to be gutted. It doesn't make much sense for the html5 target, or for mobile platforms for that matter (I can see a potential application for tablet based games, but we can refactor it out, and leave it as an optional bonus instead of a constant perf concern).

gamedevsam commented 11 years ago

@Beeblerox I got HXCPP profiler working on my code. It's actually really nice! I can show you what I did.

Gotta figure out how to get a profile from within Android. It doesn't seem to support full paths, which makes it hard to get the profile off the phone.

http://gamehaxe.com/2012/09/14/hxcpp-built-in-debugging/comment-page-1/#comment-2749

Beeblerox commented 11 years ago

@crazysam I'm very interested in your results, and I think that profiling on PC (or Mac) should give us useful information about the primary directions for optimization.

gamedevsam commented 11 years ago

Sorry I kinda forgot about this thread. Here's a simple profile of my project: http://pastie.org/5100846

FlxGame::onEnterFrame is what we care most about, and looking in there, we can see that 25% of the time is spent in FlxGame::updateSoundTray. This is very strange; it seems that getting the member data of a DisplayObject is very expensive in NME. We should bring this up in the Haxe Google group. I committed a simple fix to avoid updating the sound tray unless we have to.

Looking at FlxGame::step, we can see that we're spending a lot of time updating the mouse, as well as the JoystickManager. We might want to make the JoystickManager a plugin. As for the mouse... I committed a change that avoids updating the DisplayObject container unless it's visible, but we should disable it for mobile targets altogether.

I also found that Lib.getTimer was a very expensive operation in onEnterFrame; I committed an optimization for that issue.
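The commits themselves are not shown in this thread, so the following is only a sketch of the two general ideas mentioned (skip work for UI that is not showing, and read the clock once per frame), in TypeScript rather than Haxe. `SoundTray`, `Game` and `readClock` are hypothetical stand-ins, not Flixel or NME API.

```typescript
// Hypothetical stand-ins; none of these names are Flixel or NME API.
class SoundTray {
  visible = false;
  updates = 0;
  lastElapsedMs = 0;

  update(elapsedMs: number): void {
    this.updates++;
    this.lastElapsedMs = elapsedMs; // a real tray would advance its fade-out here
  }
}

class Game {
  tray = new SoundTray();
  clockReads = 0;
  private lastMs = 0;

  // Stand-in for a costly timer call such as Lib.getTimer().
  private readClock(): number {
    this.clockReads++;
    return Date.now();
  }

  onEnterFrame(): void {
    // Read the clock once and reuse the value for the rest of the frame.
    const now = this.readClock();
    const elapsedMs = now - this.lastMs;
    this.lastMs = now;

    // Skip the tray entirely unless it is actually on screen.
    if (this.tray.visible) {
      this.tray.update(elapsedMs);
    }
  }
}
```

With the tray hidden, `onEnterFrame` costs one clock read and no tray work; the tray is only updated on frames where `visible` is true.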

gamedevsam commented 11 years ago

Here's a profile with my optimizations: http://pastie.org/5101182

It's harder to find good spots to optimize now. Input.update() seems to be a pretty expensive operation, but it probably wouldn't be trivial or worth it to make it more efficient. Cheers!

impaler commented 11 years ago

Nice work crazysam. Yes, I don't think it's best practice for HaxeFlixel to use all kinds of inputs even though the game might not require them.

As for moving JoystickManager to a plugin, it makes sense, but if we decide that, it may be worth thinking about doing the same for Mouse, Touch etc. The biggest issue there might be protecting legacy code. So we could do a static var in FlxG with an override with get/set, like https://github.com/Beeblerox/HaxeFlixel/blob/FlxLayer/src/org/flixel/FlxTimer.hx#L178 does with TimerManager.

If we do this for all inputs, it will initialize them only if the game specifies FlxG.joystickManager = new JoystickManager(); or likewise for the Keyboard etc. (something we would have to remind people about when porting games).
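A rough sketch of the get/set idea, in TypeScript rather than Haxe. The class `G` and this `JoystickManager` are hypothetical stand-ins for illustration, not the real FlxG or Flixel classes. The manager is created either on first access or by an explicit assignment, and the engine step only polls it if it exists.

```typescript
// Hypothetical names modelled on the comment above; not the real FlxG API.
class JoystickManager {
  static created = 0;
  polls = 0;

  constructor() {
    JoystickManager.created++;
  }

  update(): void {
    this.polls++; // a real manager would poll devices here
  }
}

class G {
  private static _joystickManager: JoystickManager | null = null;

  // Created on first access, so games that never touch it pay nothing.
  static get joystickManager(): JoystickManager {
    if (G._joystickManager === null) {
      G._joystickManager = new JoystickManager();
    }
    return G._joystickManager;
  }

  // A game can also install its own instance explicitly.
  static set joystickManager(manager: JoystickManager) {
    G._joystickManager = manager;
  }

  // The engine's step only updates a manager that has been created.
  static step(): void {
    if (G._joystickManager !== null) {
      G._joystickManager.update();
    }
  }
}
```

Legacy code that reads `G.joystickManager` keeps working because the getter creates the instance on demand, while games that never use a joystick never construct or poll one.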

gamedevsam commented 11 years ago

Developing this engine for mobile with the intention of maintaining 100% backwards compatibility seems silly. I'm hoping HaxeFlixel will be an evolution of Flixel capable of targeting as many platforms as possible (with much better performance than regular Flixel), not just a simple port of Flixel to run on mobile phones. With that said, I think your suggestion of using properties to get input plugins is a very good way to handle this issue.

For now my project is very simple and the optimizations I could identify are only the parts of the engine that are slow even when nothing particularly heavy is happening. When I have dozens of sprites moving about and animating I will likely have to be more aggressive in optimizations.

Right now I don't have time to optimize Flixel input handling, so maybe Zaphod will want to look at making input Plugin based after he's done with his Texture Atlas project.

impaler commented 11 years ago

I don't disagree, sam. I also think that changes to the API are ok if we have good reasons for them; we would just have to explain this to game makers. The goal must also be to take flixel where AS3 couldn't, and yes, mobile is just one part.

I'll look at separating the input stuff after the next version. It may be more sensible to also replicate the plugin system specifically for input, so we can make sure it gets updated at precisely the right point in the stack.

Beeblerox commented 11 years ago

This weekend I made some comparisons between haxepunk and haxeflixel. Matt has done great work optimizing haxepunk: http://forum.haxepunk.com/index.php?topic=299.msg782#msg782 and I was interested to see the results. I used bunnymark as a test for both engines, and here is what I got on my PC:

Bunnies     HaxePunk fps        HaxeFlixel fps
15k         59                  47
20k         46                  37
25k         38                  29
30k         32                  25

On mobiles the results are similar. As you can see, HaxePunk has become much faster. I was curious why HaxeFlixel is so slow. After some digging I found that method calls (even to empty functions) are pretty expensive, and HaxeFlixel has three "update" methods (preUpdate(), update() and postUpdate()) while HaxePunk has only one. I tried removing these method calls and moving their code into one method, and then I saw almost the same fps as in the HaxePunk version of BunnyMark. So I need to think about some refactoring or a change in the engine's architecture (maybe merge these methods and make some of them switchable).
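To make the call-count difference concrete, here is a small sketch in TypeScript (not Haxe). The sprite classes and their method bodies are hypothetical placeholders rather than Flixel's code. It only counts calls per frame and does not measure time.

```typescript
// Hypothetical sprites; the method bodies are placeholders, not Flixel's code.
class ThreePhaseSprite {
  x = 0;
  last = 0;
  velocity = 2;

  preUpdate(): void { this.last = this.x; }        // remember the previous position
  update(): void {}                                // empty unless a subclass overrides it
  postUpdate(): void { this.x += this.velocity; }  // apply motion
}

class MergedSprite {
  x = 0;
  last = 0;
  velocity = 2;

  // Same work as the three methods above, folded into one call.
  update(): void {
    this.last = this.x;
    this.x += this.velocity;
  }
}

// Each function runs one frame and returns how many method calls it made.
function stepThreePhase(sprites: ThreePhaseSprite[]): number {
  let calls = 0;
  for (const s of sprites) {
    s.preUpdate();
    s.update();
    s.postUpdate();
    calls += 3;
  }
  return calls;
}

function stepMerged(sprites: MergedSprite[]): number {
  let calls = 0;
  for (const s of sprites) {
    s.update();
    calls += 1;
  }
  return calls;
}
```

For 20,000 sprites this is 60,000 method calls per frame in the three-phase version against 20,000 in the merged one, with the same resulting positions.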

elsassph commented 11 years ago

These update functions definitely are the bottleneck - also don't forget that, to call these functions, the engine has to iterate over all the entities 3 times. So yes, this should be refactored, probably using a more explicit event registration mechanism.
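One possible reading of "explicit event registration", sketched in TypeScript rather than Haxe: objects that need per-frame work register a listener, and the engine only visits those. `UpdateRegistry` is a hypothetical name and an assumption about what such a mechanism might look like, not an existing Flixel feature.

```typescript
// Hypothetical registry; an interpretation of "explicit event registration",
// not an existing Flixel mechanism.
type UpdateListener = () => void;

class UpdateRegistry {
  private listeners: UpdateListener[] = [];

  register(listener: UpdateListener): void {
    this.listeners.push(listener);
  }

  unregister(listener: UpdateListener): void {
    const i = this.listeners.indexOf(listener);
    if (i >= 0) {
      this.listeners.splice(i, 1);
    }
  }

  // One pass per frame, touching only objects that asked to be updated.
  // Returns the number of listeners visited.
  dispatch(): number {
    for (const listener of this.listeners) {
      listener();
    }
    return this.listeners.length;
  }
}
```

Static objects that never register cost nothing per frame. This sketch does not handle listeners that unregister themselves during `dispatch`, which a real implementation would need to consider.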

tiagolr commented 11 years ago

aren't preUpdate() and postUpdate() called on the same update iteration? like:

FlxSprite.preUpdate()
FlxSprite.update()
FlxSprite.postUpdate()

If so they shouldn't cause that much burden, and anyway they may well be removed o.O

impaler commented 11 years ago

@ProG4mr I am not sure how that works either. Maybe because we can't make them inline and override at the same time?

Interesting: I hacked together a raw merge of the update functions and saw 20k bunnies go from 45 up to 52 fps, on Windows and at the same resolution/scale as the HaxePunk version.

https://github.com/impaler/HaxeFlixel/commit/3e475166b0b0a81410c7598eb99afb5b416cecc1

It means every overridden update needs to call super, the order of the update steps can't be controlled in the same way, and legacy flixel code may have broken. The Mode demo seems to work, however.

Of course I imagine there must be some differences elsewhere between the engines, eg their core motion code and extra things flixel does, HaxePunk seems a lot more "barebones".

tiagolr commented 11 years ago

Maybe it has to do with inlining. From what I know, the pre and post update are just 2 extra calls per update; they don't force more iterations (that's why they are supposedly badly implemented), so it's strange that it hurts performance so much. Maybe I am wrong, I haven't looked at the code or made any tests so far, I'm just like a sports commentator ^^.

gamedevsam commented 11 years ago

When you iterate over thousands of objects every frame, every additional function call on each object will have an impact on performance.

This seems like a good change: calling preUpdate and postUpdate is kind of redundant, since they happen immediately before and after update (users could just implement their own preUpdate and call it before super.update(), and do their postUpdate work at the end of their update function).

Inlining these functions isn't an option since users are supposed to override them to add custom behaviors, and in haxe you can't override inlined functions.
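The pattern described above, sketched in TypeScript rather than Haxe. `Basic` and `Player` are hypothetical classes; `Basic` stands in for an engine base class that exposes only a single `update()`.

```typescript
// Hypothetical classes; Basic stands in for an engine base class
// with a single overridable update().
class Basic {
  trace: string[] = [];

  update(): void {
    this.trace.push("engine update");
  }
}

class Player extends Basic {
  update(): void {
    this.beforeUpdate(); // what preUpdate() used to do
    super.update();      // forgetting this call skips the engine's own work
    this.afterUpdate();  // what postUpdate() used to do
  }

  private beforeUpdate(): void {
    this.trace.push("pre");
  }

  private afterUpdate(): void {
    this.trace.push("post");
  }
}
```

The engine makes one virtual call per object per frame, and the ordering of the pre/post work is decided inside each subclass, which also means every override has to remember to call `super.update()`.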

impaler commented 11 years ago

Thanks sam, yes, it's the context of iteration that elsassph mentioned that confused me. It's the cost of the 3 function calls themselves. I think my hack shows it. Sports commentators in programming? We have a bigger problem :)

tiagolr commented 11 years ago

It would be very simple to remove both functions; they are not useful anyway. For any sports commentator that would be wise.

Anyway, there are probably 3 options:

1 - Remove the functions

  • performance gained
  • no effort
  • some ppl may complain
  • hxflixel "apparently" loses features

2 - Disguise the problem: eg: change preUpdate and postUpdate to function callbacks, and verify if they are null before calling them.

  • some performance gained (verifying instead of calling function)
  • little effort
  • ppl may complain a bit
  • hxFlixel stays with its "not so useful" pre and post updates

3 - Correct pre and post update:

  • using conditional comp. performance may be gained.
  • ppl may be happy
  • hxFlixel gains true pre and post updates.
  • quite some effort
  • may or may not be worth it?

These are presented to you by: Code Commentator.

elsassph commented 11 years ago

That's strange - if the engine only iterates once and calls all 3 methods then it's kind of useless. I suggest just leaving one update method for now.

(In reply to TiagoLr's comment of 12 March 2013, 00:35.)
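Option 2 from TiagoLr's list (callbacks that are checked for null before being called) might look roughly like the sketch below, in TypeScript rather than Haxe. `CallbackSprite` and its fields are hypothetical names for illustration, not Flixel API.

```typescript
// Hypothetical sprite; the callback fields are illustrative, not Flixel API.
class CallbackSprite {
  onPreUpdate: (() => void) | null = null;
  onPostUpdate: (() => void) | null = null;
  updates = 0;

  update(): void {
    this.updates++;
  }

  // A null check replaces an unconditional (often empty) method call.
  step(): void {
    if (this.onPreUpdate !== null) {
      this.onPreUpdate();
    }
    this.update();
    if (this.onPostUpdate !== null) {
      this.onPostUpdate();
    }
  }
}
```

Objects that leave both callbacks as null make one method call per step plus two cheap comparisons; whether that is measurably faster than empty virtual calls on a given target would need profiling.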

Beeblerox commented 11 years ago

@elsassph I also choose this option.

gamedevsam commented 11 years ago

Seems like impaler already made a commit for this fix. Could just create a pull request https://github.com/impaler/HaxeFlixel/commit/3e475166b0b0a81410c7598eb99afb5b416cecc1.

impaler commented 11 years ago

That was a very rough hack. Only saw the update for the particles, groups etc to see mode demo work. If Beeblerox thinks this is the best way I'll clean it up and continue with this raw merge of update functions.

It does seem more like these three functions only served as a "cosmetic" API for update after all.

gamedevsam commented 11 years ago

I'm completely willing to sacrifice "cosmetic niceties" for performance. This might be the first big change that breaks backwards compatibility with vanilla flixel projects. Maybe it's a good idea to update haxelib to v1.09 and save this change for v1.10.

sergey-miryanov commented 11 years ago

Good idea about v1.09 +1

Beeblerox commented 11 years ago

So you are proposing to release current version on haxelib and then continue the work on optimizations, bug fixes and new features?

sergey-miryanov commented 11 years ago

Yes, I propose to fix and freeze current code base, fix bugs and make new release. After it continue to work on optimizations and new features in dev and fix bugs on v1.09 (and dev).

impaler commented 11 years ago

I think there has been more than enough work on 1.09 for a release; it seems it solves a lot of bug reports from the forum. I suggest we release asap and make it the last release for haxe2, so we can do the next 1.10 or do a 2.x for haxe3 compliance and whatever else we work on.

Beeblerox commented 11 years ago

What bugs do you want fixed before the 1.09 release?

sergey-miryanov commented 11 years ago

I mean if you have any bugs in your todo that you want to fix before release.

Beeblerox commented 11 years ago

There are too many items in my todo list, and some of them will require a lot of work (I expect so) :( But I'll release the current version anyway. Bug-fixing will continue after it.