clintbellanger / flare

Free Libre Action Roleplaying Engine
http://clintbellanger.net/rpg/
GNU General Public License v3.0

Performance & large maps #441

Closed dorkster closed 12 years ago

dorkster commented 12 years ago

On slower systems, large maps like Frontier Plains and (to a lesser extent) Ydrakka Pass are very laggy. Right now, the map renderer has to process every tile multiple times. I don't think there's much we can do about this without messing up rendering order. So perhaps we should consider splitting up larger maps into smaller ones.

From what I've tested, 64x64 seems like the best balance of size and performance (Frontier Plains is 256x256). Other benefits of smaller maps would be easier navigation for the player and perhaps even faster load times.

clintbellanger commented 12 years ago

There are several factors affecting map performance. We can address several of these separately.

Really, I find the Frontier Plains too big -- especially in the current game mechanic where death resets maps. I like 256x256 being a max size, feels okay for that. But when we're thinking about actual game design we can suggest that people consider keeping maps under 100x100.

There are tons of optimizations we could do; we haven't really prioritized optimizing any of the code. We can leave most of this optimization work for later though, when we know more of the core code isn't changing. Plus we can use tools to actually tell us which parts of the execution take longest, instead of guessing, and work from there.

dorkster commented 12 years ago

I like the idea of having optional tilesets without alpha data. Grasslands-based maps are the main area where I have performance problems (even on the smaller maps). But instead of having a separate mod, we should include it in the existing mod. Then we can:

clintbellanger commented 12 years ago

I like the idea of a texture quality option that is internal to a mod. We can look at adding this later (probably not in time for v0.16).

stefanbeller commented 12 years ago

I was just profiling flare, and it turns out that MapIso::render takes more than 40 percent of the CPU time spent running flare. The function map_to_screen takes 20 percent, and it is called very often from MapIso::render. For full profiling stats from oprofile, see http://dl.dropbox.com/u/6520164/flareprofiling.txt

So I guess the "trim by screen rect" comment would be number one when it comes to optimizing for speed.

clintbellanger commented 12 years ago

I expect MapIso::render to take the most time, so that's good.

I wonder why map_to_screen is so expensive. One reason is it's being called for every tile, not just those on screen. The optimized routine would actually only need to call it once (not even once per visible tile), as it's simple addition/subtraction to figure out where the next tile belongs.
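A minimal sketch of the "simple addition/subtraction" idea, assuming a plain 2:1 isometric projection. `tile_to_screen`, `project_block_incremental`, and the half-tile constants are hypothetical names and values for illustration; this is not flare's actual `map_to_screen`. The full projection is called once for the first tile, and every later tile position is derived by adding a fixed step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { int x; int y; };

// Assumed half-tile sizes in pixels; not taken from flare's source.
const int TILE_W_HALF = 32;
const int TILE_H_HALF = 16;

// Hypothetical full projection: tile (tx, ty) -> screen pixel, with tile (0, 0)
// drawn at `origin`.
Point tile_to_screen(int tx, int ty, Point origin) {
    Point p;
    p.x = origin.x + (tx - ty) * TILE_W_HALF;
    p.y = origin.y + (tx + ty) * TILE_H_HALF;
    return p;
}

// Screen positions for a w x h block of tiles starting at (x0, y0), in
// row-major order. The projection is evaluated once; the rest is addition.
std::vector<Point> project_block_incremental(int x0, int y0, int w, int h, Point origin) {
    assert(w >= 0 && h >= 0);
    std::vector<Point> out;
    out.reserve(static_cast<std::size_t>(w) * static_cast<std::size_t>(h));
    Point row_start = tile_to_screen(x0, y0, origin);
    for (int j = 0; j < h; ++j) {
        Point p = row_start;
        for (int i = 0; i < w; ++i) {
            out.push_back(p);
            // One step along tile x moves right and down on screen.
            p.x += TILE_W_HALF;
            p.y += TILE_H_HALF;
        }
        // One step along tile y moves left and down on screen.
        row_start.x -= TILE_W_HALF;
        row_start.y += TILE_H_HALF;
    }
    return out;
}
```

Under these assumptions the incremental positions match what the full projection would return for every tile, so the per-tile multiply (and any float cast) drops out of the inner loop.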

Secondarily, I'm not sure that cast to float inside map_to_screen does anything except cost more.

center_tile() could use bit shift instead of divide, if the compiler isn't already doing that.
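For illustration, here is a hedged sketch of the divide and shift variants side by side. The function names and the 64-units-per-tile value are assumptions, not flare's actual `center_tile()`. Both versions snap a non-negative map coordinate to the center of its tile.

```cpp
#include <cassert>

// Assumed tile size in map units (a power of two); hypothetical values.
const int UNITS_PER_TILE = 64;
const int TILE_SHIFT = 6; // 1 << 6 == 64

// Snap a coordinate to the center of its tile using integer division.
int center_coord_div(int v) {
    assert(v >= 0);
    return (v / UNITS_PER_TILE) * UNITS_PER_TILE + UNITS_PER_TILE / 2;
}

// Same result using shifts. Only equivalent for non-negative input: division
// truncates toward zero, while shifting a negative value right is
// implementation-defined before C++20 and floors where it is arithmetic.
int center_coord_shift(int v) {
    assert(v >= 0);
    return ((v >> TILE_SHIFT) << TILE_SHIFT) + (UNITS_PER_TILE >> 1);
}
```

When the divisor is a compile-time power-of-two constant, optimizing compilers commonly perform this strength reduction on their own, so the hand-written shift may make no measurable difference.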

Really, everything called by the main render loop is being hammered.

is_within() is all the collision stuff, that's pretty speedy comparatively.

clintbellanger commented 12 years ago

Stefan, I pushed an updated Utils.cpp that addresses center_tile and map_to_screen. If it's easy to run a profile again, I'm curious to see if that's made a difference.

stefanbeller commented 12 years ago

It doesn't seem to make a difference. Profiled with the master branch of @clintbellanger at 4f692cb7cb48e2f9153e07d1c0c78551787722a9: http://dl.dropbox.com/u/6520164/flareprofiling1.txt I did not take overall performance (CPU load) into account with oprofile; these profiles only show the relative comparison of each function within flare itself. There is a difference from the previously posted profile, but it is only one percent, and that could be due to my different actions while playing for a minute. I also played for longer this time, which is why the numbers in the first column are larger, but the percentages are, as I said, about the same.

stefanbeller commented 12 years ago

See http://dl.dropbox.com/u/6520164/flareprofiling2.txt I started on improving the isometric renderer. Only the first loop is optimised so far; it now draws the background layer only where the screen is (plus a few tiles to make sure the player cannot see where the renderer stops drawing). It is a work in progress, so it is not yet pullable ;) Before this change the CPU needed around 47% to run flare; now it needs 42% (about 90% of the initial amount). This roughly 10% improvement can also be seen in the profiling result file.
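One way to sketch "draw only where the screen is" is shown below. It is not the actual change in the work-in-progress branch. It assumes a plain 2:1 isometric projection, and all names and tile sizes are hypothetical. The four screen corners are mapped back into tile space, and their bounding box (plus a margin of a few tiles) becomes the loop range.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Assumed half-tile sizes in pixels; not taken from flare's source.
const int TILE_W_HALF = 32;
const int TILE_H_HALF = 16;

// Inclusive range of tile coordinates worth drawing. If the view lies
// entirely off the map, min can exceed max and a draw loop runs zero times.
struct TileRange { int x_min; int x_max; int y_min; int y_max; };

// Hypothetical projection: tile (0, 0) is drawn at (origin_x, origin_y).
int tile_screen_x(int tx, int ty, int origin_x) { return origin_x + (tx - ty) * TILE_W_HALF; }
int tile_screen_y(int tx, int ty, int origin_y) { return origin_y + (tx + ty) * TILE_H_HALF; }

TileRange visible_tile_range(int origin_x, int origin_y, int view_w, int view_h,
                             int map_w, int map_h, int margin) {
    assert(map_w > 0 && map_h > 0 && margin >= 0);
    const int corner_x[4] = {0, view_w, 0, view_w};
    const int corner_y[4] = {0, 0, view_h, view_h};
    double min_tx = 0, max_tx = 0, min_ty = 0, max_ty = 0;
    for (int i = 0; i < 4; ++i) {
        // Invert the projection: a = tx - ty, b = tx + ty.
        double a = double(corner_x[i] - origin_x) / TILE_W_HALF;
        double b = double(corner_y[i] - origin_y) / TILE_H_HALF;
        double tx = (a + b) / 2.0;
        double ty = (b - a) / 2.0;
        if (i == 0) {
            min_tx = max_tx = tx;
            min_ty = max_ty = ty;
        } else {
            min_tx = std::min(min_tx, tx);
            max_tx = std::max(max_tx, tx);
            min_ty = std::min(min_ty, ty);
            max_ty = std::max(max_ty, ty);
        }
    }
    // Widen by the margin so tile edges never pop in, then clamp to the map.
    TileRange r;
    r.x_min = std::max(0, int(std::floor(min_tx)) - margin);
    r.x_max = std::min(map_w - 1, int(std::ceil(max_tx)) + margin);
    r.y_min = std::max(0, int(std::floor(min_ty)) - margin);
    r.y_max = std::min(map_h - 1, int(std::ceil(max_ty)) + margin);
    return r;
}
```

The box is conservative: the screen rectangle maps to a diamond in tile space, so the corners of the box are off screen. With these assumed sizes, a 640x480 view centered on a 256x256 map still comes out at 31x31 = 961 candidate tiles instead of 65,536.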

stefanbeller commented 12 years ago

Another solution would be to pre-render the background at map loading time; each frame would then only need to blit the right rectangle to the screen. This solution would be memory hungry, but very fast.

That would not work for the objects layer, as the drawing order changes (player and enemies moving).
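To put a rough number on "memory hungry", here is a back-of-the-envelope sketch. It assumes 64x32 pixel isometric tiles and 4 bytes per pixel, which are illustrative values rather than figures from this thread, and the function name is hypothetical. The pre-rendered surface has to cover the bounding box of the whole map diamond.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for one surface covering the bounding box of an isometric map
// of map_w x map_h tiles. The diamond spans (map_w + map_h) half-tiles in
// each screen direction.
std::uint64_t prerendered_background_bytes(int map_w, int map_h,
                                           int tile_w, int tile_h,
                                           int bytes_per_pixel) {
    assert(map_w > 0 && map_h > 0 && tile_w > 0 && tile_h > 0 && bytes_per_pixel > 0);
    std::uint64_t half_tiles = static_cast<std::uint64_t>(map_w) + static_cast<std::uint64_t>(map_h);
    std::uint64_t px_w = half_tiles * static_cast<std::uint64_t>(tile_w) / 2;
    std::uint64_t px_h = half_tiles * static_cast<std::uint64_t>(tile_h) / 2;
    return px_w * px_h * static_cast<std::uint64_t>(bytes_per_pixel);
}
```

Under these assumptions a 256x256 map needs a 16384x8192 pixel surface, about 512 MiB, while a 64x64 map needs 4096x2048 pixels, about 32 MiB. Roughly half of that box lies outside the diamond and would be empty.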

clintbellanger commented 12 years ago

Complete. Thanks all!