clintbellanger / flare

Free Libre Action Roleplaying Engine
http://clintbellanger.net/rpg/
GNU General Public License v3.0
166 stars 41 forks source link

Label incorrectly reused in GameStateLoad #935

Closed igorko closed 11 years ago

igorko commented 11 years ago

After adding FPS value to game window it became easy for every user to evaluate game performance. And I have noticed, that on my one core Athlon 2.2Ghz using windowed mode 800x600 I get large speed loss in GameLoad menu. it works even slower than Frontier Plains with bunch of goblins near you! When there is one game slot, FPS is 29-30 (it's ok). But 4 gameslots decrease FPS to 20, while Frontier Plains with bunch of goblins near you still give 28-30. Looks weird. Seems savegame preview works very slowly. This also is confirmed by playing external music(larger/smaller lags). It takes a lot of CPU time.

clintbellanger commented 11 years ago

@stefanbeller Can you create 4 characters and then post your profiler output of the Load Game screen?

clintbellanger commented 11 years ago

If I had to guess, I'd say the text rendering is killing performance. We're reusing one label to display most of the text on that screen, and that's not how labels are meant to work. Labels cache rendered text images. If we don't have separate labels for each thing on the menu, it's losing the cache each time. So every single frame it's having to render all that text. We should have a name, level, and map label for each of the four slots.

Especially if you're using non-latin text, which was giving me terrible performance before I added that label buffering. I could never figure out why e.g. Cyrillic text rendered vastly slower than the plain English alphabet characters. It might be something internal to SDL_ttf. Or it might be some bad way we're passing around strings (e.g. not using const parameters so we're making copies of copies of copies of unicode strings).

If there are other menus that misuse labels this way, it should be corrected:

stefanbeller commented 11 years ago

I guess @clintbellanger is right, when saying the text rendering is killing the performance. see https://dl.dropbox.com/u/6520164/flareAtLoadingScreen.txt As oprofile is not supported and not maintained any more I switched to the perf tool, so the output is a little different. My interpretation of the data in the given link: The application 'flare' (That's what I profiled) eats up only 39.71% percent itself, and the rest of the percentages go into libSDL, kernel calls , libc++ and all the otehr libs.

But when zooming into the flare code, 97.99% of that (time?) is spent in FontEngine::render(), which is 98.48% delivered from GameStateLoad::render() and 1.52% from GameSwitcher::showFPS(int)

dorkster commented 11 years ago

The only situation in which I've noticed a performance drop when lots of enemies (corpses included) are on screen. A good way to see the effects of this is to hang out in the Ancient Temple room with 5 zombie spawners. My 2.8ghz Pentium 4 drops to 24 fps after about 150 bodies are piled up.

I know it may seem obvious, but we should think about limits of spawned enemies and of corpses. And we should think about supporting "low quality textures" (read: 1-bit alpha surfaces) for sprites like we do for tilesets.

stefanbeller commented 11 years ago

Regarding the enemies: I would refactor the way Renderables are handled, so that there is not the creation of a new renderable for each enamy, npc and hero each tick, but each of the enemies, npc and hero should have a renderable attached, which exchanges only values each tick. Then the Maprenderer can keep a list of the renderables as they don't change over time often (on new spawns only.)

And as of now death is pretty permanent, so once an enemy is dead, we could prerender it. In case of animated tiles disabled these dead bodies could be blit into the background layer, so actually no overhead is created each tick, but only when recreating the background layer.

clintbellanger commented 11 years ago

Blitting the corpses is going to cause more slowdown than allocating those Renderables. But I do like the idea of a Renderables list that works smarter -- it can be a collection of pointers to Renderables instead, and each e.g. Enemy object can have a local Renderable. Then GameStatePlay can add/remove Renderables instead of just adding to a fresh list each frame.

We could have a corpse timeout. E.g. a corpse disappears after 1 minute. Really, the entire Enemy object can be deallocated after that.