Wargus / wargus

Importer and scripts for Warcraft II: Tides of Darkness, the expansion Beyond the Dark Portal, and Aleonas Tales
GNU General Public License v2.0

[3.1.2→3.2.0] Fog of war performance issue #409

Closed AMDmi3 closed 2 years ago

AMDmi3 commented 2 years ago

Describe the bug: After updating to 3.2.0 I've run into a severe performance regression (20%->100% CPU consumption by the stratagus process and a noticeable game slowdown). It turned out that enhanced fog of war was the cause, and switching to fast solved the issue. However, I had fast/tiled fog of war in 3.1.2 (not sure which, as they look the same), so the problem is that enhanced fog was enabled out of the blue.

Expected behavior: fast fog of war by default (or after updating from an earlier version which used fast or a similar mode) and/or better enhanced fog of war performance.


ipochto commented 2 years ago

Hi, the enhanced and tiled fog types have a high CPU load (until we implement GPU rendering for them) because they both use alpha blending. To reduce the CPU load, multithreading was enabled. Have you compiled with OpenMP support?
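
To illustrate why alpha-blended fog is CPU-heavy: every covered pixel needs a multiply and a divide per channel on each frame, which is the kind of loop OpenMP can spread over cores. The following is a minimal sketch, not the actual Stratagus code; `blendChannel` and `blendFog` are hypothetical names, and a single-channel 8-bit frame is assumed for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Blend one 8-bit channel: (src * alpha + dst * (255 - alpha)) / 255, rounded.
inline std::uint8_t blendChannel(std::uint8_t dst, std::uint8_t src, std::uint8_t alpha)
{
    const unsigned v = src * alpha + dst * (255u - alpha);
    return static_cast<std::uint8_t>((v + 127u) / 255u);
}

// Hypothetical helper: blend a fog color over every pixel of a grayscale frame,
// using a per-pixel fog opacity. Assumes frame and fogAlpha have the same size.
void blendFog(std::vector<std::uint8_t> &frame,
              const std::vector<std::uint8_t> &fogAlpha,
              std::uint8_t fogColor = 0)
{
    const long n = static_cast<long>(frame.size());
    // Each iteration touches only its own pixel, so the loop can be split across threads.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        frame[i] = blendChannel(frame[i], fogColor, fogAlpha[i]);
    }
}
```

When the compiler is not given an OpenMP flag, the pragma is ignored and the same loop simply runs on one thread.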

AMDmi3 commented 2 years ago

It looks like yes:

-- Found OpenMP_C: -fopenmp=libomp (found version "5.0")
-- Found OpenMP_CXX: -fopenmp=libomp (found version "5.0")
-- Found OpenMP: TRUE (found version "5.0")

Still, it doesn't seem to use multiple threads.

Well, in fact, if it needs OpenMP just to render the fog, and that would consume 90% of overall engine CPU consumption, I think it should by all means be disabled by default.

ipochto commented 2 years ago

Not only for rendering, but for fog generation too. Which executable do you use, stratagus or wargus? Wargus launches stratagus with some additional environment variables to tune OpenMP.

Also, can you check how many threads it uses? (htop etc.)
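
Besides looking at htop or top, the thread count can be checked from inside a program. This is a minimal sketch, assuming a standard OpenMP runtime; `activeOmpThreads` is a hypothetical helper, not part of Stratagus. Standard OpenMP environment variables such as `OMP_NUM_THREADS` influence the result; which variables Wargus actually sets is not stated in this thread.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns the number of threads an OpenMP parallel region actually gets,
// or 1 when the program was compiled without OpenMP support.
int activeOmpThreads()
{
    int count = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        // Only one thread of the team writes the result.
        #pragma omp single
        count = omp_get_num_threads();
    }
#endif
    return count;
}
```

A result of 1 in a build that reported OpenMP as found would suggest the runtime is being limited by its environment or settings.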

ipochto commented 2 years ago

Well, in fact, if it needs OpenMP just to render the fog, and that would consume 90% of overall engine CPU consumption, I think it should by all means be disabled by default.

For slow machines we've left the fast type of fog. Visually it's identical to tiled, but without the ability to see through fog-covered unexplored areas when reveal map is enabled.

But even my 12-year-old notebook with MT enabled doesn't show any notable slowdown with enhanced fog. What are the specifications of your machine?

AMDmi3 commented 2 years ago

Which executable do you use?

I run wargus.

Also, can you check how many threads it uses? (htop etc.)

Here's how it looks with enhanced fog:

  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
84449 marakasov    90    0   621M   394M CPU3     3  30:53  58.40% stratagus{stratagus}
84449 marakasov    30    0   621M   394M uwait    2   0:30  17.40% stratagus{stratagus}
84449 marakasov    30    0   621M   394M uwait    1   0:30  17.37% stratagus{stratagus}
84449 marakasov    30    0   621M   394M uwait    0   0:31  17.31% stratagus{stratagus}
84449 marakasov    -8    0   621M   394M pcmwrv   3   0:34   0.61% stratagus{SDLAudioP2}
84449 marakasov    52    0   621M   394M usem     1   0:00   0.00% stratagus{SDLTimer}
84449 marakasov    52    0   621M   394M uwait    3   0:00   0.00% stratagus{wargus:disk$1}
84449 marakasov    52    0   621M   394M uwait    2   0:00   0.00% stratagus{wargus:gdrv0}
84449 marakasov    52    0   621M   394M uwait    1   0:00   0.00% stratagus{wargus:disk$3}
84449 marakasov    52    0   621M   394M uwait    2   0:00   0.00% stratagus{wargus:disk$0}
84449 marakasov    52    0   621M   394M uwait    3   0:00   0.00% stratagus{wargus:disk$2}

with fast fog:

  PID USERNAME    PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
84449 marakasov    27    0   621M   394M CPU3     3  31:18  13.46% stratagus{stratagus}
84449 marakasov    -8    0   621M   394M pcmwrv   0   0:34   0.57% stratagus{SDLAudioP2}
84449 marakasov    20    0   621M   394M uwait    2   0:36   0.14% stratagus{stratagus}
84449 marakasov    20    0   621M   394M uwait    0   0:35   0.14% stratagus{stratagus}
84449 marakasov    20    0   621M   394M uwait    3   0:35   0.13% stratagus{stratagus}
84449 marakasov    52    0   621M   394M usem     1   0:00   0.00% stratagus{SDLTimer}
84449 marakasov    52    0   621M   394M uwait    3   0:00   0.00% stratagus{wargus:disk$1}
84449 marakasov    52    0   621M   394M uwait    2   0:00   0.00% stratagus{wargus:gdrv0}
84449 marakasov    52    0   621M   394M uwait    1   0:00   0.00% stratagus{wargus:disk$3}
84449 marakasov    52    0   621M   394M uwait    2   0:00   0.00% stratagus{wargus:disk$0}
84449 marakasov    52    0   621M   394M uwait    3   0:00   0.00% stratagus{wargus:disk$2}

For slow machines we've left the fast type of fog

Yes, but I'm talking about the defaults here. The performance difference is drastic, and the fast fog is also closer to the original game's look, so I'd pick it as a (safe) default, with the option to switch to enhanced for bells and whistles. What worries me the most is that users will get unexpected slowdowns and CPU consumption for no apparent reason, like I did. Let me remind you that the original game required a Pentium 60 or equivalent and 16 MB RAM.

What are the specifications of your machine?

Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz (2808.10-MHz K8-class CPU), 16GB RAM, Integrated Graphics Chipset: Intel(R) HD Graphics 520

ipochto commented 2 years ago

Hmmm, do you have any of the shaders enabled? Shaders are very slow on my integrated Intel. Your CPU is much better than mine, so I don't think the problem is with it.

ipochto commented 2 years ago

There's no problem setting 'fast' as the default, but I wonder why you have these slowdowns with such a powerful CPU.

AMDmi3 commented 2 years ago

Hmmm, do you have any of the shaders enabled?

None, if you're talking about these: [screenshot]

The bilinear option has an additional performance effect: CPU usage is 15%/55%/85% for fast/enhanced/enhanced+bilinear on the current map.

Shaders are very slow on my integrated Intel.

In fact, none of the shaders has any noticeable effect on performance for me.

It looks like I should profile it, but it can only happen after holidays in mid-January, if I have free time. Meanwhile,

setting 'fast' as the default

would be very nice.

ipochto commented 2 years ago

It looks like I should profile it, but it can only happen after holidays in mid-January, if I have free time.

It would be ideal

timfel commented 2 years ago

Using fast fow by default now

AMDmi3 commented 2 years ago

Which reminds me that I've promised to profile it.

Here's a pprof graph of a game session where I've switched to enhanced fog and waited for some time: https://people.freebsd.org/~amdmi3/pprof61047.0.svg

It can be seen that most of the time is spent in OpenMP guts. I don't know any details about FreeBSD's OpenMP library; it may in fact be completely inefficient. I also have no idea how to plug its debug symbols into pprof.

I've tried building without OpenMP support. It doesn't build out of the box; I had to replace the omp_get_thread_num and omp_get_num_threads calls with 0 and 1 respectively. Here's the graph: https://people.freebsd.org/~amdmi3/pprof22758.0.svg

I haven't played for long enough, but it feels smooth at first glance and takes 70-80% of a CPU, so it's probably better than the OpenMP version here.
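
The manual replacement described above (0 for omp_get_thread_num, 1 for omp_get_num_threads) could also be expressed as a compile-time fallback, so that the same source builds with and without OpenMP. This is a minimal sketch under that assumption; `fogThreadNum`, `fogNumThreads`, `RowRange` and `rowsForThread` are hypothetical names and not the engine's actual code.

```cpp
#include <algorithm>

#ifdef _OPENMP
#include <omp.h>
inline int fogThreadNum() { return omp_get_thread_num(); }
inline int fogNumThreads() { return omp_get_num_threads(); }
#else
// Single-threaded fallback: one team member, with index 0.
inline int fogThreadNum() { return 0; }
inline int fogNumThreads() { return 1; }
#endif

struct RowRange {
    int begin; // first row handled by this thread
    int end;   // one past the last row handled by this thread
};

// Split totalRows into contiguous chunks, one per thread.
inline RowRange rowsForThread(int totalRows, int threadNum, int numThreads)
{
    const int chunk = (totalRows + numThreads - 1) / numThreads; // ceiling division
    const int begin = std::min(totalRows, threadNum * chunk);
    const int end = std::min(totalRows, begin + chunk);
    return {begin, end};
}
```

Inside a parallel region a caller would use `rowsForThread(rows, fogThreadNum(), fogNumThreads())`; in a build without OpenMP this degrades to a single range covering all rows.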