LTCHIPS / rottexpr

A Rise Of The Triad Source Port with additional gameplay options and more...
GNU General Public License v2.0

rottexpr uses 100% of the CPU #12

Open ghost opened 6 years ago

ghost commented 6 years ago

I've used the Linux kernel tool perf to analyze why rottexpr uses so much CPU.

It turns out the greediest functions are GetTicCount() and CalcTics().

Aside from that, when displaying the menu, DrawFilmPost() and DrawMenuPost() come second. I suspect they redraw the screen all the time, even when not necessary.

When playing the game, DrawSkyPost(), DrawRow() and R_DrawWallColumn() come second.

LTCHIPS commented 6 years ago

Interesting, what resolution were you running at? I've found that rottexpr doesn't run well at 4k resolution, even with an i5 4690k clocked at 4.5 GHz, if that adds to your investigation.

I just tried playing for a bit, and according to task manager on Windows 10, rottexpr didn't reach any higher than 15% CPU usage at 1600x900. I'll see if my VMs struggle with high CPU usage as well.


-- Thanks, Steven LeVesque

ghost commented 6 years ago

I was running the game at 320x200 in windowed mode.

I don't know much about profiling tools on Windows.

perf is a Linux kernel tool that lets you see, and even record, which functions consume the most CPU time.

ghost commented 6 years ago

Sorry, I've closed the issue by mistake.

ghost commented 6 years ago

I think I have fixed it with a very simple workaround. I'm writing a patch right now.

ghost commented 6 years ago

@LTCHIPS please try the fix in the performance_fix branch. It may fix your issue for good.

LTCHIPS commented 6 years ago

28.45% total CPU usage reported by perf on my Ubuntu VM with your fix, 34.87% total CPU usage reported w/o the fix. Was playing at 320x200.

I tested it at higher resolutions (800x600 and beyond), and noticed that the game ran a bit choppy. I replaced VBLCOUNTER with 60 in the g_sleeptime calculation, and the gameplay was smoother, but the screen rotation effect was all screwed up. I did this on both Windows and my Ubuntu VM.

The CPU usage was still lower either way compared to what's currently in the master branch.


-- Thanks, Steven LeVesque

ghost commented 6 years ago

You are right. VBLCOUNTER acts as a frame-rate setting. I've also tried setting it to 60 to get 60 fps gameplay, but RotT's way of handling time would have to be entirely redesigned so that some routines are not so tick-dependent.

If you set VBLCOUNTER to 60, a lot of things will go wrong, like:

An interesting fact is that I've identified two things that are independent of VBLCOUNTER:

As for the gameplay being choppy under Linux, I would really consider running it in a real environment, and not in a VM, to be sure it's not the VM that complicates things.

After more calculation, it seems that the while loop I removed does indeed wait for ~28500 microseconds. Depending on how I take things into account, I get results between 27500 and 28500 microseconds, a difference of only 0.001 second. So my calculation g_sleeptime = 1000000 / VBLCOUNTER; is correct.
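To make the arithmetic concrete, here is a minimal sketch of the idea, not the actual patch from the performance_fix branch. It assumes VBLCOUNTER is 35 (a value that matches the ~28500 microseconds measured above), and the helper names sleep_time_us() and sleep_us() are made up for this illustration. The point is that sleeping for one tic hands the CPU back to the OS, instead of polling the tick counter in a loop.

```c
#define _POSIX_C_SOURCE 199309L  /* for nanosleep() */
#include <time.h>

/* Assumption: 35 tics per second, consistent with the ~28500 us figure above. */
#define VBLCOUNTER 35

/* Microseconds per tic, using the same integer division as
 * g_sleeptime = 1000000 / VBLCOUNTER. Hypothetical helper name. */
long sleep_time_us(int tics_per_second)
{
    return 1000000L / tics_per_second;
}

/* Sleep for `us` microseconds with POSIX nanosleep(), so the process
 * yields the CPU rather than spinning. Hypothetical helper name.
 * Returns nanosleep()'s result: 0 on success, -1 if interrupted. */
int sleep_us(long us)
{
    struct timespec ts;
    ts.tv_sec = us / 1000000L;
    ts.tv_nsec = (us % 1000000L) * 1000L;
    return nanosleep(&ts, NULL);
}
```

With these assumptions, sleep_time_us(VBLCOUNTER) gives 28571 microseconds per tic, and sleep_time_us(60) gives 16666. Calling sleep_us(sleep_time_us(VBLCOUNTER)) once per tic would take the place of the busy-wait loop.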

And what about 4k gaming on Windows? Is it smoother now?