GenesisFR / DS2TroubleshootingGuide

A document listing all technical issues for Dungeon Siege 2 along with solutions.
MIT License

Performance issues with Dungeon Siege 2 - Broken World #3

Open GothicIII opened 10 months ago

GothicIII commented 10 months ago

Hi again,

shameless crossposting from there: https://steamcommunity.com/app/39200/discussions/0/3809531933812050831/

Problem: The game (Dungeon Siege 2: Broken World) runs at a low frame rate and often drops below 20 fps. Unaffected systems easily reach 90+ fps with the same settings. Affected systems don't have any performance problems with the vanilla version of the game.

Cause: Unknown.

I cannot rule out BIOS settings since there are plenty of options to check. It seems to be a CPU issue: if I downclock the CPU on an affected system from 5.8 GHz to 3.4 GHz, the frame rate goes down proportionally. My test laptop, which is not affected by this bug, tops out at 3.4 GHz and reaches 90 fps.
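The proportionality claim can be checked with a little arithmetic. The following is a minimal sketch, assuming the frame rate scales linearly with the CPU clock when the game is CPU-bound; the function names, the 15% tolerance and the fps figures are hypothetical and not taken from the thread.

```python
def expected_fps(fps_ref, clock_ref_ghz, clock_new_ghz):
    """Predict fps at a new CPU clock, assuming fps scales linearly with clock."""
    if clock_ref_ghz <= 0:
        raise ValueError("reference clock must be positive")
    return fps_ref * clock_new_ghz / clock_ref_ghz


def looks_cpu_bound(fps_ref, clock_ref_ghz, fps_new, clock_new_ghz, tolerance=0.15):
    """True if the measured fps is within `tolerance` of the linear prediction."""
    predicted = expected_fps(fps_ref, clock_ref_ghz, clock_new_ghz)
    return abs(fps_new - predicted) <= tolerance * predicted


# Hypothetical measurements, not taken from the thread:
# 20 fps at 5.8 GHz, 12 fps after downclocking to 3.4 GHz.
print(round(expected_fps(20.0, 5.8, 3.4), 1))
print(looks_cpu_bound(20.0, 5.8, 12.0, 3.4))
```

If the measured value stays close to the prediction across several clock settings, that supports a per-thread CPU limit rather than a GPU or I/O limit.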

Thesis: The game .exe wastes CPU cycles on specific hardware configurations, which drops the frame rate. Since it is not bound to a specific CPU model (e.g. Intel 13th gen systems are both affected and not affected), I would point to either a low-level BIOS/UEFI implementation problem (UEFI and CSM boot are both affected) or a mainboard problem. I have 2 affected MSI boards. KVM is affected, too. This is an edge case which does not affect other real-world performance. Modern applications run as fast as advertised.

Dead ends:
- Debugging the application does not help. Between affected and unaffected systems there don't seem to be more or fewer errors pointing to the problem. The API call durations do not provide useful data either. Since there are many calls per second and the root cause is unknown, the debugging is not trivial.

This also is...
- ...not GPU related (the issue appears at least on Nvidia and Intel GPUs). Even on unaffected systems the GPU is barely utilized.
- ...not bound to a specific CPU model.
- ...not caused by OS settings. Even Windows XP with the original .exe and disc can be affected.
- ...not caused by game settings. The performance barely improves (1-4 fps) between min and max settings and resolutions.
- ...not caused by game modifications. Unmodified DS2:BW is equally affected.
- ...not caused by virtualization settings.
- ...not caused by CPU affinity.

If requested I can provide many more details on how I approached this problem.

I am trying to solve this but I struggle hard to find anything helpful.

Current tasks to tackle:
- Need a debug executable of DS2:BW like the one for the main game bundled with SiegeEditor2 (DungeonSiege2Mod.exe). If not available, any information on how to hack one is welcome (a debug console may reveal performance problems).
- Need knowledge of how to analyse performance issues in Direct3D applications (which frame waits on which resources etc). Maybe injecting debug builds of d3d9.dll?
- Looking for a Discord group, forum etc. with more knowledgeable people who are interested in this kind of problem.

GenesisFR commented 6 months ago

Were you able to make any progress on this? I can't help unfortunately as I don't have the technical knowledge for that.

GothicIII commented 6 months ago

No, I am sorry. I really don't know what exactly is causing it. I suspect that the BW engine has a bug which involves rendering a model. Loading a map in SiegeEditor v2.1 causes weird graphical flickering on any model, and that tanks the fps massively (maybe the dx pipeline can't keep up!?). Loading the same map in SE v2 results in a very smooth experience, like in DS2.

I even upgraded to a Z790 board from a different brand (ASUS) and it still has the same issues. I am really lost, and the low fps will drop further when progressing through the game. Act 2 goes down to 12 fps and that's barely playable. And that's with an RTX 4090 / i7 13th gen.

Maybe you can ask around if there is somebody left from the community.

GenesisFR commented 2 months ago

So I saw your recent post on Steam forums, could it be that the game is polling HID devices constantly just like a couple of other games?

https://www.pcgamingwiki.com/wiki/Prototype#Low_frame-rate
https://www.pcgamingwiki.com/wiki/I_Am_Alive#Menu_lag_and_low_performance
https://www.pcgamingwiki.com/wiki/Bullet_Witch#Heavy_stuttering_during_gameplay_or_freeze_while_loading_game_unless_a_gamepad_is_plugged_into_the_system

Should be worth looking into. Please keep me posted.

GothicIII commented 2 months ago

Thanks for your idea! Out of desperation I already tested this. For the testing I used very basic USB keyboards and mice. Since I also have an advanced virtualization environment, I could emulate old PS/2 mice/keyboards there, but on the real machine I do not have this option.

It also doesn't make sense since the other guys who tested my game do not have any performance issues. They both have similar modern hardware (One has a B660/i5 12th gen and the other a x770+/Ryzen whatever) and use gaming peripherals.

GenesisFR commented 2 months ago

Oh well, so you tried all solutions? There are like 3 or 4 different ones.

Junohea commented 1 month ago

@GothicIII what were you using to try and observe/log the API calls? did you get any hardware APIs in there? when you load up a save on either machine and switch zones via a teleporter, does the framerate dip on both? and does the framerate stay at that lower value from then on?

here's some of my playing around so far: I'd only done a rudimentary search with Process Monitor to see if there was any low-hanging fruit. it led to some registry keys that I didn't know about, some shown here (note that the path is not the same as what many guides/helper scripts try to set up: the keys it's looking for are at HKLM\[...]\DungeonSiege2BrokenWorld, not HKLM\[...]\Dungeon Siege 2 Broken World)

[image: registry keys found with Process Monitor]

maxfps works: in-game we can get over the default FPS cap. you can see it best in the initial menu, where you can run 600+ FPS if you want.

minfps is deceptive: it will slow down the game engine to ensure the framerate reaches the minimum (you can effectively play in slow-mo in single player; in multiplayer it ignores this key).
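One way a minimum-fps setting can produce slow motion is by capping how much simulated time a single frame may advance. The sketch below illustrates that pattern only; it is an assumption about how such a setting might work, not a confirmed detail of the DS2 engine, and `simulate_game_time` is a hypothetical name.

```python
def simulate_game_time(real_frame_times, minfps):
    """Advance game time frame by frame, clamping each step to 1/minfps seconds.

    Assumption: the engine never lets a single frame advance the simulation by
    more than 1/minfps seconds. This is a common game-loop pattern, not a
    confirmed detail of the DS2 engine.
    """
    max_step = 1.0 / minfps
    game_time = 0.0
    for dt in real_frame_times:
        game_time += min(dt, max_step)
    return game_time


# Hypothetical case: the machine renders 10 fps for one real second
# while minfps is set to 30.
frames = [0.1] * 10
print(simulate_game_time(frames, minfps=30))
```

Under this assumption, one real second at 10 fps only advances the game by a third of a second, which would look like slow motion. When the real frame rate is above minfps, game time and real time stay in step.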

the others you see in there don't seem to have a noticeable impact when trying different values. however... once you get in game, the HKLM\SOFTWARE\WOW6432Node\Gas Powered Games\DungeonSiege2BrokenWorld\recon_controls key is queried a lot, maybe hundreds to thousands of times per second? I was hoping to patch it out of the binary to see if it impacts performance at all, but all attempts so far have led to crashes (granted, I am absolutely terrible with Binary Ninja), and attempts to populate the registry key with the correct value have not been successful. I don't know what it's looking for.
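To see which values actually exist under that key on a given machine, the standard-library `winreg` module can enumerate them. This is a minimal sketch, assuming the key path quoted above; whether `recon_controls` is a value or a subkey is not clear from the thread, so the function simply lists whatever values are present. `read_ds2bw_values` is a hypothetical helper name.

```python
import sys

# Key path as reported in the thread (the 32-bit view on 64-bit Windows).
DS2BW_KEY = r"SOFTWARE\WOW6432Node\Gas Powered Games\DungeonSiege2BrokenWorld"


def read_ds2bw_values(subkey=DS2BW_KEY):
    """Return {name: data} for every value under the key, or {} if unavailable."""
    if sys.platform != "win32":
        return {}  # winreg only exists on Windows
    import winreg

    values = {}
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
            index = 0
            while True:
                try:
                    name, data, _type = winreg.EnumValue(key, index)
                except OSError:
                    break  # no more values under this key
                values[name] = data
                index += 1
    except OSError:
        pass  # key does not exist on this machine
    return values


for name, data in read_ds2bw_values().items():
    print(f"{name} = {data!r}")
```

Comparing this output between an affected and an unaffected machine would show whether the registry contents differ at all.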

GothicIII commented 1 month ago

@Junohea

you did well! No, I cannot intercept native hardware APIs. For this I'd need to write a driver, and I am not capable of that. Instead I intercepted all OS API calls and library calls. The program to do that is API Monitor in its x86 variant. This way I see everything the DS2 binary is calling. I also found the registry settings you discovered, but none of them improved performance or did anything essential besides the keys already known (e.g. minfps).

I am working on it again and I might have a hint. The game's fps scales dramatically with the CPU clock speed. However, the CPU cores DS2 binds to never reach full utilization. In-game, the performance gets better the fewer enemies/NPCs are in the area. The teleporting animation is rendered at full fps. The very weird thing happens when I enter the town of dryads after booting the game. The game runs with <90fps! But as soon as I enter somewhere else and go back to town, the fps drops to ~33. And then it stays like this.

In my observation DS2 uses QueryPerformanceCounter many times. This is a function to get a precise time stamp from the system. I don't know if it is intended behavior to get many queries per second for this kind of software. I find it very difficult to filter events that are meaningful, so I can't tell for sure. It can also be a false clue.
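For context, here is a minimal sketch of how a render loop typically uses a high-resolution timer. It uses Python's `time.perf_counter`, which CPython on Windows implements on top of QueryPerformanceCounter; the `FrameTimer` class is hypothetical and only illustrates the pattern, not DS2's actual code.

```python
import time


class FrameTimer:
    """Measure frame durations with a high-resolution clock."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._last = clock()
        self.frame_times = []

    def tick(self):
        """Call once per frame; records and returns the seconds since the last tick."""
        now = self._clock()
        dt = now - self._last
        self._last = now
        self.frame_times.append(dt)
        return dt

    def average_fps(self):
        """Average frames per second over all recorded frames."""
        total = sum(self.frame_times)
        return len(self.frame_times) / total if total > 0 else 0.0


timer = FrameTimer()
for _ in range(5):
    time.sleep(0.01)  # stand-in for one frame of work
    timer.tick()
print(f"{timer.average_fps():.0f} fps")
```

A loop like this reads the timer at least once per frame, and engines often read it several more times for animation, audio and profiling, so a high call count by itself would not necessarily indicate a problem.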

Theory: The game wrongly detects CPU-properties and thus cannot utilize the cores fully. Correlation between CPU frequency and fps ingame. It is really difficult to find the culprit since even my friends are not affected despite having similar hardware.

How do you inject commands directly into the binary?

Junohea commented 1 month ago

Oh awesome, thanks for the insight @GothicIII , I'll give it a try but so far I'm very much stumbling around with API Monitor as well.

How do you inject commands directly into the binary?

Using Binary Ninja to analyze the application and lift it into a high-level intermediate representation (decompiled pseudocode). From there, selecting variables/functions and either trying to NOP them, hardcode a value, or force IF statements to branch a certain way. You can then either execute while in the application or save it out to a new exe and launch that.

Let's look at some of the things you mentioned

  1. DS2 uses QueryPerformanceCounter many times

Used API Monitor to make a quick capture of ~24 seconds of runtime:

these numbers have a lot of assumptions, but just use them as rough guideposts for the difference between how many update loops the application would have gone through in game vs not

- [orange] it doesn't look like there's a significant difference between the QueryPerformanceCounter hits
- [purple] obviously the DungeonSiege executable itself has a lot more going on once we're no longer at the menu screen and there's actual work to do

[image: API Monitor capture]

But when you drill into it we can see that 2 calls are the only real difference: EnterCriticalSection and LeaveCriticalSection.

[image: API Monitor capture detail]
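The comparison described here can be sketched as a per-API rate difference between two captures. The call counts below are hypothetical and purely illustrative (the real numbers were in the screenshots); `rate_differences` is a made-up helper name.

```python
from collections import Counter


def rate_differences(menu_counts, ingame_counts, menu_seconds, ingame_seconds):
    """Return per-API calls-per-second deltas (in-game minus menu), largest first."""
    names = set(menu_counts) | set(ingame_counts)
    deltas = {
        name: ingame_counts.get(name, 0) / ingame_seconds
        - menu_counts.get(name, 0) / menu_seconds
        for name in names
    }
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)


# Hypothetical call counts for two 24-second captures (illustrative only).
menu = Counter({"QueryPerformanceCounter": 240_000, "EnterCriticalSection": 12_000})
ingame = Counter({"QueryPerformanceCounter": 246_000, "EnterCriticalSection": 960_000})

for name, delta in rate_differences(menu, ingame, 24.0, 24.0):
    print(f"{name}: {delta:+.0f} calls/s")
```

Normalizing by capture length keeps the comparison fair when the two captures are not exactly the same duration.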

I don't know much about C/C++ but it seems like EnterCriticalSection restricts access to a block of code so that other threads can't enter it until it is released with LeaveCriticalSection. When we do a search through the application we get ~400 hits for CRITICAL_SECTION and ~300 for EnterCriticalSection. An issue here is entirely possible but would need someone with much more comfort and familiarity reverse engineering applications. I am extremely unqualified for this, haha.
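As an illustration of what a critical section does, here is a minimal Python sketch in which `threading.Lock` stands in for a Win32 CRITICAL_SECTION. It shows mutual exclusion only; it is an analogy, not DS2's code, and the names `worker` and `run` are hypothetical.

```python
import threading

counter = 0
lock = threading.Lock()  # plays the role of a CRITICAL_SECTION


def worker(iterations):
    """Increment the shared counter, one thread at a time."""
    global counter
    for _ in range(iterations):
        with lock:        # comparable to EnterCriticalSection
            counter += 1  # only one thread at a time runs this line
        # leaving the block is comparable to LeaveCriticalSection


def run(num_threads=4, iterations=10_000):
    """Start the workers, wait for them, and return the final count."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=worker, args=(iterations,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run())
```

If many threads queue up on the same lock, they spend their time waiting rather than computing, which can show up as low CPU utilization. Whether that happens in DS2 is not established in this thread.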

  2. The game's fps scales dramatically with the CPU clock speed. However, the CPU cores DS2 binds to never reach full utilization.

makes sense. when I force the application to use a P or E core on the current Intel processors, it makes a noticeable difference. attempts to force it to use 1/2/3/4/etc cores through affinity or environment variables seem to have no effect. and yeah, the CPU and GPU are never stressed in the slightest.
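For anyone wanting to repeat the P-core versus E-core comparison, affinity is usually expressed as a bitmask with one bit per logical core. The sketch below builds such a mask and formats it for the `start /affinity` switch of cmd.exe, which takes a hexadecimal mask. The executable name `game.exe` is a placeholder, and which core indices are P-cores or E-cores is hardware-specific.

```python
def affinity_mask(cores):
    """Build a bitmask with bit i set for each logical core index i."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core indices must be non-negative")
        mask |= 1 << core
    return mask


def start_command(exe, cores):
    """Format a cmd.exe command line that launches `exe` with the given affinity."""
    return f'start "" /affinity {affinity_mask(cores):X} "{exe}"'


# Hypothetical: pin to logical cores 0 and 2. Which indices are P-cores or
# E-cores depends on the CPU, so check the topology before choosing.
print(start_command("game.exe", [0, 2]))
```

The empty `""` is the window title argument that `start` expects before a quoted program path.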

  3. In-game, the performance gets better the fewer enemies/NPCs are in the area.

saw the same, sometimes zooming in on your character helps, sometimes it doesn't.

  4. The teleporting animation is rendered at full fps. The game runs with <90fps! But as soon as I enter somewhere else and go back to town, the fps drops to ~33. And then it stays like this.

This was initially what set me down chasing the FPS cap, the teleport transition animation was incredibly smooth but the game itself wasn't. And whenever I'd teleport between areas the framerate went from a starting FPS of ~90 in the dryad village to ~40 elsewhere, and continuing to teleport around never saw the framerate recover.

  5. Theory: The game wrongly detects CPU-properties and thus cannot utilize the cores fully. Correlation between CPU frequency and fps ingame. It is really difficult to find the culprit since even my friends are not affected despite having similar hardware.

I was hoping that messing with affinity, CPU-count, or other similar values would show something of this nature and make it an easy fix, but no luck so far.

I'd be interested to know more about systems where you've seen the game run well, as that could rule out a problem related to thread-hindering locking behaviour and point more towards some stupid system variance issue that might be easily fixed.

GothicIII commented 3 weeks ago

I am really struggling to find something meaningful. E.g. in Amanlu (town of elves) I could even see the fps break down as soon as the tile which belongs to a new area is rendered. When it despawned (by walking towards the town) the fps increased again. This only holds right after starting the game. Porting away and back will cause constantly bad fps. With a debugger I could see that threads are spawned when reaching a new area.

At least I found somewhat of a workaround: using Linux and Wine (winehq). This way you'll have a somewhat stable 30 fps @ 5120x1440p, albeit with stuttering (due to compile stuttering of D3D shaders and Vulkan overhead). It is extremely difficult for a beginner to do this setup. I am an expert with unix so I'll break down the essential parts:

Pre-requisites:

Configuration:

I don't know how stable it is since I only tested it for a few moments. But terrains which struggled to achieve more than 20 fps did appear to run better on Linux. Lower resolutions also give higher fps, which did not matter on Windows. Bear in mind I have a high-end machine under my desk. Other more casual setups (laptop/mid-range PC) may have a completely different experience.

EDIT: Forget it. Unix has the same issue! After starting the game it can reach 70+ fps, until you get out of town, where it drops to 31 fps.

Also please bear in mind, I am testing multiplayer since this is the mode I mainly play.

GenesisFR commented 3 weeks ago

Can't help man, never had that bug myself and I don't know how to do reverse engineering, debugging or profiling.

Junohea commented 2 weeks ago

@GothicIII thanks for the follow-up info. I ended up going a similar route, looking to use a Linux host with a Windows VM and some PCI passthrough options to mess around with system variables to see if anything interesting would pop up. Sadly it's all been a bust but I'm hoping to have more time to look at it down the road. I haven't tried Wine yet but appreciate the info there as well for when I end up giving that a shot.