iXit / Mesa-3D

Please use official https://gitlab.freedesktop.org/mesa/mesa/ !
https://github.com/iXit/Mesa-3D/wiki

Skyrim NPCs causing massive frame drops. #185

Closed x09723593 closed 8 years ago

x09723593 commented 8 years ago

It doesn't happen on vanilla Wine 1.9.2, so I believe it's a D3D9 issue. Debugging environment variables also don't show anything interesting (as far as I know).

Ubuntu 15.10
kernel 4.2.0-27
Radeon R7 360 Tobago (Bonaire Pro)
sarnex's Wine 1.9.2 PPA
oibaf's graphics drivers PPA

I've managed to narrow it down to heads/faces causing the issue since it doesn't happen when NPCs are wearing full-faced items, or are dead.

If there's anything I can do to get more information about the issue please let me know.

sarnex commented 8 years ago

Hi, can you please take an apitrace of both wine and nine? Instructions are here at the bottom https://wiki.ixit.cz/d3d9_debugging Make sure to set WINEDLLOVERRIDES correctly.

Come to #d3d9 on freenode and ping me, we have a server for apitrace files and I'll give you the information

Let me know if you have any questions

axeldavy commented 8 years ago

you can display fps with GALLIUM_HUD=fps

I replayed the trace on both tonga and hd7730m

Yes, it is slower when the camera looks at the npc (~2 times), but that isn't shocking: when it isn't looking at the npc it is looking at a wall, with nothing hard to render, whereas when it looks at the npc there are some light effects (fire, etc).

Do you get worse slowdowns than the fps being cut by two?

I checked in game, and with my configuration I don't have any slowdown with npcs.

x09723593 commented 8 years ago

It's not as apparent with one or two NPCs, but when there are many it's very obvious. It doesn't happen when they die or are wearing a helmet, and it tends to get worse when they're talking or interacting with something. It also has an effect when going into third person.

From what I can tell this doesn't happen, or is a lot less apparent using wine-devel v1.9.2

I've turned down all the settings and resolution when testing, but at high settings it runs at 50 - 60 with Nine, even with low overhead lighting mods.

The small drop you see in the second image on the right was when someone walked right in front of me.

Ran with WINEDEBUG=-all GALLIUM_HUD=fps

[Screenshots: NPCs | No NPCs]

axeldavy commented 8 years ago

I definitely don't have such slowdowns when playing.

Can you display cpu usage as well? Use GALLIUM_HUD=fps,cpu0+cpu1+etc (one entry per cpu you have).
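For anyone following along, here is a small sketch of how that GALLIUM_HUD value could be built for a given number of cpus. The helper name `gallium_hud_value` is made up for illustration; only the `fps,cpu0+cpu1+...` format comes from the comment above.

```python
import os


def gallium_hud_value(cpu_count=None):
    """Build a GALLIUM_HUD value: an fps graph plus one graph combining all cpus.

    cpu_count defaults to the number of cpus Python reports for this machine.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # "+" puts several counters in the same graph, "," starts a new graph.
    cpus = "+".join(f"cpu{i}" for i in range(cpu_count))
    return f"fps,{cpus}"


if __name__ == "__main__":
    # Print a line that can be pasted in front of the wine command.
    print("GALLIUM_HUD=" + gallium_hud_value())
```

On a four-core machine such as the one in this report, the value would be `fps,cpu0+cpu1+cpu2+cpu3`.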

x09723593 commented 8 years ago

AMD X4 860k @ 3.7 GHz.

FPS tends to shoot up once they stop fighting for a moment, then drops back down.

axeldavy commented 8 years ago

Looking at the graphs, it seriously looks like you are cpu limited when you get these slowdowns.

axeldavy commented 8 years ago

Can you show a similar graph with wine instead of nine?

axeldavy commented 8 years ago

Do you run Skyrim with 32-bit wine? If not, it could help too.

x09723593 commented 8 years ago

Looks like it might be a CPU issue, but even when the CPU isn't at 100% the FPS is low. For the record, the regular Wine version I'm using is wine-devel from https://launchpad.net/~wine/+archive/ubuntu/wine-builds (both versions are 1.9.2).

Skyrim launcher crashes when I uncheck Gallium Nine, and TESV.exe gets this error:
fixme:module:load_dll Loader redirect from L"wined3d.dll" to L"wined3d-csmt.dll"
err:module:import_dll Library wined3d.dll (which is needed by L"C:\\windows\\system32\\d3d9.dll") not found

Running with WINEARCH=win32 did not make a difference.

[Screenshots: Wine + Nine | Wine]

axeldavy commented 8 years ago

"Running with WINEARCH=win32 did not make a difference." -> Obviously, because WINEARCH only matters when creating a new prefix; otherwise it has no effect (yes, you'll need to reinstall the game; copying the files over works).

"fixme:module:load_dll Loader redirect from L"wined3d.dll" to L"wined3d-csmt.dll"" -> An old csmt install was apparently not cleaned up. You can clean it manually with regedit, but I don't remember how.

EoD commented 8 years ago

FTR: https://youtu.be/VvUwOYHabDA on AMD TONGA (DRM 3.1.0, LLVM 3.8.0, Mesa 11.2.0-devel)

x09723593 commented 8 years ago

As far as I know I created the Wine prefix as 32-bit, but I could try it again. There is also no syswow64 folder, if that makes any difference.
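A quick way to run the syswow64 check mentioned above is sketched below. It is only a heuristic under the assumption that a 64-bit prefix contains a `drive_c/windows/syswow64` folder and a 32-bit one does not; the function name `prefix_looks_64bit` is hypothetical.

```python
import os
from pathlib import Path


def prefix_looks_64bit(prefix):
    """Heuristic: True if the prefix contains drive_c/windows/syswow64.

    Assumption: only 64-bit prefixes have that folder.
    """
    return (Path(prefix) / "drive_c" / "windows" / "syswow64").is_dir()


if __name__ == "__main__":
    # Falls back to ~/.wine when WINEPREFIX is not set.
    prefix = os.environ.get("WINEPREFIX", str(Path.home() / ".wine"))
    kind = "64-bit" if prefix_looks_64bit(prefix) else "32-bit (or missing)"
    print(f"{prefix}: looks {kind}")
```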

axeldavy commented 8 years ago

Is nine slower than normal wine when there are npcs, or not?

x09723593 commented 8 years ago

Yes. There is a slight drop around NPCs with normal Wine, but it's not nearly as bad as Nine.

Just tried 32 and 64 bit prefixes and neither made a difference.

axeldavy commented 8 years ago

I checked the Locks with apitrace, and there seems to be normal (well-performing) behaviour when looking at the wall and bad (very inefficient) behaviour when looking at the people.

No idea why they would not do it the efficient way all the time, but perhaps wine handles that inefficiency differently, which gives better performance.

axeldavy commented 8 years ago

I read somewhere (not in msdn documentation) something that could imply that managed vertex buffers get ... optimizations. The game is using the normal efficient way of dynamic vertex buffers, but also writeonly managed vertex buffers (and lock them with no particular flags).

Potentially this case can be better optimized by nine, but I'd like to get more information on the expected 'optimizations'.

axeldavy commented 8 years ago

We found the inefficiency and are on it.

x09723593 commented 8 years ago

Great to hear. Thank you!

axeldavy commented 8 years ago

Some of my thoughts here, if siro reads, so he can give his opinion:

I checked the nvidia and ati docs (the latter was a bit wrong btw) on the behaviour, and checked the d3d ddi api as well. What is sure is:
. when a managed vertex buffer is used at a draw call, the vram copy is updated (not before).
. a special function is used for that copy, which uploads a range of the ram buffer to vram (so similar to lock with discard_range + memcpy).

Now some possible optimizations, and it is not clear what is done by the windows runtime:
. [No Optimization]: maintain a min/max for the area that gets locked by the user. Upload the entire [min, max] range in one go.
. [Opt 1]: remember all the subparts locked by the user, and upload each of them (so avoid uploading unlocked parts between two locks).
  => This is likely a pretty useless optimization: which game would do such locks? Still, if the runtime does that, some games could rely on it, and these games would be hit if we don't do that.
  => One way to test is to do two locks of different regions, write outside the locked bounds between the two locked regions, and see if it affects the rendering.
. [Opt 2]: keep a copy of the ram copy and compare byte by byte to find the first and last locations where something changed, in order to reduce the size uploaded.
  => Unlikely, because it seems memory intensive and cpu expensive, even though it saves bandwidth. Games that lock but don't really change the content much would get better performance (do some games really do that?).
. [Opt 3]: perhaps the runtime uploads only the intersection of the area needed by the current rendering operation with the dirty region. That is quite possible, and then if another draw call needs another area, there would be a new upload for that area. This could help performance if a game locks the entire buffer every frame but actually needs only a very small part of it for the frame.

These could be combined together. I plan to implement the [No Optimization] case, because it seems the best relative to cpu usage and memory usage. Opt 1-2-3 could help bad games get better performance by reducing bandwidth, but the gains for these are likely not worth the additional cpu time spent for the well-behaved games.
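To make the [No Optimization] case concrete, here is a toy Python model of the bookkeeping, not the actual nine code. The `ManagedBuffer` class, its method names, and the `uploaded_bytes` counter are all hypothetical; the only things taken from the discussion are the single min/max dirty range and the upload happening at draw time.

```python
class ManagedBuffer:
    """Toy model of a managed vertex buffer: a ram copy plus a vram copy."""

    def __init__(self, size):
        self.ram = bytearray(size)
        self.vram = bytearray(size)
        self.dirty = None          # half-open (start, end), None when clean
        self.uploaded_bytes = 0    # total bytes copied ram -> vram

    def lock_write(self, offset, data):
        """Simulate Lock + write + Unlock: only the ram copy changes."""
        end = offset + len(data)
        if offset < 0 or end > len(self.ram):
            raise ValueError("lock outside of buffer")
        self.ram[offset:end] = data
        if self.dirty is None:
            self.dirty = (offset, end)
        else:
            # Grow the single min/max range to cover the new lock.
            self.dirty = (min(self.dirty[0], offset), max(self.dirty[1], end))

    def draw(self):
        """At draw time, upload the whole dirty range in one go."""
        if self.dirty is not None:
            start, end = self.dirty
            self.vram[start:end] = self.ram[start:end]
            self.uploaded_bytes += end - start
            self.dirty = None


if __name__ == "__main__":
    buf = ManagedBuffer(64)
    buf.lock_write(0, b"\x01" * 8)
    buf.lock_write(32, b"\x02" * 8)
    buf.draw()
    print(buf.uploaded_bytes)
```

With two 8-byte locks at offsets 0 and 32, the model uploads the 40 bytes from 0 to 40, including the untouched gap. Opt 1 would upload only the 16 locked bytes, at the cost of tracking a list of ranges.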

Does that seem good/correct to you siro ?

siro20 commented 8 years ago

I'm fine with "No Optimization" and "Opt 3". Rule of thumb is "upload as much as possible", as PCIe transfers are fast but have a lot of overhead.
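For reference, the range intersection that Opt 3 relies on can be sketched as below. This is an illustration only: the function name `intersect` and the half-open `(start, end)` byte ranges are assumptions, and nothing here says how the windows runtime or nine actually does it.

```python
def intersect(dirty, needed):
    """Overlap of two half-open (start, end) byte ranges, or None if disjoint.

    dirty  -- range modified by locks since the last upload
    needed -- range the current draw call reads from the buffer
    """
    start = max(dirty[0], needed[0])
    end = min(dirty[1], needed[1])
    return (start, end) if start < end else None


if __name__ == "__main__":
    # The game locked the whole 4096-byte buffer, but the draw reads 256..512.
    print(intersect((0, 4096), (256, 512)))
```

In that example only the 256 bytes the draw call needs would be uploaded, and the rest of the dirty region would wait until a later draw call touches it.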

axeldavy commented 8 years ago

Please test master, I implemented the discussed behaviour.

I'm not sure the problem is fixed.

axeldavy commented 8 years ago

Ok, on my hd7730m while playing the game, I was actually able to see a difference with a lot of npcs (same scene: 22 fps before vs 30 fps after the patch).

I consider this bug fixed then. Reopen if not.