HansKristian-Work / vkd3d-proton

Fork of VKD3D. Development branches for Proton's Direct3D 12 implementation.
GNU Lesser General Public License v2.1

Question about poor performance in Cyberpunk 2077 and Horizon Zero Dawn (d3d12) on older Nvidia cards #465

Open pppbb opened 3 years ago

pppbb commented 3 years ago

System: Arch Linux 5.9.13, GTX 1070 (driver 455.46.02), Ryzen 9, latest git builds of DXVK and vkd3d-proton, Wine Staging 6.0-rc2. Attachment: vulkaninfo.txt. When I run the mentioned games on Windows, I get a stable 60 fps at 1080p with high settings. On Linux I get below 30 fps regardless of the settings I use. Do you have any idea why this happens? Is it because the driver doesn't support Valve's extension?

mozo78 commented 3 years ago

I think it's the driver. You can try installing DXVK for Horizon Zero Dawn. When the game uses DXVK's dxgi, it runs better on my end: https://imgur.com/xEvRO3p https://imgur.com/5oqNTM7 https://imgur.com/T4algdB

doitsujin commented 3 years ago

Low D3D12 performance on Nvidia Pascal (and older) GPUs is expected and likely won't improve much. The hardware has a bunch of limitations that make it very hard to extract good performance. Turing fares better, but only AMD actually runs reasonably well right now.

pppbb commented 3 years ago

Thanks for the reply. My old graphics card has served me well for over 4 years; it's time to buy a new one. Could you say a few words about Ampere performance? I was going to buy an RTX 3080, but I'm waiting for better availability of cards.

WinterSnowfall commented 3 years ago

> Low D3D12 performance on Nvidia Pascal (and older) GPUs is expected and likely won't improve much. The hardware has a bunch of limitations that make it very hard to extract good performance.

Very hard... but not impossible? :) I was hoping to hold onto my GTX 1080 for a while longer, but anyway, if you won't be able to find a way, I'm not sure who can.

Since I've suppressed the urge to butt into various other threads, I'll consolidate it and take a moment now, with this comment, to thank you all for the hard work you're putting into improving the general state of gaming in Linux. I'm sure you're getting a lot of grey hairs out of it, while we're reaping the benefits (do you even have the time to play any of these games any more?) - least we can do is thank you every now and then and cheer you guys on.

doitsujin commented 3 years ago

> Very hard... but not impossible? :)

Might be possible, but we'd need help from Nvidia to figure out what's going on and why it's so slow. We don't even fully understand that much. And then we'd need to invest probably several weeks or months into rewriting large parts of the code base yet again to make specifically these GPUs happy, and some issues that are known to impact perf are impossible to fix given design differences between D3D12 and Vulkan as well as hardware limitations.

In other words, not going to happen for an aging GPU architecture that won't be relevant in a few years.

WinterSnowfall commented 3 years ago

> but we'd need help from Nvidia

So perhaps it is impossible after all :)...

> In other words, not going to happen for an aging GPU architecture that won't be relevant in a few years.

I still hope maybe luck will strike and something/anything could be done to improve things without that much overhead, but I see your point. It's a fair assessment.

You mentioned Turing fares better than Pascal and older, but, just out of curiosity, how about Ampere? Or is it unobtanium for you as well at this point?

SveSop commented 3 years ago

@WinterSnowfall You should just bite the bullet and buy an AMD card. Less "playing the blame game" there due to availability of working opensource drivers.

WinterSnowfall commented 3 years ago

@SveSop Thanks, I'm aware of the (Linux) graphics driver landscape and I recognize sound advice when I hear it. I've already decided to bite that bullet at some point. Alas, I was only curious how Ampere performs in this context and not really looking or in a position to buy anything at the moment.

delilahlah commented 3 years ago

Just reading through this and it seems like a real bummer that I may be retiring my 1080ti a year or two earlier than planned. Yeah, it's an aging card for sure, but on the other side of the coin, I'd be ditching a 3 year old flagship card.

Has Nvidia been totally silent on helping out the vkd3d team on the performance front, or is it purely down to the architecture of the card?

SveSop commented 3 years ago

> Has Nvidia been totally silent on helping out the vkd3d team on the performance front, or is it purely down to the architecture of the card?

I think maybe it would be a lot easier with an open-source driver, cos then you (the programmers) would KNOW what the driver does with the hardware, and could possibly change stuff in the driver as well. We have seen instances of Nvidia implementing stuff in the driver to suit various Vulkan bits, so it CAN happen, but it seems to be a slow process, and since Nvidia is in the business of selling hardware you could perhaps say it is not really in their best interest to make old cards live longer than necessary :)

Cynical I guess, but let's face it: if you make a piece of hardware you earn money on, it IS easier for said makers to claim "hardware limitations", so that you would run out and buy the next piece of hardware :) The proof of the claim being your situation atm: you are considering retiring a perfectly usable card in favor of a "working" card. What if Nvidia suddenly came up with a driver that had all the right extensions and whatever else is needed by vkd3d... would you still buy it? If not, limiting stuff in (binary-blob) drivers will hasten your decision to buy new hardware and hopefully put more $$ in someone's pockets. What if VK_VALVE_mutable_descriptor_type ends up only being available/usable/performant enough for the RTX 30xx series and I really, really wanna play Cyberpunk 2077? Nothing "we" can do about that, other than jump ship to something better suited for Linux and open-source drivers.

Currently the RTX 20xx series is not too bad for what I use it for (not Cyberpunk), and the availability of the new RX 6xxx series GPUs is nil where I live, so that's not really an option (yet).

delilahlah commented 3 years ago

> Has Nvidia been totally silent on helping out the vkd3d team on the performance front, or is it purely down to the architecture of the card?

> I think maybe it would be a lot easier with an open-source driver, cos then you (the programmers) would KNOW what the driver does with the hardware, and could possibly change stuff in the driver as well. We have seen instances of Nvidia implementing stuff in the driver to suit various Vulkan bits, so it CAN happen, but it seems to be a slow process, and since Nvidia is in the business of selling hardware you could perhaps say it is not really in their best interest to make old cards live longer than necessary :)

> Cynical I guess, but let's face it: if you make a piece of hardware you earn money on, it IS easier for said makers to claim "hardware limitations", so that you would run out and buy the next piece of hardware :) The proof of the claim being your situation atm: you are considering retiring a perfectly usable card in favor of a "working" card. What if Nvidia suddenly came up with a driver that had all the right extensions and whatever else is needed by vkd3d... would you still buy it? If not, limiting stuff in (binary-blob) drivers will hasten your decision to buy new hardware and hopefully put more $$ in someone's pockets. What if VK_VALVE_mutable_descriptor_type ends up only being available/usable/performant enough for the RTX 30xx series and I really, really wanna play Cyberpunk 2077? Nothing "we" can do about that, other than jump ship to something better suited for Linux and open-source drivers.

> Currently the RTX 20xx series is not too bad for what I use it for (not Cyberpunk), and the availability of the new RX 6xxx series GPUs is nil where I live, so that's not really an option (yet).

I have thought a lot about the possibility of Nvidia purposely crippling older hardware via drivers, but I don't think that's really in their best interest as far as selling me a new card. Put simply, it's possible, but for them to do that to a 3 year old card doesn't exactly make me want to buy a new Nvidia card; quite the opposite. Now I'm thinking about pulling the trigger on a 6800 XT when they become available. I was planning on waiting out this gen and buying the 3080 Ti. This doesn't only lose Nvidia a sale, it loses them many future sales.

Anyway, this is all speculation and for our purpose (Linux gaming), choosing a card that has open source drivers is better in the long run considering we are relying on them properly interfacing with open source projects like these.

SveSop commented 3 years ago

I did not really think about this as "crippling older hardware", cos that kinda implies they purposefully do stuff to make older hardware worse than it is. I do not really think THAT is the case. The problem with closed-source drivers is that outside programmers simply cannot know how Nvidia "unlocks" new features in their NEW brand of cards.

As you said, this is pure speculation, and I do not really think Nvidia is in the business of just re-branding old GPUs and selling them as "new" with "better driver functions" to earn money... by no means :) Let's just leave it at that.

Let us just conclude with the notion that it is easier for open-source projects like vkd3d/dxvk to optimize functions when they actually know what the drivers do AND have an easier time influencing changes in drivers to do said functions in a different/better manner if needed :)

iWeaker commented 3 years ago

Pascal does not have async compute engines at the hardware level. Typical shady Nvidia marketing makes it sound that way, but what they have is a solution at the "hardware" level (note the sarcasm), so that not much performance is lost when running a game in DX12 mode. Since we are talking about VKD3D here, the lack of hardware-level async compute is where you see it lose a lot of performance, because Vulkan makes use of async compute. As I understand it, the RTX 20 series, GTX 16 series and RTX 30 series have native async compute support.

K0bin commented 3 years ago

VKD3D-Proton doesn't support async compute at all. Not even on AMD GPUs.

AndreasSturm commented 3 years ago

Here is some data on the somewhat weird behaviour of the GTX 1070 while playing the GOG version of Horizon Zero Dawn:

System specs: Ryzen 5 1600X, GTX 1070 (driver 460.32.03), custom water cooling (important for interpreting the temperature data). Distro: Kubuntu 20.10 with Xanmod 5.10.7, Wine-Staging 6.0 RC6, Lutris runner 6.0 RC1. MangoHud was used for data collection.

The GPU load is always at 100% utilization (in MangoHud and Green with Envy), but the temperature doesn't go above 32C at a core frequency of 2088MHz (normally it would be 40-44C)! The GPU is reported as fully utilized, but in my opinion some parts of it aren't being used at all. The CPU doesn't go above 10% (normally it would be 40%), which suggests the CPU is held back by the low real GPU utilization (it shows 100%, but that can't be right).

Is it really a VKD3D problem, or is the problem the NVIDIA driver, which doesn't support VKD3D properly?

SveSop commented 3 years ago

> Is it really a VKD3D problem, or is the problem the NVIDIA driver, which doesn't support VKD3D properly?

Kinda hard to say without a comparison with Windows, I guess. The GPU showing 100% utilization while not reaching 40+ degrees does not necessarily mean it's "not working hard", cos that probably depends on what workload it is doing. If the pattern is low fps -> 100% utilization -> low GPU temp, then it could be that some (VKD3D?) functions have to be executed that don't really create a huge workload, but still tax the GPU and create a bottleneck. Could it be missing optimizations/functions in the driver? Also an option.

I mean, look at running prime95 on all CPU cores with or without AVX. World of difference in temps, same 100% load (when viewing with htop or whatnot).

WinterSnowfall commented 3 years ago

After 89fbe33 I am seeing about a 10% improvement in CP2077 even on Pascal (yey!). It's now sitting at a somewhat decent 40-50 fps on average @ 1080p Medium on my GTX 1080. The only problem is that in some areas it slows down to slideshow levels (below 10 fps). Is it expected to get that bad?

I would submit a trace if that were possible, but not sure if you have any Pascal hardware around to test with anyway. Just walking through the Arasaka HQ hallways on ground level would reproduce the issue (Corpo path). As far as I could tell, looking towards the entrance of the building (the hall that has the robot in the center) somehow drastically impacts framerate regardless of settings. There are other places where similar slowdowns are noticeable, but this is the earliest I've seen it in the game.

Edit: In case anyone else runs across these things, turns out disabling logging fixes the slowdowns and drastically improves frametimes. It's all just a VKD3D_DEBUG=none line away.
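
For anyone who launches the game from a script rather than a shell, here is a minimal Python sketch of the same idea. The only thing taken from this thread is the `VKD3D_DEBUG=none` setting; the helper names `vkd3d_quiet_env` and `launch_quietly` are made up for illustration, and the command you pass in is whatever you normally use to start the game.

```python
import os
import subprocess


def vkd3d_quiet_env(base_env=None):
    """Return a copy of the environment with vkd3d-proton logging turned off."""
    # Copy so the caller's mapping (or os.environ) is never modified in place.
    env = dict(os.environ if base_env is None else base_env)
    env["VKD3D_DEBUG"] = "none"
    return env


def launch_quietly(command):
    """Run `command` (a list of arguments) with VKD3D_DEBUG=none set."""
    return subprocess.run(command, env=vkd3d_quiet_env(), check=False)
```

In a Steam launch-options field the equivalent is simply prefixing the command with `VKD3D_DEBUG=none`.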

pppbb commented 3 years ago

> Edit: In case anyone else runs across these things, turns out disabling logging fixes the slowdowns and drastically improves frametimes. It's all just a VKD3D_DEBUG=none line away.

Worked for me too. Thanks.

Joshua-Ashton commented 3 years ago

What logging were you hitting before?

pppbb commented 3 years ago

> What logging were you hitting before?

In my case it was default logging. No changes by me.

Joshua-Ashton commented 3 years ago

Can you please send a log file without VKD3D_DEBUG=none?

pppbb commented 3 years ago

log.txt

WinterSnowfall commented 3 years ago

> Can you please send a log file without VKD3D_DEBUG=none

The slowdowns occur when the following line is repeatedly printed: fixme:d3d12_command_list_update_dynamic_state: Binding VBO with stride 9, but required alignment is 4. Oddly, they do not manifest when redirecting the output to a file (via VKD3D_LOG_FILE), only with the usual console stdout. I guess suppressing console output through other means provides a similar fix.

Here is the log I've captured: Cyberpunk2077.log - Nvidia 460.67, GTX 1080.

P.S.: Looking at pppbb's ninja logs, posted above, reminded me I also noticed the output gets truncated in console for some reason.
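
To check whether a single message is flooding a captured log, as described above for the stride warning, a rough Python sketch is shown below. It assumes only that spammy lines contain the text `fixme:`, as in the line quoted in this comment; the function name `top_fixme_lines` is hypothetical, and real logs may carry extra prefixes that would need stripping before counting.

```python
from collections import Counter


def top_fixme_lines(lines, limit=5):
    """Count identical 'fixme:' lines and return the most frequent ones."""
    counts = Counter(line.strip() for line in lines if "fixme:" in line)
    # most_common() sorts by count, highest first.
    return counts.most_common(limit)


def top_fixme_lines_in_file(path, limit=5):
    """Same as top_fixme_lines, but reads the lines from a log file."""
    with open(path, "r", errors="replace") as handle:
        return top_fixme_lines(handle, limit)
```

A message that shows up many thousands of times in a short session is a good candidate for the kind of console-output stall reported here.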

howdev commented 3 years ago

vkd3d performance is very poor, and not just in Cyberpunk as mentioned here. Borderlands 3 is also slow when running DX12; using DX11 is fine.

TheMachine02 commented 2 years ago

Monitoring power consumption gives a small indication that the Nvidia card is under-utilized for some strange reason. When I try HZD on Maxwell-generation hardware (GTX 970M), it stays at around 45W with 100% GPU utilization, whereas in a standard game, or on Windows, power consumption is around 90W.

Makes me wonder what is going on here, and whether the driver is just plain broken and schedules badly.
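
For anyone who wants to collect the same kind of numbers, here is a small Python sketch that samples power draw, utilization and temperature through `nvidia-smi`. It assumes `nvidia-smi` is on the PATH and that the card reports all three fields (some laptop GPUs return a "not supported" value for power, which this sketch does not handle). The helper names `parse_sample` and `read_sample` are made up for illustration.

```python
import subprocess

# Query fields understood by nvidia-smi's --query-gpu option.
QUERY = "power.draw,utilization.gpu,temperature.gpu"


def parse_sample(line):
    """Parse one CSV line such as '45.00, 100, 32' into a dict of floats."""
    power, util, temp = (field.strip() for field in line.split(","))
    return {"power_w": float(power), "util_pct": float(util), "temp_c": float(temp)}


def read_sample():
    """Ask nvidia-smi for a single sample from the first GPU."""
    output = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_sample(output.splitlines()[0])
```

Logging a sample every second while playing, once under vkd3d-proton and once in a D3D11 or native Vulkan title, gives a like-for-like view of power against reported utilization on the same driver.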

SveSop commented 2 years ago

@TheMachine02 Although I follow your logic, it might not be "just that easy". I mean, let's say vkd3d uses 5 function calls to do D3D12 "function X". Now, these 5 calls might not be power-hungry, but they would still mean a certain % utilization. If Windows D3D12 does this "function X" as a single D3D function, it might show less % utilization but more power, so the two might not be directly comparable.

Just look at power consumption vs. utilization on the CPU when doing AVX-intensive calculations.

I am not sure you can use vkd3d on Win10, but to compare power usage when it comes to the driver you would really need to do that, or else I would say you are comparing apples to pears.

WinterSnowfall commented 2 years ago

> Makes me wonder what is going on here, and whether the driver is just plain broken and schedules badly.

Same. But since Nvidia drivers are closed-source black boxes, we might never find out.

I agree with SveSop's point, but the bottom line is that you're right and Pascal and earlier GPUs are underutilized. To be honest, I doubt there's any interest on Nvidia's side in addressing this though.

mirh commented 2 years ago

Yes, you can use VKD3D on Windows (even 7, in fact). The problem is, it's hard to profile Vulkan on anything before Turing.

K0bin commented 2 years ago

> The problem is, it's hard to profile Vulkan on anything before Turing.

It's less about needing to profile it to find out why it's slow and more that it just can't be fixed.

WinterSnowfall commented 2 years ago

> It's less about needing to profile it to find out why it's slow and more that it just can't be fixed.

Which is ultimately very unfortunate :disappointed: . While I would normally agree with doitsujin's statement that Pascal is starting to show its age and will no longer be relevant in a few years, if you look at the current market I expect a lot of people to be stuck with that generation of hardware for a long time to come. Only an insignificant number of them will be gaming on Linux, so there's that at least, but I expect complaints will keep flowing in, regardless of their futility...

Daasin commented 2 years ago

> Has Nvidia been totally silent on helping out the vkd3d team on the performance front, or is it purely down to the architecture of the card?

> We have seen instances of Nvidia implementing stuff in the driver to suit various Vulkan bits, so it CAN happen, but it seems to be a slow process, and since Nvidia is in the business of selling hardware you could perhaps say it is not really in their best interest to make old cards live longer than necessary :) Cynical I guess, but let's face it: if you make a piece of hardware you earn money on, it IS easier for said makers to claim "hardware limitations", so that you would run out and buy the next piece of hardware :) The proof of the claim being your situation atm: you are considering retiring a perfectly usable card in favor of a "working" card. What if Nvidia suddenly came up with a driver that had all the right extensions and whatever else is needed by vkd3d... would you still buy it? If not, limiting stuff in (binary-blob) drivers will hasten your decision to buy new hardware and hopefully put more $$ in someone's pockets. What if VK_VALVE_mutable_descriptor_type ends up only being available/usable/performant enough for the RTX 30xx series and I really, really wanna play Cyberpunk 2077?

> I have thought a lot about the possibility of Nvidia purposely crippling older hardware via drivers, but I don't think that's really in their best interest as far as selling me a new card. Put simply, it's possible, but for them to do that to a 3 year old card doesn't exactly make me want to buy a new Nvidia card

> Anyway, this is all speculation and for our purpose (Linux gaming), choosing a card that has open source drivers is better in the long run considering we are relying on them properly interfacing with open source projects like these.


Perhaps there could be some workaround for titles like Deathloop (https://github.com/ValveSoftware/Proton/issues/5156) that requires neither the missing hardware support nor Nvidia open-sourcing all of its drivers and tools, which, let's face it, we can't make happen. Given the shortages and Nvidia's reputation among average users for being the best at content creation as well as gaming, a lot of people unfortunately lean towards PCs with its products in them.

It could be framed as "this new feature is only available on our newer cards, replace your old one to use it", so I doubt it would really hurt their sales; if anything, people would only be more incentivised to try to pay for new ones. I'm not putting on a tinfoil hat and saying that's what happened, only making the point that we can't assume good faith from Nvidia (same as with any company that has shareholders to answer to). There are a lot of people who aren't able to upgrade past Pascal right now and so are stuck without Turing or Ampere :/ That's still a big part of the market without full support.

@doitsujin So with that said, isn't there something to gain from working with the Vulkan devs to get alternative extensions in? Even without bindless uniform (or other) buffers, surely there must be some kind of workaround that can be fought for, rather than resigning pre-Turing users to being SOL with modern DX12-exclusive apps. I'm begging you. I know users often give crap to maintainers who take their own time to help other people, but please: Pascal wasn't even that old when all this was in play, and it still got new stuff like "Hardware Accelerated GPU Scheduling" :( 🙏 😭

Daasin commented 2 years ago

And is there perhaps anything in Vulkan 1.3 that you could work with them on to get a workaround extension for the lack of bindless uniform buffers? 🤞🏼🙏🏼

K0bin commented 2 years ago

Vulkan 1.3 doesn't have any new features. It just promotes a bunch of extensions to core Vulkan functionality.

Unaccounted4 commented 2 years ago

Elden Ring also uses DX12 exclusively from the look of things and the performance on my GTX 1080 is ... cinematic!

https://i.imgur.com/z5oXGVk.jpg

2560x1440 - Maximum Settings

In order to maintain 60 with just the occasional drop, I need to lower the resolution to 1600x900 and the preset to Medium.

https://i.imgur.com/Xi8608h.png

WinterSnowfall commented 2 years ago

I hadn't tested HZD in a while... but with vkd3d-proton 2.6 things have improved significantly even on my "Crapscal" :+1: . I mean it's still worse than it should be by all accounts, but at least now I can play the damn thing and not look at 15-18 fps @1080p Medium. It's more into 60fps @1080p High territories now, which is nice. There's still the issue of constant stuttering, which is a bit annoying, but I'm holding out for that shader cache :wink: .

jilv220 commented 2 years ago

For anyone using a 10xx series card under Linux: you have to disable "async compute" to play CP2077 smoothly.

I had a GTX 1650 card. After I disabled async compute, I could play CP2077 at 60 fps with WINE_FSR_ENABLED=1 and a resolution of 1477 x 831.

Under ../steamapps/common/Cyberpunk 2077/engine/config/platform/pc/ folder.

Create a file called "configAsync.ini".

Copy paste the following text:

[Rendering/AsyncCompute]
BuildDepthChain = false
DynamicTexture = false
Enable = false
FlattenNormals = false
HairClears = false
LutGeneration = false
RaytraceASBuild = false
SSAO = false
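
The steps above can also be scripted. The Python sketch below writes the same `configAsync.ini` with the standard `configparser` module; the section name, keys and values are copied from this comment, while the function name `write_async_config` is made up for illustration. The config directory is passed in by the caller, since the full path depends on where the game is installed.

```python
import configparser
from pathlib import Path

# Settings copied from the comment above; every async compute pass is disabled.
ASYNC_COMPUTE_OFF = {
    "BuildDepthChain": "false",
    "DynamicTexture": "false",
    "Enable": "false",
    "FlattenNormals": "false",
    "HairClears": "false",
    "LutGeneration": "false",
    "RaytraceASBuild": "false",
    "SSAO": "false",
}


def write_async_config(config_dir):
    """Write configAsync.ini into config_dir and return the file path."""
    parser = configparser.ConfigParser()
    # By default configparser lowercases keys; keep the original capitalisation.
    parser.optionxform = str
    parser["Rendering/AsyncCompute"] = ASYNC_COMPUTE_OFF
    path = Path(config_dir) / "configAsync.ini"
    with open(path, "w") as handle:
        parser.write(handle)
    return path
```

Call it with the game's `engine/config/platform/pc` directory; deleting the file again restores the game's default behaviour.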

JJL9523 commented 2 weeks ago

> Low D3D12 performance on Nvidia Pascal (and older) GPUs is expected and likely won't improve much. The hardware has a bunch of limitations that make it very hard to extract good performance. Turing fares better, but only AMD actually runs reasonably well right now.

Does this apply to the 1650 and 1660? I own a 1650 and was wondering if the performance will be slow despite it being considered a Turing card.