iXit / wine-nine-standalone

Build Gallium Nine support on top of an existing WINE installation
GNU Lesser General Public License v2.1

r500 hardware: Don't expose full “NPOT Textures” support in Gallium Nine #133

Open lorn10 opened 2 years ago

lorn10 commented 2 years ago

My "brainstorming" around bug #132 (which in the end turned out to be r300 driver related and was quickly fixed by @ondracka) :wink: led me to the following finding.

It looks like there is another feature, NPOT Textures, that Gallium Nine should restrict for r500 and all other pre-DX10 class ATI/AMD hardware (in practice, everything handled by the r300 Mesa driver). This is most likely also true for the corresponding NVIDIA GPUs of that era, such as the NV30 and NV40 series. However, regarding the latter I am not absolutely sure.

All of those pre-DX10 GPUs usually come with only limited support for NPOT Textures.

A blog post by Aras Pranckevičius shows what should be done about this in Gallium Nine. I quote:

Things are quite simple here. D3DCAPS9.TextureCaps has capability bits, D3DPTEXTURECAPS_POW2 and D3DPTEXTURECAPS_NONPOW2CONDITIONAL both being off indicates full support for NPOT texture sizes. When both D3DPTEXTURECAPS_POW2 and D3DPTEXTURECAPS_NONPOW2CONDITIONAL bits are on, then you have limited NPOT support.

I’ve no idea what would it mean if NONPOW2CONDITIONAL bit is set, but POW2 bit is not.

Hardware wise, limited NPOT has been generally available since 2002-2004, and full NPOT since 2006 or so.

So for the r500, and generally for all pre-DX10 class ATI/AMD hardware, both flags should be set to "on". As mentioned, this also seems to be true for the Nvidia NV30 GPU range. I have no clue about NV40, but most likely it doesn't support full NPOT either.
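The cap-bit logic from the quoted blog post can be sketched from the application side as follows. This is only an illustration: the two bit values are copied from the Direct3D 9 header `d3d9caps.h`, while the `npot_support` enum and the `classify_npot` helper are hypothetical names that do not exist in Gallium Nine or Wine. The "POW2 only" case is not covered by the quote; it follows the usual Direct3D 9 meaning of the POW2 bit.

```c
#include <stdint.h>

/* Bit values as defined in d3d9caps.h. */
#define D3DPTEXTURECAPS_POW2               0x00000002u
#define D3DPTEXTURECAPS_NONPOW2CONDITIONAL 0x00000100u

/* Hypothetical classification, for illustration only. */
enum npot_support {
    NPOT_NONE,      /* POW2 set, conditional bit clear: power-of-two only */
    NPOT_LIMITED,   /* both bits set: conditional NPOT support */
    NPOT_FULL,      /* both bits clear: unrestricted NPOT sizes */
    NPOT_UNDEFINED  /* conditional bit without POW2: meaning unclear */
};

/* Classify the NPOT level advertised in D3DCAPS9.TextureCaps. */
static enum npot_support classify_npot(uint32_t texture_caps)
{
    int pow2 = (texture_caps & D3DPTEXTURECAPS_POW2) != 0;
    int cond = (texture_caps & D3DPTEXTURECAPS_NONPOW2CONDITIONAL) != 0;

    if (!pow2 && !cond)
        return NPOT_FULL;
    if (pow2 && cond)
        return NPOT_LIMITED;
    if (pow2)
        return NPOT_NONE;
    return NPOT_UNDEFINED;
}
```

With the current behaviour described in this issue (both bits off), an application would land in the `NPOT_FULL` branch even on r500.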

I assume that the default setting is simply always "off" because Gallium Nine was mainly designed for newer DX10+ hardware. :wink:

Addition: ATI/AMD and Nvidia are "lying" about NPOT support in GL and GLES on DX9 hardware. Effectively, the hardware supports only the limited variant of NPOT; everything else is emulated in software.

axeldavy commented 2 years ago

Right, I think you are right about setting both of these flags for these old cards (ideally based on a gallium flag). Though maybe this won't be enough. What about the volume flag?
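A minimal sketch of what "setting both flags based on a driver flag" could look like on the reporting side. Everything here is an assumption for illustration: `npot_texture_caps` and its `full_npot` parameter are hypothetical, the real Gallium Nine caps code is not shown, and whether the cube map and volume POW2 bits should be set as well is exactly the open question raised above. Only the bit values are taken from `d3d9caps.h`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in d3d9caps.h. */
#define D3DPTEXTURECAPS_POW2               0x00000002u
#define D3DPTEXTURECAPS_NONPOW2CONDITIONAL 0x00000100u
#define D3DPTEXTURECAPS_CUBEMAP_POW2       0x00020000u
#define D3DPTEXTURECAPS_VOLUMEMAP_POW2     0x00040000u

/*
 * Hypothetical helper: return the NPOT-related TextureCaps bits to OR
 * into the caps reported to the application. `full_npot` stands in for
 * whatever driver-side flag would tell nine that the hardware handles
 * unrestricted NPOT textures.
 */
static uint32_t npot_texture_caps(bool full_npot)
{
    if (full_npot)
        return 0; /* both 2D bits off: full NPOT support */

    /* Limited NPOT: both 2D bits on. Restricting cube maps and volumes
     * to power-of-two sizes as well is an assumption, not a confirmed
     * requirement for r500. */
    return D3DPTEXTURECAPS_POW2 |
           D3DPTEXTURECAPS_NONPOW2CONDITIONAL |
           D3DPTEXTURECAPS_CUBEMAP_POW2 |
           D3DPTEXTURECAPS_VOLUMEMAP_POW2;
}
```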

lorn10 commented 2 years ago

Maybe @ondracka has some more detailed ideas on how best to implement this. He is already fixing the r300 black rendering problem (#134), and maybe he has the time and motivation to also add an "NPOT Textures" correction for older hardware in Gallium Nine.

ondracka commented 2 years ago

Actually, I won't be looking into this. There is a huge pile of more pressing r300 bugs, so I'm not particularly motivated to go chasing hypothetical problems. If I understand correctly, this issue is based solely on inspection of the nine code, and you are not actually aware of any app where the NPOT texture handling leads to specific rendering issues with the r300 driver?

lorn10 commented 2 years ago

I agree, this is effectively more of a "hypothetical thing". :wink: So it has no priority.

However, according to the information available, and if you follow the "playbook", only "limited NPOT Textures" support should be exposed for old DX9 class hardware, because that is all the hardware supports.

And yes, I did try the game "A Hat in Time" (GOG version) on my RV530-based iMac5,1 computer as well. I hoped to somehow provoke an "NPOT texture situation" there.

But I failed dramatically right at the beginning. This game seems to be "monstrous"; it is simply too heavy for my old iMac5,1. It was a challenge even to get it working on my iMac12,2 computer. Maybe this is one of the most complex D3D9 games ever produced. It gave me some strange errors on the CLI and then totally messed up my Wine prefix. Luckily I had made a copy of the prefix :+1: More information can be found at my comment here.

Anyway, maybe Axel Davy can put this on a nine "ToDo list". And if it's too complex to implement, then it can be left as it is.

axeldavy commented 1 year ago

@lorn10 I've some code that should fix r500's support of 256 constants (Last 3 commits of https://github.com/iXit/Mesa-3D/commits/master) in case you want to test.

mirh commented 1 year ago

https://www.khronos.org/opengl/wiki/NPOT_Texture#Older_hardware
https://forum.dxgl.info/viewtopic.php?t=13

ondracka commented 1 year ago

> @lorn10 I've some code that should fix r500's support of 256 constants (Last 3 commits of https://github.com/iXit/Mesa-3D/commits/master) in case you want to test.

I did some quick testing, and now we fail to compile pretty much any vertex shader using relative addressing. The issue with r500 hardware is that not only constants but also immediates (I'm using the TGSI nomenclature here, as I don't know much about DX) must fit into the 256 constants limit. And what sucks the most is that for vertex shaders we don't have any inlining options (we can inline 1, 0, -1 using the constant swizzles and that's it). To be honest, I have no clue how this is supposed to work (how this works on Windows). I'd be happy to implement/enhance stuff on the driver side, but I really have no idea. As I see it, there are two options:

axeldavy commented 1 year ago

@ondracka Is there an option to inline the offset for relative addressing ?

The difference with this series is indeed that when relative addressing is used, all 256 constants are declared as used, leaving no space for the immediates. One thing nine does is read the immediates, if it has them, and write their values into the constant buffer so that they work with relative addressing. We could instead always read the constant buffer, not the immediate, for r500. The question is whether that will be enough for the few immediates that the d3d asm -> TGSI translation generates.
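The slot arithmetic behind this problem can be sketched as below. It is a simplified model built only from what is said in this thread: 256 slots shared by constants and immediates, all 256 constants counted as used once relative addressing appears, and 0, 1 and -1 being free because they can be inlined via constant swizzles. The helper names are hypothetical, and packing four scalar immediates into one vec4 slot is an optimistic assumption, not the behaviour of the r300 compiler.

```c
#include <stdbool.h>
#include <stddef.h>

#define R500_VS_CONST_SLOTS 256u /* limit discussed in this thread */

/* 0, 1 and -1 can be inlined via constant swizzles, so they need no slot. */
static bool is_inlinable(float v)
{
    return v == 0.0f || v == 1.0f || v == -1.0f;
}

/* Hypothetical model: number of vec4 slots a vertex shader would need. */
static unsigned vs_slots_needed(unsigned declared_consts,
                                bool relative_addressing,
                                const float *imms, size_t n_imms)
{
    /* With relative addressing, all 256 constants count as used. */
    unsigned consts = relative_addressing ? R500_VS_CONST_SLOTS
                                          : declared_consts;
    unsigned scalars = 0;

    for (size_t i = 0; i < n_imms; i++)
        if (!is_inlinable(imms[i]))
            scalars++;

    /* Assumption: up to four scalar immediates share one vec4 slot. */
    return consts + (scalars + 3u) / 4u;
}

static bool vs_fits(unsigned declared_consts, bool relative_addressing,
                    const float *imms, size_t n_imms)
{
    return vs_slots_needed(declared_consts, relative_addressing,
                           imms, n_imms) <= R500_VS_CONST_SLOTS;
}
```

In this model, a shader using relative addressing fits only if every remaining immediate is inlinable or is read from the constant buffer instead.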

axeldavy commented 1 year ago

I think if you can work with constructing the few immediates that are generated by the asm conversion (-1, 0, 1, 0.5 and 2, if I checked correctly), plus the integer offset of relative addressing, then everything could fit.

EDIT: Actually, this is more complicated than that. Many of these constants are not in paths that can trigger when we use relative addressing. However, there are a few immediates that are used to enforce d3d9 behaviour for a few things that I bet r500 doesn't need. For example, the a0 clamping eats a few immediates, as does the pointsize clamping. We'll need to continue this discussion elsewhere.