Should patch location/component space alias with per-vertex location/component space?

imirkin commented 7 years ago

It seems like per-patch and per-vertex varyings are largely separate beasts. However on careful reading of ARB_enhanced_locations and ARB_separate_shader_objects, it does not seem like any provision to separate the location/component spaces exists. And now a test has been added to CTS, KHR-GL45.enhanced_layouts.varying_location_aliasing_with_mixed_auxiliary_storage, which explicitly tests for an error to be detected when such an overlap occurs.

Should patch/non-patch space be totally separate? Or should it be one big happy family as far as locations/components are concerned? FWIW Mesa currently treats patch and non-patch varyings totally separately.

ghost commented 7 years ago

I am sure we've discussed this in the past, though can't remember what the reasoning was on going this route. I am sure it had something to do with the API limits - the spec required minimum would be impossible for any vendor who uses fixed, shared storage for these varyings as it would double the required storage for a given shader (probably TCS is more of a concern?).

@pdaniell-nv Do you remember where this one went? Think it might have been primarily discussed by the GL group.

pdaniell-nv commented 7 years ago

I don't remember specifically but I'm sure it's to allow implementations to use a separate bank of registers for the per-patch interface, which may have a different limit to the regular interface.

ghost commented 7 years ago

Ah you might be right.

Either way - I am confident we discussed this and came to the conclusion that the approach we have is the right one given what we have available to us - which is why the CTS issue got filed and the test developed.

An extension might be able to do something "better" but in the core spec, we have what we have.

@imirkin Does that answer your question? We could go to the groups and ask if anyone's willing to change it, but pretty sure Mesa's behaviour came up at the time and I don't see us walking back on the decision at this point - fairly sure that ship has sailed.

imirkin commented 7 years ago

On NVIDIA DX11 hardware there are separate input/output spaces, one for patch and one for vertex attributes (with shader input/output operations taking a special "patch" flag to distinguish between them). Starting with later generations, I also believe that there's been a shift towards making these spaces accessible directly via memory writes (Maxwell+),. On Adreno A4xx (and I believe A5xx as well), it's all done via memory so there are no restrictions. I'm less familiar with other hardware.

I suppose a hypothetical implementation that didn't have explicit per-patch support might emulate patch variables by just decreeing that those will be stored in a fixed vertex's locations, in which case overlap would be problematic. I don't know if such implementations exist though or are worth allowing.

Changing mesa to disallow such explicit overlap is doable, it just seemed like an odd and unnecessary restriction.

ghost commented 7 years ago

Right the point is though that we have nothing in the spec that expresses this variety of hardware very well. Not ideal, but it is what it is. In order for people to be able to write portable code, they have to make certain assumptions, this is one of them. We don't want people writing code on Mesa and having it work, only to go to another driver and having it fail with a (spec-correct!) failure. This is the essence of all negative testing done by the CTS.

If Mesa wants to advertise this as a feature, it should be drafted and submitted as an extension, because the core spec has no real way for this behaviour to be defined.

If you want to just relax the negative test, we can discuss that in the working groups, but it seems unlikely we'd change it now.

imirkin commented 7 years ago

@TobiasHector Yep, totally get it. Just wanted to be crystal clear that this is what the requirement was. Quite annoying when $IMPLA accepts things it shouldn't, and then applications tested against that impl are released, and don't work on $IMPLB. Esp if you maintain $IMPLB.

ianromanick commented 7 years ago

It's pretty clear that this should fail to compile. The GLSL spec says:

"A program will fail to link if any of the following occur....any two output variables from the same vertex or tessellation shader stage are assigned to the same location."

The part that I'm having trouble with is how the value specified in the location layout is validated. There are separate limits for per-vertex and per-patch outputs. If the values are in the same space, then all the slots can't actually be used in the presence of layout qualifiers? So, the following shader code is fine:

layout(vertices=4) out;
out float per_vertex[120][];  // Assume MAX TESS CONTROL OUTPUT COMPONENTS is 128
patch out float x; // Assume MAX TESS PATCH COMPONENTS is 120.

And this one is fine:

layout(vertices=4) out;
layout(location=1) out float per_vertex[120][]; // Assume MAX TESS CONTROL OUTPUT COMPONENTS is 128
layout(location=0) patch out float x; // Assume MAX TESS PATCH COMPONENTS is 120

But this can't compile:

layout(vertices=4) out;
layout(location=0) out float per_vertex[120][]; // Assume MAX TESS CONTROL OUTPUT COMPONENTS is 128
layout(location=121) patch out float x; // Assume MAX TESS PATCH COMPONENTS is 120

Is this correct, or I have misunderstood? The GLSL spec doesn't say which limits the locations are validated against, so I don't feel like I understand what the correct behavior is.

KhronosGroup / OpenGL-API

Should patch location/component space alias with per-vertex location/component space? #13