Open unknownbrackets opened 6 years ago
It's a shame GL_AMD_depth_clamp_separate is not more widely supported outside AMD. It seems like that would allow this to work perfectly. It seems like desktop GL 3.2+ clamping works the same as Vulkan.
I'm not sure if we could use a separate varying for depth - maybe it's an option but wouldn't that deform triangles AND break early z tests?
-[Unknown]
Added note above that Driver 76 is affected by this: it uses minz=00000, maxz=0012c, depth clamp=on.
We could change the heuristic to gstate.getDepthRangeMin() == 0 && gstate.getDepthRangeMax() == 65535
which would at least help that game, but probably trading bugs in other games (might be a good trade, not sure.)
OpenGL draws it wrong too, but because it doesn't even really handle minz/maxz at all (skips "accurate depth".) If we used depth clamping on desktop GL3+, it would have this same bug.
-[Unknown]
I guess the Right Way to implement this is using a geometry shader. It should happen before clipping, so we can clamp and let the rest be clipped. This of course wouldn't work on anything below ES 3.1, or in most cases ES 3.2.
In theory, we could generate up to 64 triangles per input. And more devices support geometry shaders than depth clamping, as far as I can tell...
The other benefit is, we could implement culling here, which might work more consistently..
-[Unknown]
I'm back with more Pursuit Force issues :wink: This time it only happens in D3D11 and Vulkan so I'm pretty sure this is the right issue to report that.
First of all, take a look at these gifs:
I'm holding the stick down and at the 23th frame, part of the helicopter (I'm assuming) covers the whole scene (minus the HUD).
GE dump: UCES00019.ppdmp at 0x09799B7C (1109/1348)
P.S. I can move this to #11216 if you want.
Edit: Welp, I reread the thread and it seems to be this again. Something between 06340bfa9..025e478d2 happend which fixed the yellow roadblock (and most importantly the Lost the Vulkan device! error) but did not fix this unfortunately.
@unknownbrackets FYI, the purposed workaround does fix my issue as well. Can we get that into master until a proper fix is created?
Edit: Just checked master, it's already done but with an OR instead? Why's that?
https://github.com/hrydgard/ppsspp/blob/a9302c49f0cd3edb03996bd6d9467a6dbc31c430/GPU/Vulkan/StateMappingVulkan.cpp#L264
Well, &&
and ||
are both wrong, it's just a question of which is wrong in more cases. Since you're trying to fix a specific issue, it's obvious there are cases where the ||
is wrong, but won't know where the &&
is wrong without testing it out in a bunch of games.
I think if we changed master to &&
, there are some games which would show a black screen instead of videos, have triangle cutouts in some scenes with missing background parts, etc. But I'm not sure the extent of problems it would cause.
To state it more clearly:
||
means, if we need to clamp either side, do it. It views clamping as the less risky operation.&&
means, only clamp if both sides need it. It views clamping as the more risky operation.The danger of not clamping something that needs to be clamped is that it won't be drawn or parts of it will not be. This could be missing UI elements, videos, objects, parts of scenery, etc. That's the danger of &&
.
The danger of clamping incorrectly (the danger of ||
) is that objects may be deformed. For example, if I'm drawing a tree, it may be squashed or something if it's clamped incorrectly. But they may also cover the screen like in your example. It's because the wrong side is being clamped too.
I think the only way to not be wrong is a geometry shader as mentioned above. That would do the clamping manually, working around Vulkan's lack of support.
-[Unknown]
I'm not sure, but possibly https://github.com/KhronosGroup/Vulkan-Docs/blob/master/appendices/VK_EXT_depth_range_unrestricted.txt would let us do more of this in the fragment shader instead. It's unfortunately less widely supported than geometry shaders.
Geometry shader support is actually appearing on many recent mobile devices (and nearly all desktop devices), so might finally be worth implementing a path for that. I think we are seldomly vertex limited so even though geometry shaders are known to be pretty slow it might be okay.
EDIT: I missed that you already posted this above unknown! Had completely forgot about it. Still, a good reminder!
It would in fact not deform triangles I believe, though yes of course, early z test breaks.
Old post:
I was recently made aware of the following trick, which lets us arbitrarily emulate clamping on any hardware that can write to Z in the fragment shader, which is pretty much everything we support,
https://stackoverflow.com/questions/5960757/how-to-emulate-gl-depth-clamp-nv
Since you say you're using OpenGL ES, but also mentioned trying to clamp gl_FragDepth, I'm assuming you're using OpenGL ES 2.0, so here's a shader trick: You can emulate ARB_depth_clamp by using a separate varying for the z-component. Vertex Shader:
varying float z;
void main()
{
gl_Position = ftransform();
// transform z to window coordinates
z = gl_Position.z / gl_Position.w;
z = (gl_DepthRange.diff * z + gl_DepthRange.near + gl_DepthRange.far) * 0.5;
// prevent automatic z-clipping between vertex and pixel shader
gl_Position.z = 0.0;
}
Fragment shader:
varying float z;
void main()
{
gl_FragColor = vec4(vec3(z), 1.0); // or whatever
gl_FragDepth = clamp(z, 0.0, 1.0); // can replace with only clamping in one direction, etc.
}
Now, writing to Z has the unfortunate side effect of turning off depth buffer optimization on a lot of hardware, but still better than rendering wrong. We should be able to use this to emulate everything the PSP does - although it won't be easy to get it exactly right.
@hrydgard Do you know why D11/VK is different from OGL in this case? Is it a limitation of VK and/or any different in D12? Solving this with shaders (or at all really) would be nice but what's the root cause here?
It would in fact not deform triangles I believe, though yes of course, early z test breaks.
Depending on the values of z and w, for a non-flat triangle (i.e. where its vertices have different z values), wouldn't forcing z to a static (flat) value in the vertex shader deform the triangle? Well, maybe nothing after projection can matter for z?
hrydgard Do you know why D11/VK is different from OGL in this case? Is it a limitation of VK and/or any different in D12? Solving this with shaders (or at all really) would be nice but what's the root cause here?
We have two ways of handling depth: the original way, and the more accurate way. They both have problems. OpenGL uses the original way by default. The original way doesn't work in other APIs because they expect depth ranges to be strictly ordered.
In the original (GL) method:
That implementation is wrong in a bunch of ways, but it accidentally gets some things right.
For example, imagine you and I were both trying to emulate "x = a b". My formula for this ends up actually being "x = a b + b". And your formula is the simpler "x = a + b".
Now, if a and b were both 2, I'd calculate 6. Darn it. And you, even though you think multiplication and addition are the same thing, would get 4 - the correct value. This might seem to prove to you that, indeed, multiplication and addition are identical operations. Instead, you just got lucky with the values of a and b.
If a and b were both 100, I'd get 10100, which is only 1% off. But you'd get 200, which is wildly wrong.
The fact is that games most often use values that both formulas get right, or close enough. And maybe just as many use ones that are wrong in the one as wrong in the other. It's annoying.
We could enable "accurate depth" on GL. However, we don't support depth clamping there (and many implementations don't either), so we use a fallback way of emulating depth clamping there, which is almost a third way. If you imagine depth as a line:
\/ values clamped to [ \/ clamp to ]
[========================================]
^ 0.0 ^ 1.0
That's real PSP depth, which always performs any clamping, if enabled, to [0, 1] (something none of our graphics backends can do accurately.) Any value outside the bar gets clamped when enabled.
When we don't have native depth clamping, the workaround we use is this:
\/ clamp to [ \/ clamp to ]
---------------[==========]---------------
^ clipped (lost) ^ 0.0 ^ 1.0 ^ clipped (lost)
So we scale down the bar. We lose values outside the bar (we would no matter what, that's because native depth clamping is not available), but we reserve some of the hardware's bar to fake clamping.
We try our best to treat the values below 0.0 the same as 0.0, and the values above 1.0 the same as 1.0. Effectively, this means if a depth value is within 150% of the normal depth range, it'll "clamp" correctly. Values outside this range will be clipped, which is wrong.
Since I'm illustrating, what this issue (#11399) is about is how native clamping interacts with clipping. PSP clamping works like this:
\/ clip \/ minz clip \/ clamp to ]
[XXXXXX==================================]
^ 0.0 ^ 1.0
Basically, clamping doesn't matter on the side you enable clipping. It still clamps the other side. X here represents clipped values.
But here's what we end up with:
\/ clamp to [ \/ clamp to ]
[==================================]
^ 0.0 ^ 1.0
Oops. We're just clamping everything. There's no clipping happening at all. It works fine if both sides clip, but hardware depth clipping/clamping don't let us have both sides work in this scenario. We either clip both sides, or clamp both sides. The PSP lets games control those independently, effectively.
I'm pretty sure we could sacrifice early z tests (potentially on a lot of draws, which would probably motivate a compat ini flag) to dodge this, i.e. by clipping with discard or finagling w to avoid the deform. I do still think a geometry shader could avoid that, i.e. by producing triangles clipped properly, which could then safely use hardware depth clamp.
See #6660 for more detail. See https://github.com/hrydgard/ppsspp/issues/6660#issuecomment-172677831 specifically.
-[Unknown]
That's a really great summary, thank you! I think I'm gonna copy paste it into GPUStateUtils.cpp or maybe a broken out DepthUtils.cpp one, if you don't mind?
The problem with geometry shaders is that on mobile it's only supported on the latest couple of generations, and ARM strongly recommends not using them. On the other hand our vertex counts are usually not that high compared to modern games so might not be an issue.
I actually think the mentioned technique might work without distorting triangles, without any further finagling. Let's think about how the coordinates of gl_Position are used.
Would have to try it though to be sure :) And the use of discard would make it even worse on mobile chips so geoshaders might still win.
I think that might work... just worried about the number of games that use depth clamp kinda all the time, and the early depth tests.
That said, I had another idea that might be able to preserve early Z tests and fix this issue: gl_ClipDistance (combined with depth clamp.)
So suppose:
gl_ClipDistance[0]
to the z value, or 1.0 - z
.0.0
in our example is already 0.5
.For draws where early depth tests would apply, they should still apply (provided they still work with depth clamp enabled on the device.) This would just add an additional clip plane, effectively re-enabling Z clipping.
I think Direct3D 11 has this too for SV_ClipDistance (as a vertex out.)
Potentially we could combine this with the non-depth-clamp example described above, too. If we mess with z, we could still apply user clipping on one side to clip appropriately.
-[Unknown]
That's an interesting idea. Unfortunately, gl_ClipDistance is another thing that's not universally supported - according to vulkan.gpuinfo.org, ARM GPUs seems to have a maxClipDistances of 0.
Doesn't mean the idea isn't useful, of course - could still be an alternative on other GPUs.
I think this can be closed as of https://github.com/hrydgard/ppsspp/pull/14833.
Well, that doesn't work everywhere, and also depth is still being clamped incorrectly. That change only alters some of the more visible effects of this, but the ASCII art I did above is still the case.
-[Unknown]
How about this now after https://github.com/hrydgard/ppsspp/commit/3ff400e40e122dcba771d32b8a064a3b42e7e371
It should be fixed for devices that support clip distance, which not all do. There's also a note that Test Drive Unlimited is still having problems in #16069.
-[Unknown]
Based on #11260 and #11393, the behavior of Vulkan's depth clamping is a bit wrong.
Affected games:
On the PSP, you've got the following depth parameters:
[0, 65535.5)
(after viewport) causes portions of triangles outside depth to flatten against the near/far plane. Otherwise, such triangles are culled.Importantly: depth clamp always clamps to the full possible range of Z, regardless of viewport or clipping. Clipping happens after, so if you clip to a narrower range like [10000, 50000], then clamping does nothing useful.
Vulkan does not work the same way. Vulkan clamps depth to the depth clipping range. This is probably causing the depth clamp part of #11216, and also causes an issue in #11260, as noted in #11393.
The simple workaround in #11393 is just to disable clamping when both minz and maxz are set. The troublesome case is when only one direction is clamped, and the other is clipped.
When Vulkan doesn't support depth clamp, we use a "buffer" area around depth instead. This works, but only to a point: after the buffer is exhausted, depth clamp doesn't work. This causes #9545 in Direct3D 11 (and Vulkan without depth clamp.)
To use depth clamp successfully in this case, we'd need to:
-[Unknown]