hrydgard / ppsspp

A PSP emulator for Android, Windows, Mac and Linux, written in C++. Want to contribute? Join us on Discord at https://discord.gg/5NJB6dD or just send pull requests / issues. For discussion use the forums at forums.ppsspp.org.
https://www.ppsspp.org

LocoRoco2 Tropuca 1 rendering issue with Direct3D near the end #12058

Open nagisa opened 5 years ago

nagisa commented 5 years ago

What happens?

Direct3D 9/11:

directx

What should happen?

OpenGL:

opengl

What hardware, operating system, and PPSSPP version? On desktop, GPU matters for graphical issues.

Radeon R5 230, Windows 10, PPSSPP v1.8.0.

hrydgard commented 5 years ago

So missing ground, missing loco, and strange colors? Or are the colors normal?

nagisa commented 5 years ago

From what I can tell this is a z-index issue or something similar. Somewhere further along the level after the "underwater" part ends (and thus the underwater background is cut off), you can see the missing layers (containing ground and the loco) get drawn from under the same cut-off. With that in mind I would bet that colours are fine – there will probably be some sort of colour blending going on.

This is the only segment and the only level so far where I could observe an issue like this.

Hope the explanation makes sense.

nagisa commented 5 years ago

I can try to make a screenshot of the cut-off later.

nagisa commented 5 years ago

Cutoff pictured. This cutoff, its inclination, etc. move around smoothly. I think I’ve seen a corner of this "layer" in a few cases as well.

Cut-off pictured

hrydgard commented 5 years ago

That is really quite odd :) Thanks for reporting this. Which world and level is this?

nagisa commented 5 years ago

Not sure what you mean by "world", level is Tropuca 1.

hrydgard commented 5 years ago

Oh, right, was thinking LocoRoco 1.

unknownbrackets commented 5 years ago

Could you try exporting a GE frame dump? These help a lot.

See here for instructions - it's not hard and works on Android too: https://github.com/hrydgard/ppsspp/wiki/How-to-create-a-frame-dump

You can zip that and then drag and drop it into a reply here.

-[Unknown]

nagisa commented 5 years ago

UCES01059_0001.ppdmp.gz

unknownbrackets commented 5 years ago

The missing draw is at 234/244, depth settings:

The depth of 1.0 is written at 111/244:

It seems like this particular draw should not happen / not be clamped? If it's skipped, the red area shows as expected.

This is the correct PSP output: UCES01059_#12058_locoroco_depth

-[Unknown]

Sanaki commented 2 years ago

This is easier to verify at the start of Tropuca 2, where it seems to originate from the hermit crab thing circling in the path after the two corals you push past. Unfortunately, I just tested this with both standalone and the libretro core, with both Vulkan and OpenGL, and am seeing the issue across the board now. Tested on x86_64 Linux with f89d5b75a5f6cfcee91c3341f4be9a391e44367a. Tested the Windows libretro core via Wine with OpenGL as well, same result.

Having bisected the issue, OpenGL started exhibiting the same buggy behavior in df6abe83a379ab5bc2802e2d838262d4cfdbcede. Vulkan seemingly always did. As of that commit, no renderer is capable of working correctly in these areas.

unknownbrackets commented 2 years ago

I was looking at this again and found an interesting typo in some of the tests I used for depth clamp - although I remember checking some games to validate this behavior as well. It was a simple, but stupid mistake: one of the coord's z was not being set correctly (rather, I set another coord twice.)

Will validate more, but this might explain some of the z clamp behavior from a few issues.

-[Unknown]

unknownbrackets commented 2 years ago

Okay, so the behavior that would fix this (and I suspect others) is that some additional triangles are culled, when:

My previous tests on this were meant to be flat, but were incorrectly not flat.

The interesting thing is how "Z outside" is defined, because it's irrespective of viewport Z scale or center. Instead, after applying view, model, and projection - it's actually down to the bits.

Any Z value with greater magnitude than 0x3F8000FF (i.e. where its 24-bit truncation would be greater than 1.0) is considered outside. Same for negative. It could even check (value & 0x7FFFFF) >= 0x3F8000, since this only happens for that exponent value.
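As a rough illustration of the bit-level rule above, here is a minimal C++ sketch. The helper names (`FloatBits`, `IsOutsideZ`, `IsOutsideZ24`) are hypothetical and not taken from the PPSSPP source. It uses a strict comparison to match the "greater than 1.0" wording, which is an assumption about the edge case at exactly 1.0. Note that with this comparison NaN and infinity bit patterns also count as outside, which is a property of the sketch rather than a confirmed hardware behavior.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: read the raw IEEE 754 bits of a 32-bit float.
inline uint32_t FloatBits(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

// "Outside Z" on the full 32-bit pattern: the magnitude (sign bit masked off)
// is greater than 0x3F8000FF, so the sign does not matter.
inline bool IsOutsideZ(uint32_t bits32) {
    return (bits32 & 0x7FFFFFFFu) > 0x3F8000FFu;
}

// The same test on the 24-bit truncation (low 8 mantissa bits dropped):
// outside when the 24-bit magnitude is greater than 0x3F8000, i.e. 1.0.
inline bool IsOutsideZ24(uint32_t bits32) {
    uint32_t f24 = bits32 >> 8;
    return (f24 & 0x7FFFFFu) > 0x3F8000u;
}
```

Both functions agree for every input, because dropping the low 8 bits cannot change whether the magnitude reaches 0x3F800100.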

The next obvious question is: can the PSP tell post-transform Z apart from 0x3F800000 and 0x3F8000FF? That's a difference of 0.000030517578125. To answer that, I crafted large viewport parameters and used the following Z values with identity matrices:

That's a pretty clear result to me; this might even explain some of the depth rounding trickiness we've seen. I assume once the matrices are applied, the result is normalized to something that maintains only 15 mantissa bits (excluding the implied.) Obviously, this makes some sense given we know most GE registers are 24-bit floats.

But this doesn't seem to control the actual clamping. Using the same test as above, if I pass a triangle with 2 vertices of 0x3F7FFE00 (within Z) and the third 0x3F8FF100 (1.124542, well outside), then it writes a depth of 0x01F4. This means it didn't clamp that third Z before viewport, but did it afterward.

If depth clamp is disabled, primitives get discarded if even one vertex is outside Z, and importantly this seems to follow the same rules as above for "outside Z." Again, viewport doesn't matter. This may explain some of the times we incorrectly rangecull when we shouldn't?
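The clamp-disabled rule described above could be sketched like this in C++. The function names are hypothetical, and the "outside Z" test reuses the bit-level threshold quoted earlier in the thread with a strict comparison, which is an assumption about the exact edge case.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: a post-transform, pre-viewport Z is "outside" when its
// magnitude, compared as raw bits, is greater than 0x3F8000FF.
inline bool IsOutsideZ(float z) {
    uint32_t bits;
    std::memcpy(&bits, &z, sizeof(bits));
    return (bits & 0x7FFFFFFFu) > 0x3F8000FFu;
}

// With depth clamp disabled, the primitive is discarded if even one vertex
// is outside Z. Viewport scale and center play no part in the decision.
inline bool DiscardWithoutDepthClamp(float z0, float z1, float z2) {
    return IsOutsideZ(z0) || IsOutsideZ(z1) || IsOutsideZ(z2);
}
```

For example, a triangle with Z values 0.5, 0.5 and 1.5 would be discarded under this rule, while one with 0.5, -0.5 and 1.0 would be kept.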

-[Unknown]

unknownbrackets commented 2 years ago

Additional testing: it does seem like negative Z (only) is clipped, but notably this is different from the above culling behavior.

That is, only when the depth clamp flag is on, the portion of a triangle with pre-viewport Z < -1 (after w, but before viewport) will be clipped. This doesn't happen on the positive side (i.e. Z = 12 is still displayed), and seems to happen regardless of whether the vertex is clamped.

This also only occurs for triangles. A rectangle will not be clipped on negative Z at all (another easy-to-get-wrong bit.)

For example, let's say we changed viewport Z scale from the typical 32767.5f to 3276.75f. Then we used triangles to draw Z=10 to Z=-10 from top to bottom. This would produce 65535 at the top, but stop where Z=-1 would be, even though that would only be 29490 or so.

This happens without regard to viewport or minz/maxz register values.

If depth clamp is off, the entire triangle would be drawn (no clipping) in the above example.
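The arithmetic in the example above can be checked with a small C++ sketch. It assumes the usual `depth = z * scale + center` viewport transform and a Z center of 32767.5f. The center value is not stated in the comment and is inferred from Z=10 mapping to 65535. All names here are hypothetical.

```cpp
// Hypothetical viewport Z transform: pre-viewport Z to 16-bit depth range.
inline float ViewportDepth(float z, float zScale, float zCenter) {
    return z * zScale + zCenter;
}

// Observed rule for triangles when depth clamp is on: the portion with
// pre-viewport Z < -1 is clipped away, regardless of the depth it would get.
inline bool SurvivesNegativeZClip(float z) {
    return z >= -1.0f;
}

// Values from the example: reduced Z scale, assumed center.
constexpr float kExampleZScale = 3276.75f;
constexpr float kAssumedZCenter = 32767.5f;
```

With these values, Z=10 maps to 65535 and Z=-1 maps to 29490.75, which matches the "29490 or so" figure. Everything below Z=-1 would still map to valid depths down to 0 at Z=-10, yet it is clipped when depth clamp is on.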

-[Unknown]

hrydgard commented 2 years ago

Really interesting with all this pre-viewport behavior!

Just realized that there's another way to cull triangles, which might be perfect: shaderCullDistance. https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/gl_CullDistance.xhtml

This unfortunately does not seem to be supported by Mali, but other than that it's widely supported in Vulkan at least.

unknownbrackets commented 2 years ago

Ah, I'd looked at this and clip before for guardband, but it's true this might be a good way to implement the cull without a geometry shader, given "Primitives whose vertices all have a negative clip (sic) distance for plane i will be discarded." It doesn't help as much for guardband / clamp off for the same reason, though.

-[Unknown]

unknownbrackets commented 2 years ago

Update on above: lines are also clipped based on Z (again, rectangles are not.)

For guardband handling, I had done some of my tests (and observations from games) based on the mistaken assumption that post-viewport Z was what mattered, and clearly that was wrong. Specifically, I'd just played with viewport parameters to validate negative/positive, which in hindsight was pretty stupid of me.

Now that I'm testing properly, I'm seeing:

So in other words, it's just as you'd expect: triangles avoid guardband culling entirely based on being clipped, which only occurs on the negative Z side (pre-viewport.) Based on my review of the dumps in the linked issues, I'm pretty sure this would fix most of the issues with guardband culling, and some of the issues with depth clamp.

Considering implementing this with cull distances (2, pos and neg) and clip distances (1), and I think we're pretty much always going to have 3 of those combined, outside Mali. But for Mali, it's trickier to come up with a good solution. Doing clipping using depth range might be problematic for correct depth values (probably could get it working, though), but I guess cull would just leave Mali out for now?

For the non-clamp case (no Z clip, and even one vert culls), could continue using the NAN strategy.

But I haven't had time to implement the guardband rules in softgpu yet, which I intend as the first step to validate it.

-[Unknown]

hrydgard commented 2 years ago

Very nice to confirm all this! And of course softgpu will be an even better confirmation.

Not stupid at all to think that post-viewport was what mattered, that's how all modern hardware works and what I always assumed as well. Kinda makes me think someone accidentally inserted the clipper at the wrong position in the pipeline...

Rectangles are treated as constant-Z I believe, so they cannot cross the near Z plane and get clipped, so yeah, no avoiding the guardband...

At least modern Mali does support geometry shaders (even though their optimization manual discourages them) and geometry overload is rarely the problem for PPSSPP, so that might actually be a workable solution for Mali. But absolutely fine to leave out for now, I can take care of that path later, I have plenty of Mali devices to test on.

unknownbrackets commented 2 years ago

> Rectangles are treated as constant-Z I believe, so they cannot cross the near Z plane and get clipped, so yeah, no avoiding the guardband...

Yes, but interesting note: the culling behavior inside the guardband pays attention to both Zs, i.e. if the TL Z is inside [-1, 1] and the BR Z is outside [-1, 1] then it won't cull the rectangle, but if both TL and BR are outside in the same direction, it will. So clearly, that happens before it treats the rectangle as flat Z. Just trying to avoid assumptions...
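A minimal C++ sketch of the rectangle rule just described, with hypothetical names. The comment does not say what happens when TL and BR are outside in opposite directions, so not culling in that case is purely an assumption of this sketch. For simplicity it classifies with a plain float comparison against [-1, 1] rather than the bit-level threshold discussed earlier.

```cpp
// Where a pre-viewport Z value sits relative to the [-1, 1] range.
enum class ZSide { Inside, OutsideNegative, OutsidePositive };

// Hypothetical classifier using a plain float comparison (a simplification).
inline ZSide ClassifyZ(float z) {
    if (z < -1.0f) return ZSide::OutsideNegative;
    if (z > 1.0f) return ZSide::OutsidePositive;
    return ZSide::Inside;
}

// Cull the rectangle only when both corners are outside on the same side.
// One corner inside means no cull. Opposite sides: assumed not culled.
inline bool CullRectangle(float topLeftZ, float bottomRightZ) {
    ZSide tl = ClassifyZ(topLeftZ);
    ZSide br = ClassifyZ(bottomRightZ);
    return tl != ZSide::Inside && tl == br;
}
```

The point of checking both corners is that the decision happens before the rectangle is flattened to a single Z.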

-[Unknown]

hrydgard commented 2 years ago

Huh, that is somewhat surprising! Maybe they're just going through the line hardware path right up until the clipper, given that they have two points...