libretro / beetle-psx-libretro

Standalone port/fork of Mednafen PSX to the Libretro API.
GNU General Public License v2.0
308 stars 131 forks source link

PGXP Support #85

Closed Papermanzero closed 6 years ago

Papermanzero commented 8 years ago

Adding PGXP would increase the image quality Details: http://ngemu.com/threads/pcsxr-pgxp.186369/

simias commented 8 years ago

I haven't looked at PGXP's implementation yet but in order to do that you'd have to add a side-channel in mednafen's GTE, RAM and DMA code to tunnel additional data from the GTE all the way to the GPU.

Here's how I thought it could be implemented, in theory, in case somebody else wants to give it a try:

GTE

Here you can see where the perspective division is done in the GTE: https://github.com/libretro/beetle-psx-libretro/blob/master/mednafen/psx/gte.cpp#L1065-L1070

All the 16bit of fractional precision are lost in the bitshift before the result is stored in the GTE's XY_FIFO register, effectively truncating to the previous pixel. This is what causes a lot of jitter in PSX graphics, you don't have any subpixel precision.

At the same time you can see that the code doesn't store the z value anywhere, it's used for the perspective division and then discarded. Without this value in the GPU we can't do perspective-correct texture mapping which results in warping when viewing polygons not directly facing the camera.

In order to solve these issues we have to keep the enhanced precision x and y as well as the z value somewhere. For instance we could change XY_FIFO to hold those new "increased precision" variables alongside the regular integer XY values.

RAM

Once that's working you need to patch the code that sends GTE data to RAM. This should mostly happen here. It's possible that some games don't use the SWC2 opcode and instead use MFC2 to move the GTE register value to a main CPU register and then use a regular SW to put it into main RAM or even straight to the GPU but since it's less efficient it's probably not super common.

Anyway, here you'd have to special case reads from the XY_FIFO to store the increased precision data in some buffer tied to the main RAM.

DMA

Then you need to patch the DMA to send the increased precision data from RAM to GPU. The DMA reads the data from RAM here and sends it to the GPU here. This code will have to be patched to give the GPU a way to access the increased precision data.

GPU

Now when all that's working you got all what you need in the GPU to do some fancy sub-pixel precise perspective-correct mapped 3D graphics. With the GL renderer it should be pretty easy to just plug in the "enhanced" values in the vertex buffer and let the hardware do its thing.

With the software renderer however things are not as easy, it would have to be modified to add support for subpixel precision and perspective correct mapping. It's doable but it's pretty tricky work I think. It would also probably slow it down significantly.

For this reason I just though it wasn't worth looking into until the GL renderer gets usable.

simias commented 8 years ago

I should also add that it'd probably be a good idea to look at PGXP to see how they do things, maybe I forgot something here or there's a more efficient way to do some of those things. I dunno.

inactive123 commented 8 years ago

Well, if you think you can implement it already but the GL renderer's performance is currently an impediment for you to implement, then I suggest we implement it already. This is a pretty big and important feature for a lot of people and having it as a core option would help increase interest in the GL renderer tremendously.

I will be contacting @Themaister soon and asking him if he can provide us with some advice on how to eliminate some of the bottlenecks that currently exist.

iCatButler commented 8 years ago

Hi, someone just linked me to this thread so I thought I'd chime in.

Have you considered using PGXP itself, rather than writing your own? It was my intention to drop it into a fork of this project once it was further along but it shouldn't be too difficult now if you're happy with the current functionality.

The way the current PCSX-R fork is structured there's already pretty good separation between the emulator specific features and PGXP's mirroring of memory, CPU and GTE functions, the GPU side of things still hasn't been addressed.

Once the memory has been initialised it's just a case of pushing values into PGXP from calls to RTPS and RTPT, then ensuring that all calls to certain CPU/GTE functions also call into PGXP so it can duplicate the results at higher precision. The GPU then uses memory locations for individual vertices to retrieve the higher precision values (with the option to fall back on vertex caching, Edgbla's method, if something has gone wrong).

I can understand if you'd still prefer to create your own version of the feature. The code for PGXP is all on github and I'll try to help if you have any questions. I would say that CPU support is by far the most time consuming part I've discovered (so far) as many games modify some or all of their vertices on the CPU before writing them to final locations.

andres-asm commented 8 years ago

@iCatButler we certainly would like PGXP support, but we were not clear on the details, any help is always appreciated

Can you join us on IRC? #libretro in freenode

inactive123 commented 8 years ago

It was my intention to drop it into a fork of this project once it was further along but it shouldn't be too difficult now if you're happy with the current functionality.

If you're intent on doing this that would be awesome to see. I think we have no problems with using your implementation as-is.

simias commented 8 years ago

@iCatButler thanks for chiming in.

As I said I haven't looked at PGXP yet, I assumed that the code would be very emulator-specific but it turns out I was wrong apparently. If the code is portable then by all means let's reuse that.

If I understand what you said correctly we'll still have to patch mednafen's DMA code to send the memory locations alongside the values for the GPU to be able to retrieve the high-precision values.

It shouldn't be complicated to do though, especially if we can just plug your code directly. Personally I still want to improve the GL renderer before I attempt that but it's almost completely orthogonal so if somebody wants to give it a try don't hesitate. I should be able to help with the GL interface at least.

simias commented 8 years ago

Oh and as a side note we should be careful about license issues if we integrate PGXP with mednafen. I don't know what license you use @iCatButler but mednafen is GPLv2+.

iCatButler commented 8 years ago

@simias it was relatively dependent on PCSX-R to begin with, but I quickly realised that I had to track more and more of the CPU functions, so I changed the interface to reflect the PSX's operations which should be common to most emulators.

You're correct, it will need to pass the address of the current primitive or dma chain entry, I think because of the way Pete's OpenGL plugins work it currently stores the address as each chain entry is read and then steps forward from there to find the current primitive address during rendering.

I'm still working to improve compatibility and quality for CPU operations (shift operators are particularly difficult) and plan to expand GTE support. I'll try integrating what's there with mednafen when I get a chance, the PGXP files are currently also under GPLv2+ so I don't think there should be any issues.

simias commented 8 years ago

Awesome, don't hesitate to let me know if you need any help.

iCatButler commented 8 years ago

Okay, I've managed to integrate PGXP, it does have a few problems at the moment: https://github.com/iCatButler/beetle-psx-libretro

Initially the solution wouldn't build in Visual Studio due to several linker errors to do with "video_cb" and "environ_cb", I managed to work around these by pulling in Libertro_cbs and modifying a few declarations but when switching PGXP to "memory + CPU" mode it crashes with the debugger showing "video_cb" as NULL, so I may have broken something there.

It also crashes with Software framebuffer enabled, presumably because those will need some changes to handle the modified vertex structures, I changed them to have 4 floating point components to handle the increase precision and added w.

I've also implemented menu options to switch modes and toggle caching (the same as sub-pixel vertices) and perspective texture correction, this is a very quick demonstration: https://www.youtube.com/watch?v=jC4OwvK0hcM

simias commented 8 years ago

Nice job!

I'm going to give it a try and see if I can fix any of the remaining issues.

simias commented 8 years ago

By the way I see that use use the w homogeneous coordinate to do texture correction, wouldn't just setting the z coordinate to the GTE value achieve the same thing and give us a working z-buffer "for free"? Or doesn't it work because games don't use coherent z-values?

iCatButler commented 8 years ago

Entities often appear to be transformed in different spaces or scales, for instance the sky box will be small and incredibly close to the camera but rendered first in the OT so everything draws on top. In most of the games I've tested depth testing caused elements to draw in the wrong order, so the sky obscures much of the scene. I did try reconstructing depth values using the OT positions too, but those aren't always related to depth either and simply force rendering order.

inactive123 commented 8 years ago

hi @iCatButler, I sent you a pull request on the fork.

We do still have an floating point exception crash that happens in one of the templated function variations of PS_GPU::DrawTriangle though.

@simias tells me he can get it PGXP to run but he can't change internal resolution or he gets a floating point exception. For me it's a different story, as soon as I set PGXP from memory, I get a floating point exception.

iCatButler commented 8 years ago

I found that I needed to disable "software framebuffer" if I changed any options in the menu, otherwise it would go through into what appeared to be a software triangle rasterisation function which I haven't updated for the floating point vertex positions. It sounds like you might be getting a similar error.

inactive123 commented 8 years ago

OK you're right, it seems with Software framebuffer turned off it doesn't crash.

inactive123 commented 8 years ago

LOL, there's quite some ingenious bugs going on when I set PGXP operation mode to 'memory + CPU'. In Tomb Raider, I can see the sprites on the ground (bushes, etc) falling out of thin air onto the ground and then this animation cycle keeps repeating. Seems almost like deja vu in The Matrix LOL.

inactive123 commented 8 years ago

Hmm, there was some kind of regression that happened in 'Initial PGXP conversion' - I think Ridge Racer Revolution no longer starts. You can start it but you can't enter into a race, it hangs there.

I think if you go back to the commit prior to this 582b867954 (Update gte.cpp), it still worked.

inactive123 commented 8 years ago

I think the same issue affects Rage Racer too.

iCatButler commented 8 years ago

@twinaphex I've sent you a couple of small changes that might fix CPU mode. Bear in mind that CPU mode actually breaks the collision detection in the Ridge Racer games as they the NCLIP operation as part of the calculations.

I think the software rasterisation crash might be due to something like a divide by 0, there are a lot of places where two vertex positions are subtracted and the result truncated to an int.

inactive123 commented 8 years ago

OK. BTW, one thing I have noticed is that perspective correct texturing is not yet perfect. For instance, enabling PGX and setting it to 'memory only' and leaving perspective correct off still renders the skybox fine, but if I set perspective correct textures to enabled, it breaks the skybox.

This happens in Hwang's stage in Soul Blade as well, as well as Ridge Racer Type 4. So it seems to be a recurring pattern in some games.

inactive123 commented 8 years ago

Let me see if the latest changes fixes that as well. Will report back.

inactive123 commented 8 years ago

Ridge Racer Revolution/Rage Racer still doesn't work.

The skybox issues when perspective correct textures is enabled (Ridge Racer Type 4/Gran Turismo/Soul Blade in some stages like Hwang) still happens.

EDIT: Same skybox issues in Chrono Cross too.

iCatButler commented 8 years ago

The skybox issue doesn't happen in PCSX-R. It could be that the fixed function pipeline used in that is catching bad w values that are making it through here.

BTW I've opened up issue tracking on my repo now. Hopefully it won't take long to get this build up to the same level as the PCSX-R one.

inactive123 commented 8 years ago

OK, cool.

Noticed another regression which didn't happen before the PGXP commit - this affects it without PGXP turned on too -

http://imgur.com/a/oXaco

The battle sphere in Vagrant Story is no longer drawn. This screenshot was taken with https://github.com/libretro/beetle-psx-libretro/commit/582b867954c9ce525d47301a9f032b311a4028a4, where it still worked.

simias commented 8 years ago

In order not to break the SW renderer I think PGXP shouldn't change the values used by the software. I'm going to try and change that, I'll also remove the half-assed subpixel precision code I had before, it's a bit redundant and rather useless.

simias commented 8 years ago

I've pushed a bunch of modifications to @twinaphex 's master branch. It shouldn't improve PGXP integration at all but hopefully it shouldn't break anything either, I mostly tried to unbreak SW and "regular" opengl code. I tried to make it so the code would behave exactly as it did previously when PGXP is disabled.

As @twinaphex pointed out one interesting game to test is Ridge Racer Revolution, in hangs completely when PGXP is enabled. I haven't taken the time to figure out why yet, I'm suspecting it has to do with a change in behaviour of the GTE code when it's enabled since I can't imagine what else could crash the whole game. Just my two cents. And now I'm off to bed.

simias commented 8 years ago

The freezing in RRR when PGXP is enabled is due to the call to PGXP_NCLIP in the GTE NCLIP command.

The values returned by PGXP_NCLIP() don't match the values computed by mednafen at all in the game:

Mednafen NCLIP PGXP_NCLIP
0 -2147483648
0 -2147483648
0 -2147483648
0 -2147483648
85974 -2147483648
-85974 -2147483648

If I compare with a game that does work like Tomb Raider 3 I end up with values that are much closer, however they still don't match with mednafen's output:

Mednafen NCLIP PGXP_NCLIP
955 976
-627 -629
465 480
-506 -510
32 32
-39 -39
468 434
-413 -400
492 488
-427 -440
146 131
-118 -120
194 204
-204 -204
196 215
-200 -191
545 552
-641 -655
-622 -653
526 568
-6 -9
6 -15
175 200
-227 -250
simias commented 8 years ago

Other potentially interesting test cases are Spyro and MGS, I know those are tricky to get working with PGXP but at the moment they look very broken when it's enabled:

retroarch-0926-161851

retroarch-0926-161905

Looks like they use completely invalid w values.

simias commented 8 years ago

@iCatButler I've started playing with PGXP to try and implement a few things I had in mind, when you have the time could you document the meaning of the VALID_ flags as well an the values of the PGXP_flag in OGLVertex? I think I managed to piece out what some of them mean but I think it's worth documenting for posterity. Maybe using some kind of symbolic value would make the code a bit easier to read and understand too, conditionals like if ((vert.PGXP_flag != 1) && (vert.PGXP_flag != 3)) are a bit cryptic at a glance.

I was trying to implement a better widescreen hack that would rescale both 3D and 2D elements by using PGXP to figure out whether the vertex went through the GTE or not.

By looking at PGXP_GetVertex I guessed that I should test for vert.PGXP_flag == 2 to see if it didn't go through the GTE, unfortunately it seems to result in many false-positives for the games I've tried so it's not really usable at the moment. Maybe I'll give it an other try when the current PGXP issues are fixed.

iCatButler commented 8 years ago

@simias sorry I've not had a lot of time to look at this recently.

I'd expect the NCLIP values to differ slightly because they're created using the higher precision coordinates, there is also a slight bias added to small values to ensure they don't become 0. This fixes culling issues where small triangles would become degenerate using the native PSX code. http://i.imgur.com/JbFjTxY.png

In the case of RRR it looks like it should be rejecting the values as invalid rather than using them, is this happening with "Memory + CPU" mode enabled or also with "Memory Only"? I still can't get "+ CPU" mode to run most of the time as it crashes at the end of void GlRenderer::finalize_frame() upon re-entering the game with video_cb apparently NULL. I assume there's something very wrong going on in the "+CPU" integration code or in my modifications to get the original project to compile.

Both Spyro and MGS need "+CPU" mode in order to render correctly, otherwise some triangles have a mixture of native and PGXP vertices which should discard the W component for the whole triangle. If they're not behaving correctly with "+ CPU" mode it could be linked to the crash I'm seeing.

The valid flags refer to the status of the individual components of a PGXP_value, the first two correspond to the two 16-bit words or XY coordinates, the third is currently used for the additional W coordinate and the fourth could be used for a Z coordinate or other future data. The idea being that any untracked operation on a vertex's components will render the data invalid and that status will be propagated all the way through to the GPU, making it possible to discard them and trace them backward using debug output. I chose to just refer to them by number because it's also possible to access the components separately and compare to the base VALID and also in future they could be used for non-positional data like texture coordinates too.

The vert.PGXP_flag was previously only used for debug output, it stores the source of the vertex and was used to display different colours in one of the PCSX-R plugins, it only remains here because we need to know if W is valid. We could replace it entirely with a simple valid_W flag or add more descriptive names. Values that aren't currently tracked by PGXP could either bypass the GTE completely or may have been lost between the GTE and GPU, some vertices in Spyro currently get sent into other untracked GTE operations to be interpolated. So there isn't necessarily a way to tell if any particular native value was scaled by the GTE just yet.

iCatButler commented 8 years ago

@simias I've noticed a few places where I've been unclear on how things work, for instance caching should work without the use of the main PGXP mode. I'll try to tidy up and better comment some of these areas and check it in.

simias commented 8 years ago

Okay, thanks for the explanations, I'm mostly running with memory only for the time being, I should add + CPU for my subsequent testing. I didn't think it would matter so much for so many games.

Your crash is weird, I don't have any hard crash left with the current code. Have you tried running the latest version? Otherwise there might be some other factor at play, I'll try to reproduce. Could also be a Windows-only problem, I never test it there but I probably should.

The cache option of PGXP is just the "old school" GTE accuracy hack, right? You just use the vertex coordinates to lookup the extended precicsion values in a table, and then you check for collision?

iCatButler commented 8 years ago

Yes, the cache is pretty much identical to Edgbla's original method. It's mostly useful for games which don't completely work with "Memory Only" but aren't improved or fail with "+CPU" mode. I did add some extra checks for stale data but they don't appear to work in Mednafen.

In PCSX-R it seems that the GTE is allowed to process all the vertices before they're seen by the GPU, so I could define a range of valid vertex id's (or counts) for this particular session. In Mednafen only a portion of the vertices on screen appear to be in the current session so I assume it's modelling some sort of concurrent behaviour between the GTE and GPU that PCSX-R was ignoring.

iCatButler commented 8 years ago

I've updated to the latest (was only missing the widescreen changes) and am still getting the same crash.

I've uploaded a change to enable caching without the main PGXP mode being enabled, I've also tidied up and better commented the GPU code. Should I be using pull requests to send these to @twinaphex ?

rz5 commented 8 years ago

Might be worth putting this here: if you're cleaning objects with 'make clean', consider cleaning it 'make clean HAVE_OPENGL=1'. It was mentioned in the IRC that the latter cleans more objects than the former...had a quick glance at the commits for this repo and I don't think this issue was addressed yet.

simias commented 8 years ago

I think we should merge to libretro/ now, but I'm not 100% sure.

You build with Visual Studio, right? I think I'll give it a try.

inactive123 commented 8 years ago

Hi @iCatButler, you can submit changes to the libretro beetle-psx repo now.

@simias - merged your widescreen patch from the twinaphex repo. I think we can probably start closing that repo soon since all the patches are in libretro repo now, but I'll leave the decision up to you.

iCatButler commented 8 years ago

I've fixed the RRR crash and the skybox issues, both were cases of values from the GTE getting out of the valid range. Sorry it's taken me so long to have a proper look at these.

In the case of RRR it was vertices with a depth of 0, resulting in the X and Y coordinates becoming +-inf after division. I've clamped those to the range expected by the psx -1024-1023. The skybox issue was caused by W (GTE Z) being cast from unsigned to signed short and wrapping into negative values near the far plane.

@simias the max(H/2, Z) was originally there for similar reasons in the PCSXR integration, when W values crossed the near plane they were throwing up GPU/driver specific errors (particularly on Nvidia cards). That doesn't appear to be happening when using vertex shaders though.

I don't seem to be able to push changes directly to the libretro repo, probably something I'm doing wrong. I think I'd feel safer if someone else checks and merges them anyway :) There seem to be some differences in behaviour between VS and GCC (the nightly build lets me use "+CPU" mode with no crashes).

https://github.com/iCatButler/beetle-psx-libretro/commit/61876ee5bd3b0b8a46bed15b325df2a248c89759

rz5 commented 8 years ago

Alright @iCatButler, I took the liberty to make the PR and merge. I did it on github by pressing the 'New pull request' button while viewing the repo, then I select 'compare accross forks' and choose your fork on the right hand side.

simias commented 8 years ago

Well done! Are there still regressions left between the PCSX-R and beetle versions of PGXP?

rz5 commented 8 years ago

After getting it to compile, I can confirm that the skybox on Ridge Racer Type 4 is no longer wonky. Non-jittery perspective-corrected goodness.

iCatButler commented 8 years ago

There's still an issue with invalid W components on some quads, most notably in games like Spyro which cause texture stretching. I'll try to fix this next.

I think the "+CPU" mode might also still have a few bugs, but it's difficult to test. Croc is a game I know relies on it but doesn't seem to work properly with the hardware renderer (it stutters and produces low res blocky visuals, like it's falling back on software somewhere).

@rz5 thanks!

rz5 commented 8 years ago

@iCatButler:

"it stutters and produces low res blocky visuals, like it's falling back on software somewhere."

I get that in Tenchu 2 as well. I can post screenshots, a small webm, a gdb trace or an apitrace file if needed.

SpaceAgeHero commented 8 years ago

Not sure if it's the same problem but I have major graphics issues in Tony Hawks Skateboarding games as well. Also in Tomb Raider 1 textures near the character model keep disappearing. I can try to provide screenshots too if its helpful.

saftle commented 8 years ago

@rz5 @SpaceAgeHero Is it possible that those glitches occur even with all of the PGXP core options turned off? I've noticed some regressions as well and documented a few here: https://github.com/libretro/beetle-psx-libretro/issues/95. I created a seperate thread, since they are regressions that occur even without PGXP being enabled

iCatButler commented 8 years ago

@SpaceAgeHero I've just submitted a PR that should fix the problems with THPS.

SpaceAgeHero commented 8 years ago

@iCatButler thanks. Its indeed fixed. However overall stabilization doesn't seem as good as in other games. Disappearing textures in Tomb Raider seems to be fixed as well. Could that be?

rz5 commented 7 years ago

Since we're using this thread to also track PGXP related bugs:

@iCatButler - When using the 'Perspective corrected textures' option, untextured polygons with solid/graded colors will incorrectly get rendered with some sort of camera-angle-based noise added.

When it's off: https://cloud.githubusercontent.com/assets/8117159/20982318/db8f3b32-bcaf-11e6-898a-b7694cfac5b1.png

When it's on (notice the diamond shaped marker and the katana's guard): https://cloud.githubusercontent.com/assets/8117159/20982332/f2892c26-bcaf-11e6-8f24-75f9fe5ede7c.png