PCSX2 / pcsx2

PCSX2 - The Playstation 2 Emulator
https://pcsx2.net
GNU General Public License v3.0
11.87k stars 1.63k forks source link

[BUG] Nightshade (Kunoichi) Shadows #3988

Open tadanokojin opened 3 years ago

tadanokojin commented 3 years ago

Describe the bug Shadows are broken and a rectangle appears on the right side of the screen

To Reproduce Play the dump

Expected behavior Shadows to function and rectangle not present

GS Settings defaults (either HW render)

Emulation Settings N/A

GS Window Screenshots 006257

System Info (please complete the following information):

Logs and Dumps https://dumps.pcsx2.net/index.php/s/Dji9HQYKFawqWGe

tadanokojin commented 3 years ago

related #1919

tadanokojin commented 3 years ago

The problem is there's a BLIT from one area of GS memory to another and our current cache doesn't handle it very well. Instead of just copying the high resolution texture it invalidates it to GS memory and invalidates GS memory back into a texture. So I hacked together a host BLIT and the result looks good:

006261

However there are some problems with this approach:

GSdx is going to have to be redesigned a bit to handle this situation. Note: this will also fix #3169

iMineLink commented 3 years ago

Could it help being able to hit a previous framebuffer with a given offset?

tadanokojin commented 3 years ago

@iMineLink

It might but I think the issue for me is the way the cache handles invalidation. Rather than relying on making the cache hit on overlap I think it might be wiser to defer the actual invalidation until the data is needed. Current process is this:

  1. game blits 64xH area of rt to 64xH area of rt2
  2. gsdx invalidates 64xH area
  3. gsdx reads target out and puts it into gs memory
  4. game blits 64xH area of rt+offset to 64xH area of rt2+offset
  5. gsdx invalidates 64xH area
  6. gsdx tries to find target but misses due to offset
  7. ...
  8. gsdx gets new target but bad data

One solution might be:

  1. game blits 64xH area of rt to 64xH area of rt2
  2. gsdx invalidates 64xH area
  3. gsdx does host blit of 64xH area
  4. game blits 64xH area of rt+offset to 64xH area of rt2+offset
  5. gsdx invalidates 64xH area
  6. gsdx does host blit (plus a bunch of math to find the right target) of 64xH area
  7. ...
  8. gsdx gets new target and good data

But that sounds like it will incur some overhead.

So my thought was this:

  1. game blits 64xH area of rt to 64xH area of rt2
  2. gsdx invalidates 64xH area
  3. game blits 64xH area of rt+offset to 64xH area of rt2+offset
  4. gsdx invalidates 64xH area (area is now 128xH)
  5. ...
  6. gsdx wants new target so lookup causes the host blit to happen across the whole invalidated area at once
  7. gsdx gets new target and good data

Does this make sense to you?

iMineLink commented 3 years ago

Yes it makes sense. You need to keep track of all the issued BLITs (source area/destination area) and defer them until needed to either read from or write to (like in this case, during the target lookup) the relevant GS memory area, to avoid any incoherence. Maybe it is also possible to preserve the upscaling of the framebuffer and to avoid the readback by performing a single texture copy operation via shader to the new framebuffer. Instead if you readback you can readback the whole area at once, then upload it to the new framebuffer before drawing, but upscaling is lost. Does this makes sense?

tadanokojin commented 3 years ago

Yes, I'm saying we shall host blit (ie, copy with shader like you say) in addition to placing src/dest into gs memory (I'm not sure it will matter but to be safe, we can relax or optimize later). This way we get a miss in the cache and preserve scaling. I want to avoid preloading the rt data and instead on lookup say "this is destination of a blit, we shall find the old rt and copy rectangular area (and maybe put in gs mem)".

iMineLink commented 3 years ago

The "miss" you refer to is at "point 6" when you lookup for the target rt2, right?

If it is, at that stage, you might have a data structure in the TC that kept track of the base pointer and the extension in pages/pixels of all the BLITs, then you can copy data from old rt (retrieved by FBP) to new rt2 (considering the whole merged area at once).

This way you create and return from the "point 6" lookup the correct Target object as a copy of the one wrapping the old rt, but with changed FBP.

It might be a little tricky to get the state machine tracking the sequence of the BLITs working correctly, but if it is a bursts of similar operations just with base pointer changed, the detection might be precise enough to avoid a per-game patch. :P

seta-san commented 1 year ago

no strange rectangles.. just no shadow in hardware mode.

1.7.4920