notaz / pcsx_rearmed

ARM optimized PCSX fork
GNU General Public License v2.0

Duke Nukem: Land of the Babes (USA), weird graphical glitch while underwater. #289

Closed valter12437 closed 1 year ago

valter12437 commented 1 year ago

This happens every time you enter any underwater area in the game (screenshot: Duke Nukem - Land of the Babes (USA)-230202-184349). I'm using PCSX-ReARMed (r23l aced3eb) on Windows 10 Home through RetroArch.

pcercuei commented 1 year ago

@valter12437 Do you have a savestate you could share to reproduce the problem?

valter12437 commented 1 year ago

@pcercuei sure, here it goes. Just load the state and jump into the pool. Alternatively, select the training stage, go to the pool area and jump in.

Duke Nukem - Land of the Babes (USA).zip

m4xw commented 1 year ago

It took me 1.5 days, but I tracked it down to this op: https://github.com/libretro/pcsx_rearmed/blob/master/plugins/gpu_unai/gpulib_if.cpp#L748 I'm not sure why yet, but stubbing it out fixes the glitch for unai. I will investigate further tomorrow; I'm open to ideas if anyone has any.

m4xw commented 1 year ago

On another note, I also fixed the unit tests for some GTE ops, but I guess I'll handle that separately since it didn't turn out to be the cause.

m4xw commented 1 year ago

[image] This appears to fix it. Might it be an off-by-one in all GPU plugins? AFAIK this effect works on hardware, but this seems like a weird oversight if you ask me; at least, silently discarding it doesn't do the trick in this case.

[image] It does set those params as well and sends data with those locations in mind, so that would check out. peops had the same issue as unai here, but the neon GPU was more funky. I hope you know what's going on here, @notaz. Also, if you look at the original image, there are some shenanigans at the right edge etc., so that's also reaffirming.

m4xw commented 1 year ago

Double-checking: using 511/1023, and using the top code path with >= for them, is still broken, so it appears to me this does indeed need 1024/512. [image]

pcercuei commented 1 year ago

Using 512/1024 makes absolutely zero sense, though.

m4xw commented 1 year ago

As far as I can tell from the game's code, it looks to be drawing some offscreen vertices.

pcercuei commented 1 year ago

What do you get if you print x0 / y0 / x1 / y1 / w0 / h0? Is that your screenshot with the "set drawing area" messages?

m4xw commented 1 year ago

The screenshot was this

m4xw commented 1 year ago

Here I just dumped the unmasked values for one frame:

x0: 512, y0: 2, x1: 511, y1: 2, w0: 512, h0: 2
x0: 512, y0: 4, x1: 511, y1: 4, w0: 512, h0: 2
x0: 512, y0: 6, x1: 511, y1: 6, w0: 512, h0: 2
x0: 512, y0: 8, x1: 511, y1: 8, w0: 512, h0: 2
x0: 512, y0: 10, x1: 511, y1: 10, w0: 512, h0: 2
x0: 512, y0: 12, x1: 510, y1: 12, w0: 512, h0: 2
x0: 512, y0: 14, x1: 510, y1: 14, w0: 512, h0: 2
x0: 512, y0: 16, x1: 510, y1: 16, w0: 512, h0: 2
x0: 512, y0: 18, x1: 510, y1: 18, w0: 512, h0: 2
x0: 512, y0: 20, x1: 510, y1: 20, w0: 512, h0: 2
x0: 512, y0: 22, x1: 510, y1: 22, w0: 512, h0: 2
x0: 512, y0: 24, x1: 509, y1: 24, w0: 512, h0: 2
x0: 512, y0: 26, x1: 509, y1: 26, w0: 512, h0: 2
x0: 512, y0: 28, x1: 509, y1: 28, w0: 512, h0: 2
x0: 512, y0: 30, x1: 509, y1: 30, w0: 512, h0: 2
x0: 512, y0: 32, x1: 509, y1: 32, w0: 512, h0: 2
x0: 512, y0: 34, x1: 509, y1: 34, w0: 512, h0: 2
x0: 512, y0: 36, x1: 508, y1: 36, w0: 512, h0: 2
x0: 512, y0: 38, x1: 508, y1: 38, w0: 512, h0: 2
x0: 512, y0: 40, x1: 508, y1: 40, w0: 512, h0: 2
x0: 512, y0: 42, x1: 508, y1: 42, w0: 512, h0: 2
x0: 512, y0: 44, x1: 508, y1: 44, w0: 512, h0: 2
x0: 512, y0: 46, x1: 508, y1: 46, w0: 512, h0: 2
x0: 512, y0: 48, x1: 508, y1: 48, w0: 512, h0: 2
x0: 512, y0: 50, x1: 508, y1: 50, w0: 512, h0: 2
x0: 512, y0: 52, x1: 508, y1: 52, w0: 512, h0: 2
x0: 512, y0: 54, x1: 508, y1: 54, w0: 512, h0: 2
x0: 512, y0: 56, x1: 508, y1: 56, w0: 512, h0: 2
x0: 512, y0: 58, x1: 508, y1: 58, w0: 512, h0: 2
x0: 512, y0: 60, x1: 508, y1: 60, w0: 512, h0: 2
x0: 512, y0: 62, x1: 508, y1: 62, w0: 512, h0: 2
x0: 512, y0: 64, x1: 508, y1: 64, w0: 512, h0: 2
x0: 512, y0: 66, x1: 508, y1: 66, w0: 512, h0: 2
x0: 512, y0: 68, x1: 508, y1: 68, w0: 512, h0: 2
x0: 512, y0: 70, x1: 508, y1: 70, w0: 512, h0: 2
x0: 512, y0: 72, x1: 508, y1: 72, w0: 512, h0: 2
x0: 512, y0: 74, x1: 508, y1: 74, w0: 512, h0: 2
x0: 512, y0: 76, x1: 508, y1: 76, w0: 512, h0: 2
x0: 512, y0: 78, x1: 508, y1: 78, w0: 512, h0: 2
x0: 512, y0: 80, x1: 508, y1: 80, w0: 512, h0: 2
x0: 512, y0: 82, x1: 508, y1: 82, w0: 512, h0: 2
x0: 512, y0: 84, x1: 508, y1: 84, w0: 512, h0: 2
x0: 512, y0: 86, x1: 508, y1: 86, w0: 512, h0: 2
x0: 512, y0: 88, x1: 508, y1: 88, w0: 512, h0: 2
x0: 512, y0: 90, x1: 508, y1: 90, w0: 512, h0: 2
x0: 512, y0: 92, x1: 508, y1: 92, w0: 512, h0: 2
x0: 512, y0: 94, x1: 508, y1: 94, w0: 512, h0: 2
x0: 512, y0: 96, x1: 509, y1: 96, w0: 512, h0: 2
x0: 512, y0: 98, x1: 509, y1: 98, w0: 512, h0: 2
x0: 512, y0: 100, x1: 509, y1: 100, w0: 512, h0: 2
x0: 512, y0: 102, x1: 509, y1: 102, w0: 512, h0: 2
x0: 512, y0: 104, x1: 509, y1: 104, w0: 512, h0: 2
x0: 512, y0: 106, x1: 509, y1: 106, w0: 512, h0: 2
x0: 512, y0: 108, x1: 510, y1: 108, w0: 512, h0: 2
x0: 512, y0: 110, x1: 510, y1: 110, w0: 512, h0: 2
x0: 512, y0: 112, x1: 510, y1: 112, w0: 512, h0: 2
x0: 512, y0: 114, x1: 510, y1: 114, w0: 512, h0: 2
x0: 512, y0: 116, x1: 510, y1: 116, w0: 512, h0: 2
x0: 512, y0: 118, x1: 510, y1: 118, w0: 512, h0: 2
x0: 512, y0: 120, x1: 511, y1: 120, w0: 512, h0: 2
x0: 512, y0: 122, x1: 511, y1: 122, w0: 512, h0: 2
x0: 512, y0: 124, x1: 511, y1: 124, w0: 512, h0: 2
x0: 512, y0: 126, x1: 511, y1: 126, w0: 512, h0: 2
x0: 512, y0: 128, x1: 511, y1: 128, w0: 512, h0: 2
x0: 512, y0: 140, x1: 513, y1: 140, w0: 512, h0: 2
x0: 512, y0: 142, x1: 513, y1: 142, w0: 512, h0: 2
x0: 512, y0: 144, x1: 513, y1: 144, w0: 512, h0: 2
x0: 512, y0: 146, x1: 513, y1: 146, w0: 512, h0: 2
x0: 512, y0: 148, x1: 513, y1: 148, w0: 512, h0: 2
x0: 512, y0: 150, x1: 513, y1: 150, w0: 512, h0: 2
x0: 512, y0: 152, x1: 514, y1: 152, w0: 512, h0: 2
x0: 512, y0: 154, x1: 514, y1: 154, w0: 512, h0: 2
x0: 512, y0: 156, x1: 514, y1: 156, w0: 512, h0: 2
x0: 512, y0: 158, x1: 514, y1: 158, w0: 512, h0: 2
x0: 512, y0: 160, x1: 514, y1: 160, w0: 512, h0: 2
x0: 512, y0: 162, x1: 514, y1: 162, w0: 512, h0: 2
x0: 512, y0: 164, x1: 515, y1: 164, w0: 512, h0: 2
x0: 512, y0: 166, x1: 515, y1: 166, w0: 512, h0: 2
x0: 512, y0: 168, x1: 515, y1: 168, w0: 512, h0: 2
x0: 512, y0: 170, x1: 515, y1: 170, w0: 512, h0: 2
x0: 512, y0: 172, x1: 515, y1: 172, w0: 512, h0: 2
x0: 512, y0: 174, x1: 515, y1: 174, w0: 512, h0: 2
x0: 512, y0: 176, x1: 515, y1: 176, w0: 512, h0: 2
x0: 512, y0: 178, x1: 515, y1: 178, w0: 512, h0: 2
x0: 512, y0: 180, x1: 515, y1: 180, w0: 512, h0: 2
x0: 512, y0: 182, x1: 515, y1: 182, w0: 512, h0: 2
x0: 512, y0: 184, x1: 515, y1: 184, w0: 512, h0: 2
x0: 512, y0: 186, x1: 515, y1: 186, w0: 512, h0: 2
x0: 512, y0: 188, x1: 515, y1: 188, w0: 512, h0: 2
x0: 512, y0: 190, x1: 515, y1: 190, w0: 512, h0: 2
x0: 512, y0: 192, x1: 515, y1: 192, w0: 512, h0: 2
x0: 512, y0: 194, x1: 515, y1: 194, w0: 512, h0: 2
x0: 512, y0: 196, x1: 515, y1: 196, w0: 512, h0: 2
x0: 512, y0: 198, x1: 515, y1: 198, w0: 512, h0: 2
x0: 512, y0: 200, x1: 515, y1: 200, w0: 512, h0: 2
x0: 512, y0: 202, x1: 515, y1: 202, w0: 512, h0: 2
x0: 512, y0: 204, x1: 515, y1: 204, w0: 512, h0: 2
x0: 512, y0: 206, x1: 515, y1: 206, w0: 512, h0: 2
x0: 512, y0: 208, x1: 515, y1: 208, w0: 512, h0: 2
x0: 512, y0: 210, x1: 515, y1: 210, w0: 512, h0: 2
x0: 512, y0: 212, x1: 515, y1: 212, w0: 512, h0: 2
x0: 512, y0: 214, x1: 515, y1: 214, w0: 512, h0: 2
x0: 512, y0: 216, x1: 515, y1: 216, w0: 512, h0: 2
x0: 512, y0: 218, x1: 515, y1: 218, w0: 512, h0: 2
x0: 512, y0: 220, x1: 515, y1: 220, w0: 512, h0: 2
x0: 512, y0: 222, x1: 515, y1: 222, w0: 512, h0: 2
x0: 512, y0: 224, x1: 514, y1: 224, w0: 512, h0: 2
x0: 512, y0: 226, x1: 514, y1: 226, w0: 512, h0: 2
x0: 512, y0: 228, x1: 514, y1: 228, w0: 512, h0: 2
x0: 512, y0: 230, x1: 514, y1: 230, w0: 512, h0: 2
x0: 512, y0: 232, x1: 514, y1: 232, w0: 512, h0: 2
x0: 512, y0: 234, x1: 514, y1: 234, w0: 512, h0: 2
x0: 512, y0: 236, x1: 513, y1: 236, w0: 512, h0: 2
x0: 512, y0: 238, x1: 513, y1: 238, w0: 512, h0: 2
....
    x0 = packet.U2[2];
    y0 = packet.U2[3];
    x1 = packet.U2[4];
    y1 = packet.U2[5];
    w0 = packet.U2[6];
    h0 = packet.U2[7];

    if( (x0==x1) && (y0==y1) ) return;
    if ((w0<=0) || (h0<=0)) return;

    printf("x0: %d, y0: %d, x1: %d, y1: %d, w0: %d, h0: %d\n", x0, y0, x1, y1, w0, h0);
    fflush(stdout);
....
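
As a rough aid for reading the dump above, the sketch below checks two things for a logged set of values: whether either rectangle runs past the 1024x512 VRAM edge (the same condition that selects the masked copy loop in the Unai diff quoted later in this thread), and whether the source and destination rectangles overlap. The function names `move_wraps` and `rects_overlap` are made up for illustration and are not part of PCSX-ReARMed; the overlap test ignores wrap-around for simplicity.

```c
#include <assert.h>
#include <stdbool.h>

/* True when either rectangle extends past the 1024x512 VRAM edge.
 * This mirrors the bounds check in front of the masked copy loop. */
static bool move_wraps(int x0, int y0, int x1, int y1, int w, int h)
{
    assert(w > 0 && h > 0);
    return (y0 + h) > 512 || (x0 + w) > 1024 ||
           (y1 + h) > 512 || (x1 + w) > 1024;
}

/* True when the source rectangle at (x0, y0) and the destination
 * rectangle at (x1, y1), both w x h, share at least one pixel.
 * Simplification: coordinates are not wrapped before comparing. */
static bool rects_overlap(int x0, int y0, int x1, int y1, int w, int h)
{
    assert(w > 0 && h > 0);
    return x0 < x1 + w && x1 < x0 + w &&
           y0 < y1 + h && y1 < y0 + h;
}
```

With the logged values, a line such as `x0: 512, y0: 2, x1: 511, y1: 2, w0: 512, h0: 2` stays inside VRAM (512 + 512 is exactly 1024), while a line with `x1: 513` crosses the right edge; in both cases the two rectangles overlap almost entirely.
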
m4xw commented 1 year ago

Side note: when I open the in-game menu (and the game doesn't apply the caustic effect underwater), it never invokes moveImage to begin with, so the issue never appears.

m4xw commented 1 year ago

Btw, I could give some further insight if you contact me on Discord (m4xw#1560); I can't say too much on GitHub due to circumstances.

pcercuei commented 1 year ago

The values look OK. I think it's just the algorithm that does not work properly with these values. It looks like the game tries to move lines of the framebuffer left and right to give it a wobble "underwater" effect. In the case where x1 > x0, the algorithm will actually copy pixel N[i] to N[i + n], which only works if you copy less than n pixels.
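
To make the overlap problem concrete, here is a minimal sketch on a single row of pixels. It is not the emulator's code: `copy_forward` only imitates the shape of a front-to-back copy loop, and `copy_overlap_safe` uses the standard `memmove` as a reference for what an overlap-aware copy produces. Both names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Front-to-back copy: reads row[src + i], writes row[dst + i].
 * When dst > src and the ranges overlap, later reads see pixels
 * that earlier iterations already overwrote. */
static void copy_forward(unsigned short *row, int src, int dst, int w)
{
    assert(row != NULL && w >= 0);
    for (int i = 0; i < w; i++)
        row[dst + i] = row[src + i];
}

/* Reference behaviour: memmove is defined for overlapping ranges. */
static void copy_overlap_safe(unsigned short *row, int src, int dst, int w)
{
    assert(row != NULL && w >= 0);
    memmove(&row[dst], &row[src], (size_t)w * sizeof(row[0]));
}
```

Starting from the row `0 1 2 3 4 5 6 7` and moving 6 pixels from offset 0 to offset 2, `copy_forward` yields `0 1 0 1 0 1 0 1` (the first two pixels repeat across the row), whereas `copy_overlap_safe` yields `0 1 0 1 2 3 4 5`.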

m4xw commented 1 year ago

The wobble effect doesn't happen with my change, as far as I can tell (I'm checking over RDP right now and it's a blurry mess).

m4xw commented 1 year ago

I do have some indication that this might be abusing either a bug or an edge case to get the effect without the artifact. It works on beetle-psx, fwiw.

Squall-Leonhart commented 1 year ago

@Nucleoprotein do you have an idea here?

pcercuei commented 1 year ago

The bug seems to occur in all GPU plugins (neon, peops, unai).

The problem is that the VRAM->VRAM copy does not handle the case where the source and destination overlap.

One workaround in Unai is to copy the source to an intermediate buffer first:

diff --git a/plugins/gpu_unai/gpu_raster_image.h b/plugins/gpu_unai/gpu_raster_image.h
index 0c82aa97..31ad2692 100644
--- a/plugins/gpu_unai/gpu_raster_image.h
+++ b/plugins/gpu_unai/gpu_raster_image.h
@@ -81,14 +81,19 @@ INLINE void gpuMoveImage(void)
        if( (x0==x1) && (y0==y1) ) return;
        if ((w0<=0) || (h0<=0)) return;

-       if (((y0+h0)>512)||((x0+w0)>1024)||((y1+h0)>512)||((x1+w0)>1024))
+       if (1)
        {
                u16 *psxVuw=GPU_FrameBuffer;
                s32 i,j;
-           for(j=0;j<h0;j++)
-                for(i=0;i<w0;i++)
-                 psxVuw [(1024*((y1+j)&511))+((x1+i)&0x3ff)]=
-                  psxVuw[(1024*((y0+j)&511))+((x0+i)&0x3ff)];
+               u16 *tmp = (u16 *)alloca(sizeof(*tmp) * w0 * h0);
+
+               for (j = 0; j < h0; j++)
+                       for (i = 0; i < w0; i++)
+                               tmp[j * w0 + i] = psxVuw[1024*((y0+j)&511)+((x0+i)&0x3ff)];
+
+               for (j = 0; j < h0; j++)
+                       for (i = 0; i < w0; i++)
+                               psxVuw[1024*((y1+j)&511)+((x1+i)&0x3ff)] = tmp[j * w0 + i];
        }
        else if ((x0&1)||(x1&1))
        {

Of course this isn't very optimized, but at least it works.
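
For anyone who wants to try the idea outside the emulator, here is a standalone sketch of the same intermediate-buffer approach. It is an illustration under assumptions, not the code that was eventually committed: the function name `vram_move_buffered`, the `VRAM_W`/`VRAM_H` macros, and the use of `malloc` instead of `alloca` are choices made for this example, and coordinates are assumed to be non-negative.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define VRAM_W 1024
#define VRAM_H 512

/* Move a w x h rectangle from (x0, y0) to (x1, y1) in a 1024x512
 * 16-bit framebuffer, wrapping coordinates at the edges. The source
 * is staged in a heap buffer first, so overlapping rectangles are
 * copied correctly. Returns 0 on success, -1 on bad size or OOM. */
static int vram_move_buffered(uint16_t *vram, int x0, int y0,
                              int x1, int y1, int w, int h)
{
    assert(vram != NULL);
    if (w <= 0 || h <= 0 || w > VRAM_W || h > VRAM_H)
        return -1;

    uint16_t *tmp = malloc(sizeof(*tmp) * (size_t)w * (size_t)h);
    if (tmp == NULL)
        return -1;

    /* Pass 1: read the whole source rectangle before writing anything. */
    for (int j = 0; j < h; j++)
        for (int i = 0; i < w; i++)
            tmp[j * w + i] = vram[VRAM_W * ((y0 + j) & (VRAM_H - 1)) +
                                  ((x0 + i) & (VRAM_W - 1))];

    /* Pass 2: write the staged pixels to the destination rectangle. */
    for (int j = 0; j < h; j++)
        for (int i = 0; i < w; i++)
            vram[VRAM_W * ((y1 + j) & (VRAM_H - 1)) +
                 ((x1 + i) & (VRAM_W - 1))] = tmp[j * w + i];

    free(tmp);
    return 0;
}
```

The heap allocation avoids putting up to 1 MiB on the stack for a full-VRAM move, at the cost of an allocation per call; a preallocated scratch buffer would be the obvious next step if this were performance-sensitive.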

Squall-Leonhart commented 1 year ago

The bug seems to occur in all GPU plugins (neon, peops, unai).

It doesn't occur in Kazzuya Soft

notaz commented 1 year ago

This should be fixed now. Many thanks to everyone who helped debugging this and posted info.

pcercuei commented 12 months ago

@notaz, your commit 36da9c1305 breaks Looney Tunes Sheep Raider. When trying to enter the very first level, the loading screen will never complete. I have no idea how it can be related; but reverting this commit on master fixes it.

m4xw commented 12 months ago

Also I got told unai broke as well (may or may not be related to that commit)

pcercuei commented 12 months ago

@m4xw broke how? We're using Unai and I didn't notice breakages related to this commit.

notaz commented 12 months ago

When trying to enter the very first level, the loading screen will never complete

Well it actually completed but it wasn't visible because I forgot to mark the fb for flipping. It should be fixed now.

m4xw commented 12 months ago

Duke TTK (corruption on startup): [image]

https://github.com/notaz/pcsx_rearmed/assets/13141469/6e0182cd-ec2e-4016-bb2b-62b933ebd810

m4xw commented 12 months ago

Dunno what's up with the video, so here's another upload: https://m4xw.net/nextcloud/index.php/s/akMMf2diqQYsZ7M Edit: Actually, it seems to not play back either, wth. Anyway, there's a flicker happening. Actually, it seems to work in VLC, weird.

Ploggy commented 12 months ago

The flickering has to do with Unai's dithering; turn it off and the flickering will stop.