snes9xgit / snes9x

Snes9x - Portable Super Nintendo Entertainment System (TM) emulator
http://www.snes9x.com
Other
2.61k stars 451 forks source link

The low-res image is shifted left by one column of pixels in high-res mode? #165

Closed vaguerant closed 7 years ago

vaguerant commented 7 years ago

I noticed this odd behavior in Kirby's Dream Land 3 (USA).sfc while running Snes9x 1.51.1:

Snes9x 1.51.1 Screenshot For reference, this scene can be found through the first door, on the first level.

KDL3 alternates between standard low-res mode and high-res mode, depending on whether the current scene has any translucent elements (in this scene, the floating leaves in the foreground are translucent, as achieved via dithering).

When the game enters high-res mode on Snes9x, the low-res portion of the gameplay scene shifts left by a single column of pixels. This can be observed in the above screenshot in two ways: the more obvious is the single column of black pixels on the right, the less obvious being the truncated leftmost-column, which is a single (high-res) pixel wide instead of two (=one low-res pixel).

The same scene as rendered by bsnes: bsnes Screenshot

I brought this issue up in IRC, but figured I should note it somewhere more permanent, also.

bearoso commented 7 years ago

Known issue. We had it right at one time, but it broke something else. You'll have to ask OV2 why it was switched back.

OV2 commented 7 years ago

We fixed it once by switching the hires plotter to the logic that was used in bsnes at the time. This caused blurry fonts in marvelous (http://www.snes9x.com/phpbb3/viewtopic.php?f=6&t=5639&p=32993s#p32993), which is why it was switched back to anomies code, which shifts one pixel to the left. There is commented code that should do anomies version without shifting, but since it has two additional branches for every pixel I left it commented out.

bearoso commented 7 years ago

I've been messing with it, and let me know your thoughts:

First, is the check for the right column necessary? It should just write the next line, which will be overwritten anyway.

Second, are you sure you're getting branches generated? If so, we can try ternary ops to get it to generate a compare and multiplication. Here's what I've got, and it doesn't throw any problems:

#define DRAW_PIXEL_H2x1(N, M) \
    if (Z1 > GFX.DB[Offset + 2 * N] && (M)) \
    { \
        GFX.S[Offset + 2 * N + 1] = MATH(GFX.ScreenColors[Pix], GFX.SubScreen[Offset + 2 * N], GFX.SubZBuffer[Offset + 2 * N]); \
        GFX.S[Offset + 2 * N + 2] = MATH((GFX.ClipColors ? 0 : GFX.SubScreen[Offset + 2 * N + 2]), GFX.RealScreenColors[Pix], GFX.SubZBuffer[Offset + 2 * N]); \
        GFX.S[Offset + 2 * N] = ((Offset + 2 * N) % GFX.RealPPL == 0) ? MATH((GFX.ClipColors ? 0 : GFX.SubScreen[Offset + 2 * N]), GFX.RealScreenColors[Pix], GFX.SubZBuffer[Offset + 2 * N]) : GFX.S[Offset + 2 * N]; \
        GFX.DB[Offset + 2 * N] = GFX.DB[Offset + 2 * N + 1] = Z2; \
    }
OV2 commented 7 years ago

I think I never actually checked the generated code, tile.cpp isn't all that debug friendly. The main concern with the right column was the end of the buffer, I'm not sure if it is always large enough that we can write past the last line. If that's the case I see no other necessity for the check. Did you try it with double width double height?

bearoso commented 7 years ago

Ok, I've benchmarked everything, and turns out no branches are being generated with if, but the modulus ops in the comparisons are killing things (600fps to 550fps on SD3 title screen). If I replace the % GFX.RealPPL in both cases with & 0x01ff there is absolutely no performance penalty.

bearoso commented 7 years ago

So if it was

#define DRAW_PIXEL_H2x1(N, M) \
    if (Z1 > GFX.DB[Offset + 2 * N] && (M)) \
    { \
        GFX.S[Offset + 2 * N + 1] = MATH(GFX.ScreenColors[Pix], GFX.SubScreen[Offset + 2 * N], GFX.SubZBuffer[Offset + 2 * N]); \
        if (((Offset + 2 * N ) & 0x01ff) != (SNES_WIDTH - 1) << 1) \
            GFX.S[Offset + 2 * N + 2] = MATH((GFX.ClipColors ? 0 : GFX.SubScreen[Offset + 2 * N + 2]), GFX.RealScreenColors[Pix], GFX.SubZBuffer[Offset + 2 * N]); \
        if (((Offset + 2 * N) & 0x01ff ) == 0) \
            GFX.S[Offset + 2 * N] = MATH((GFX.ClipColors ? 0 : GFX.SubScreen[Offset + 2 * N]), GFX.RealScreenColors[Pix], GFX.SubZBuffer[Offset + 2 * N]); \
        GFX.DB[Offset + 2 * N] = GFX.DB[Offset + 2 * N + 1] = Z2; \
    }

Would that work? Can we assume 512 instead of GFX.RealPPL here, because otherwise the comparisons don't really make sense anyway?

OV2 commented 7 years ago

Unfortunately not, GFX.S here is the GFX.Screen allocated by the port. For example in the win port GFX.Pitch is 1032, so PPL in hires would be 516.

bearoso commented 7 years ago

So did bsnes show the same issue in Marvelous with byuu's pixel plotter? The thread never clarified that.

OV2 commented 7 years ago

As far as I recall it showed the same issue, albeit not in the accuracy core. But I can no longer find the respective thread on byuu's forum.

bearoso commented 7 years ago

Marvelous works fine with the byuu plotter here. It only shows similar gocha's issue when you merge the pixels to make a 256x224 image, which is entirely expected. http://imgur.com/a/Qc9Qq

Going to revert to byuu's plotter. It also fixes Lufia 2's double-width, double-height credits, which looked absolutely terrible.

bearoso commented 7 years ago

Ok, progressing a little further, I see Gocha's problem. Need to look into it a little more.

OV2 commented 7 years ago

Could you retry with this change: https://gist.github.com/OV2/557cde1d2b34904bbb42b3d2baba2e64 This version did not incur a very large performance hit on my pc.

bearoso commented 7 years ago

That looks good. I'm seeing only a very small performance hit ~1%.

OV2 commented 7 years ago

Should be fixed with 243ab9780657455a6f9f4be5499a69145cd6208f