dolphin-emu / hwtests

Hardware test suite
GNU General Public License v2.0
22 stars 25 forks source link

Add copy filter test #44

Closed Pokechu22 closed 2 years ago

Pokechu22 commented 2 years ago

See dolphin-emu/dolphin#10439. With that PR, Dolphin passes this test when FULL_COPY_FILTER_COEFS and FULL_GAMMA are false (I'll address the other cases in a separate Dolphin PR). This test passes in all cases on real hardware. dolphin-emu/dolphin#10466 fully passes this test, as well.

The test runs through all values for the red channel in an RGB8 EFB (i.e. 0-255), copy filter values from 0 to 63*3 (setting the middle 3 registers that affect the current line, but not testing the actual behavior for blurring together lines as the whole EFB is the same color), and the gamma option on EFB copies (including the invalid value, which behaves the same as 2.2). It also includes code that correctly predicts the outcome of all of these configurations.

Pokechu22 commented 2 years ago

I've added a second test that checks the result of using RGBA8 EFB copies with intensity formats (libogc has GX_CTF_YUVA8, which does this, but Dolphin didn't implement it until now). Along with some linear regression, I was able to compute the exact coefficients used for converting to YUV, and the test now accurately predicts the YUV values for any RGB color, and I was able to also have Dolphin accurately match that test. Raw data and the script I used are in intensity.zip (note that the extracted raw data is ~350 MB, in intensity2.7z inside of that zip file).

The copy filter and gamma are applied before conversion from RGB to YUV.

Pokechu22 commented 2 years ago

To more explicitly document the process, to my current knowledge:

  1. The previous, current, and next rows from the EFB are sampled. (The test currently only covers the current row)
  2. The R, G, and B channels of those are each multiplied by the copy filter values for the previous, current, and next rows, and those values are summed together.
  3. The sum is shifted right by 6 (i.e. divide by 64 and round down); the copy filter has now been applied.
  4. The sum is masked to 9 bits (i.e. in the range [0, 511]).
  5. The sum is clamped into the range [0, 255] (so values greater than 255 but less than 512 (due to the above masking) end up being 255).
  6. Gamma is applied, by setting the red, green, and blue channels to round(pow(value / 255.0, 1/gamma) * 255.0). In this case, since I used std::round, .5 ends up rounding away from 0 (i.e. to 1), though I don't think that situation ever comes up here as pow won't result in a value ending in .5 after being multiplied by 255.
  7. The YUV/intensity conversion occurs if enabled, producing

    y = round( 0.2578125f * r + 0.50390625f * g + 0.09765625f * b + 16);
    u = round(-0.1484375f * r + -0.2890625f * g +  0.4375f * b + 128);
    v = round( 0.4375f * r + -0.3671875f * g + -0.0703125f * b + 128);

    Here, std::round must round .5 to 1; for instance, r = g = b = 32 produces y = round(27.5) = 28.

    These numbers look highly precise, but I've just now realized that they are equivalent to:

    y = round(( 66 * r + 129 * g +  25 * b) / 256.0 + 16);
    u = round((-38 * r + -74 * g + 112 * b) / 256.0 + 128);
    v = round((112 * r + -94 * g + -18 * b) / 256.0 + 128);

    which looks a lot simpler and a lot less arbitrary. Note that r, g, and b range from 0 to 255.

  8. The alpha value is always the alpha value from the current row.
Pokechu22 commented 2 years ago

I've got this test to the point where I'm happy with the functionality. There are actually 2 new tests:

I've also made some rather large structural changes here. Specifically, I've added a dependency on fmt, and updated BPMemory.h to match Dolphin's version of it (also adding in the various dependencies and the updated version of BitField.h). This did require updating every other test, but I think it's worth it; it makes implementing this test slightly easier, and makes writing tests slightly easier in general since there's no need to worry about printf's strangeness, and enums can be formatted via EnumFormatter.

I consider these tests to be basically feature-complete; they pass on real hardware and on Dolphin with my PR, but fail on earlier versions of Dolphin. They're still missing handling of other EFB formats (RGB565_Z16 hangs for some reason, and Y8/U8/V8/YUV420 don't have any documentation and behave in their own strange ways), and also of texture formats other than RGBA8, but those could be handled later on. All I need to do now is rebase and rework the history so that it's not a giant mess.

neobrain commented 2 years ago

Great job on this :)

FWIW for the copy filter, one way we could further increase test coverage beyond what's already there is to add a pass over pseudo-randomized EFB contents. That reduces the chance of accidentally only testing "nice" inputs when generating row values manually.

Pokechu22 commented 2 years ago

FWIW for the copy filter, one way we could further increase test coverage beyond what's already there is to add a pass over pseudo-randomized EFB contents.

Good idea, but I think I'll hold off on doing that for a later PR, once I understand what's going on with EFB depth pokes and can also randomize the depth values.

Pokechu22 commented 2 years ago

I've realized that this repo does have a .clang-format file, but it seems to be pretty old (I believe Dolphin targeted clang-format 3.8 at the time), and using a more modern version of clang-format gives different results. As such I'm ignoring linting issues for now (trying to update clang-format in this repo would just add even more noise to this PR).

Pokechu22 commented 2 years ago

While looking into something unrelated, I also found that GX_SetPixelFmt is a bit more complicated than I thought; it does more than simply writing to BPMEM_ZCOMPARE (0x43, peCntrl per this); it also updates the genmode to enable multisampling for GX_PF_RGB565_Z16 (which probably doesn't explain EFB pokes breaking, but does explain other properties), and writes to bits 10 and 11 (the purpose of which is unknown and not documented by Dolphin) of BPMEM_CONSTANTALPHA (0x42, peCMode1) for GX_PF_Y8, GX_PF_U8, GX_PF_V8, as well as remapping Y8/U8/V8 to 4 and YUV420 to 5 (leaving pixel formats 6 and 7 completely unaccounted for). So that's something to look into in the future, though I don't know if any games actually use those pixel formats.

While looking even further I've found patents US7075545B2 and JP4683760B2 which talk about EFB copies. YAGCD doesn't mention them at all (possibly because publication of them was delayed until 2006 and 2011 respectively, though I don't know if "publication" has the normal meaning with patents or not). The patents have a table (page 40 on the US PDF and 25 on the Japanese PDF) that explains what various configurations do what, but the table has really jank alignment and line wrapping so it doesn't make much sense. It also does give the same order of operations I described in https://github.com/dolphin-emu/hwtests/pull/44#issuecomment-1046314390 (column 15 of the US patent), though it doesn't mention Z, and doesn't mention the overflow behavior or anything with the values not summing to 64. It also gives RGB to YCrCb(4:4:4) conversion coefficients to 3 digits, which match Dolphin's old behavior but not the integer values I got from hardware testing. It also talks about (column 12 lines 35-60, and column 19 lines 26-37) the YUV EFB formats being intended for MPEG decoding, and the automatic conversion also applying from YUV -> RGB, but I don't know if anything uses that. One of the figures (12 a) also seems to explain that the copy filter uses 3 values for the center row but 2 for the above and below ones because in anti-aliased mode it uses 3 samples for the center row but 2 for the ones above and below.

neobrain commented 2 years ago

While looking into something unrelated, I also found that GX_SetPixelFmt is a bit more complicated than I thought; it does more than simply writing to BPMEM_ZCOMPARE (0x43, peCntrl per this); it also updates the genmode to enable multisampling for GX_PF_RGB565_Z16 (which probably doesn't explain EFB pokes breaking, but does explain other properties), and writes to bits 10 and 11 (the purpose of which is unknown and not documented by Dolphin) of BPMEM_CONSTANTALPHA (0x42, peCMode1) for GX_PF_Y8, GX_PF_U8, GX_PF_V8, as well as remapping Y8/U8/V8 to 4 and YUV420 to 5 (leaving pixel formats 6 and 7 completely unaccounted for). So that's something to look into in the future, though I don't know if any games actually use those pixel formats.

I hadn't noticed this PR used GX_ functions instead of building on CGX_, but you're running into similar issues that made me decide to avoid GX for hwtests.

I created CGX since libogc's GX isn't just a plain wrapper around the MMIO interface but rather adds a lot of implicit behavior that makes testing code rather unpredictable at times. Many of its API functions group register writes where you wouldn't expect them, and sometimes it defers register writes to a later point (with no clear pattern as to which function do so).

For these reasons, I advise against writing new tests with GX at all. If needed, it should be straightforward to extend CGX with new functions.

Pokechu22 commented 2 years ago

I hadn't noticed this PR used GX functions instead of building on CGX, but you're running into similar issues that made me decide to avoid GX for hwtests.

It actually doesn't, apart from functions related to EFB peeks and pokes which immediately read/write to the relevant PE registers or the EFB itself. I was making note of this more because it indicates additional things that we might need to write to if we want to experiment with those additional modes, and because the relevant enum values from BPMemory.h are possibly incorrect too.

It might make sense to add the EFB peek/poke functions to CGX (especially if it turns out the GX ones are just broken with regards to some pixel formats), but I don't think we need to do that just yet.

neobrain commented 2 years ago

It actually doesn't, apart from functions related to EFB peeks and pokes which immediately read/write to the relevant PE registers or the EFB itself. I was making note of this more because it indicates additional things that we might need to write to if we want to experiment with those additional modes, and because the relevant enum values from BPMemory.h are possibly incorrect too.

Ah right, apologies; I misread your comment then. When glancing over the diff again I spotted the peeks/pokes and assumed by extrapolation there'd be more :)

It might make sense to add the EFB peek/poke functions to CGX (especially if it turns out the GX ones are just broken with regards to some pixel formats), but I don't think we need to do that just yet.

Probably still makes sense to at least wrap them in a CGX API for the sake of explicitly "blessing" those functions as having been reviewed and as being safe to use.

Pokechu22 commented 2 years ago

Probably still makes sense to at least wrap them in a CGX API for the sake of explicitly "blessing" those functions as having been reviewed and as being safe to use.

Done. I've also added a parameter for the current pixel format to those wrappers, which currently goes unused but might be relevant in the future (e.g. if it turns out that RGB565_Z16 requires 16-bit reads/writes instead of 32-bit reads/writes)

Pokechu22 commented 2 years ago

I've modified the copy filter test to use Z-textures to make the depth copy test more complete (suggested by @phire).

This was a bit of a struggle because for some reason, the draw wasn't working at first on real hardware (but it did work on Dolphin). I eventually determined that this was due to use of VA_TYPE_POS_XY instead of VA_TYPE_POS_XYZ; changing to XYZ and specifying a valid depth value fixed it. Perhaps this was because the Z component was getting some other value that was out of the 0-1 range and clipping was removing the draw (I never changed the position matrix, so it might still use Z). I've confirmed that the new version of the test passes on both Dolphin and real hardware. (Unsquashed commits for this are here, but I've left the Z-texture commit as a separate one for ease of reversion if Z-pokes ever become usable.)

phire commented 2 years ago

Perhaps this was because the Z component was getting some other value that was out of the 0-1 range and clipping was removing the draw

I'm really getting the impression that anywhere that dolphin defaults a value to 0.0 or 1.0 because something is disabled is wrong.

Perhaps we should be giving dolphin a "homebrew mode" that logs errors whenever a CP/XF/BP mem configuration would result in uninitialised values being read.