abaire / nxdk_pgraph_tests

Tests to verify xemu handling of various pushbuffer commands

Enable optimization and use a faster PNG encoder #60

Closed mborgerson closed 2 years ago

mborgerson commented 2 years ago

Brings total runtime down considerably. Running everything except your depth format tests now takes about a minute in xemu; with the depth format tests included, it takes around 3.5 minutes. The new image encoder roughly cuts test time in half once optimizations are enabled. I'm sure there are several more improvement opportunities, but this gets the lowest-hanging fruit. Another easy win would be syncing with upstream nxdk and enabling LTO for possible further improvements.

I have not tested on hardware.

abaire commented 2 years ago

Looks like the new encoder is swapping B & R channels.

E.g., with this encoder: [image: T-d0 0_0 0_1 0_1 0-da]

But the correct output is: [image]

mborgerson commented 2 years ago

Oops

mborgerson commented 2 years ago

Fixed
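
For readers unfamiliar with this class of bug, here is a minimal sketch of a B/R channel swap. It is not the code from this PR. It assumes tightly packed 32-bit pixels stored as B, G, R, A bytes that must be reordered to R, G, B, A before PNG encoding, and the helper name `BgraToRgba` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not from the PR): returns a copy of the input with
// byte 0 (blue) and byte 2 (red) exchanged in every 4-byte pixel.
// Assumes tightly packed pixels; trailing bytes that do not form a full
// pixel are left unchanged.
std::vector<uint8_t> BgraToRgba(const std::vector<uint8_t>& bgra) {
  std::vector<uint8_t> rgba(bgra);
  for (std::size_t i = 0; i + 3 < rgba.size(); i += 4) {
    std::swap(rgba[i], rgba[i + 2]);  // green and alpha stay in place
  }
  return rgba;
}
```

The swap adds one extra pass over the pixel data, which is why it is worth re-checking the encoder's speed afterwards.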

abaire commented 2 years ago

Thanks!

Can you double check that the new encoder is still faster with the swizzling? I just did a non-scientific comparison between this commit and the current main branch and it seems about 50% slower to me when switching between tests (using the same xemu build, I haven't tested timing on HW yet).

abaire commented 2 years ago

I must've had a dirty build; just did a clean build with this PR and it's noticeably faster, as expected.
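
The informal timing comparison in this thread could be made repeatable with a small harness like the one below. It is a generic host-side sketch that assumes `std::chrono` is available in the build environment, and `AverageMillis` is a hypothetical helper, not part of this repository.

```cpp
#include <chrono>

// Hypothetical helper: runs `fn` `iterations` times and returns the mean
// wall-clock time per call in milliseconds. steady_clock is used because it
// is monotonic and unaffected by system clock adjustments.
template <typename Fn>
double AverageMillis(Fn&& fn, int iterations) {
  using Clock = std::chrono::steady_clock;
  const auto start = Clock::now();
  for (int i = 0; i < iterations; ++i) {
    fn();
  }
  const std::chrono::duration<double, std::milli> elapsed =
      Clock::now() - start;
  return elapsed.count() / iterations;
}
```

Running the same workload through such a helper on both branches, each after a clean build, would avoid the stale-build confusion described above.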