dracc / NevolutionX

Original Xbox dashboard created with nxdk
MIT License
121 stars, 21 forks

menu: Makes XBE scanner cooperative instead of async. #106

Closed abaire closed 2 years ago

abaire commented 2 years ago

Testing on real hardware, I found that calling FindFirstFile on a secondary thread stalls for 20+ seconds (on my v1.0 box). Specifically, the underlying call to NtOpenFile blocks for a substantial amount of time. This seems to be consistent for any (existing) directory, regardless of its contents. Changing to an explicit CreateThread-based approach instead of using std::thread did not resolve the problem, nor did changing the sync-related flags on the NtOpenFile call.

Until I can figure out why this is happening, I've converted XBEScanner to operate in a cooperative manner. The main loop polls the scanner and gives an approximate time limit for the scan. Theoretically this timeout is based on framerate, but it seemed like I was only getting ~20fps on hardware, so I've made it scan at least one file per frame and set the target to 15 fps.
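For readers who want the shape of the cooperative approach described above, here is a minimal sketch, not the code from this PR. The names `CooperativeScanner`, `poll`, `scanOne`, and `runUntilScanned` are hypothetical, and the per-file work is a placeholder. It only illustrates the "scan at least one file per frame, then stop once the time budget is spent" rule.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scanner that is polled from the main loop
// instead of running on its own thread.
class CooperativeScanner {
public:
  using Clock = std::chrono::steady_clock;

  explicit CooperativeScanner(std::vector<std::string> pending)
      : pending_(std::move(pending)) {}

  bool done() const { return next_ >= pending_.size(); }
  std::size_t processed() const { return next_; }
  const std::vector<std::string>& results() const { return results_; }

  // Scans at least one pending item, then keeps scanning until the budget
  // has elapsed. Returns true once nothing is left to scan.
  bool poll(std::chrono::milliseconds budget) {
    assert(budget.count() >= 0);
    const auto deadline = Clock::now() + budget;
    do {
      if (done()) {
        return true;
      }
      scanOne(pending_[next_++]);
    } while (Clock::now() < deadline);
    return done();
  }

private:
  // Placeholder for the real per-file work (assumption: the real scanner
  // would open the XBE and read its metadata here).
  void scanOne(const std::string& path) { results_.push_back(path); }

  std::vector<std::string> pending_;
  std::vector<std::string> results_;
  std::size_t next_ = 0;
};

// Sketch of a main loop: one poll per frame, with rendering in between.
// Returns the number of frames it took to finish scanning.
inline std::size_t runUntilScanned(CooperativeScanner& scanner,
                                   std::chrono::milliseconds budget) {
  std::size_t frames = 1;
  while (!scanner.poll(budget)) {
    // A real dashboard would render the menu here before the next poll.
    ++frames;
  }
  return frames;
}
```

With a 15 fps target, the per-frame budget would be at most `std::chrono::milliseconds(1000 / 15)` minus whatever rendering costs. Because `poll` always scans one item before checking the clock, the scan still makes progress when rendering alone already exceeds the frame time.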

I also added some debugging output so it's clear how much time the scan takes. I suspect users with large libraries will be pretty unhappy with the current performance, so hopefully there are some easy improvements to be had.

dracc commented 2 years ago

Oh, 20fps? Guess I should speed up my hardware-based renderer work soon.

abaire commented 2 years ago

Yeah, hardware rendering would be excellent :)

What I'm seeing is suspiciously steady, so I'm wondering if it's a combination of the yield call and vsync causing it to be low. On xemu, if I remove the yield it jumps up to ~60 fps, but on hardware it seems to consistently stabilize at ~15 fps. I'll continue to poke around later to see if there's something obvious.

(Also just added #107 to make it easier to watch on hardware)

abaire commented 2 years ago

I think there's something more interesting going on with the threaded performance problem (I can't repro in a trivial test app), so I'm moving this to draft.

abaire commented 2 years ago

Did some more debugging and I think this is just thread starvation. My test app failed to reproduce because I was using pbkit; once I switched to using SDL (and left out the yield to simulate more costly rendering), I got the same behavior.

I'm going to drop this PR and instead have a temporary fix that just keeps the scanner on the main thread. It will mean longer startup times (on my XBOX with 160 XBEs to scan it takes ~8 seconds), but that's a much better user experience than having a navigable menu but having to wait ~2 minutes for everything to scan.

I'm optimistic that moving to HW accelerated rendering will allow us to turn threading back on and get the best of both worlds.