alex-ong / NESTrisOCR

OCR for statistics in NESTris

OSX: Convert window capture to RGB at source #10

Closed timotheeg closed 4 years ago

timotheeg commented 4 years ago

The RGBA window capture in OSX is confusing njit (RGBA pixel tuples have 4 entries, while RGB tuples have only 3).

I had initially added the convert('RGB') in ImageCanvas, but it's best to do it at the source, at capture time in OSX, for compatibility with everything else thereafter.

The Calibrator app still works, and njit gives the same performance boost in OSX as it does in windows 👍

In case you're wondering, I did try setting the raw format supplied to Image.frombuffer() to RGB, but that doesn't work. The conversion must be done after capture :/
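For reference, a minimal sketch of what source-side conversion looks like (the Quartz capture call itself is elided; `capture_to_rgb` and the BGRA raw layout are assumptions for illustration, not the exact code in the PR):

```python
from PIL import Image

def capture_to_rgb(raw_bgra_bytes, size):
    # Hypothetical helper: wrap the raw 4-bytes-per-pixel capture buffer,
    # then convert to RGB immediately so everything downstream
    # (njit OCR, ImageCanvas) only ever sees 3-entry pixel tuples.
    img = Image.frombuffer("RGBA", size, raw_bgra_bytes, "raw", "BGRA", 0, 1)
    return img.convert("RGB")
```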

alex-ong commented 4 years ago

Nice; do you know the performance hit of RGBA -> RGB? I'm guessing it's a C++ memcpy, which should be <1ms, but just wondering.

timotheeg commented 4 years ago

I just ran some measurements capturing a window of size (950, 750) on my computer. For a window of that size, capture takes 7 to 8ms, and convert('RGB') takes between 0.5ms and 1ms. It's quite fast but non-negligible :(.
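A rough way to reproduce that kind of measurement (a sketch, not the actual benchmark used; the numbers will vary by machine):

```python
import time
from PIL import Image

def time_convert(img, runs=20):
    # Average the RGBA -> RGB conversion over several runs and
    # report milliseconds per convert.
    start = time.perf_counter()
    for _ in range(runs):
        img.convert("RGB")
    return (time.perf_counter() - start) / runs * 1000

window = Image.new("RGBA", (950, 750))
print(f"convert('RGB'): {time_convert(window):.2f} ms")
```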

alex-ong commented 4 years ago

Ahh ok, and quartz capture has a single-threaded lock, so calling it from 4 different processes with tiny segments just locks it and takes 32ms, instead of 1-4ms if they could run simultaneously, right?

timotheeg commented 4 years ago

I believe so, yeah (educated guess based on the numbers, I did zero actual research 😅). But PIL is very fast, which is why in the previous version of NESTrisOCR, I could do a single capture and then cut pieces out with PIL. On my old mac (which is what I stream with), I could just get it down to 16ms total processing. On that note, multithreading isn't working for me on OSX, but I haven't checked why. In any case, I'll need to redo that on top of master at some point.

[Not related to the PR, but since I'm typing.] I did have board scanning too, but only luma, since I only needed block counting to detect various events (basically giving me a board representation of 0s and 1s only). For NESTris99, the color detection you added is great, but to support Das Trainer, I might need a funny changeset 🤔. Das Trainer doesn't have piece stats, so there's no reliable location to get references for color1 and color2. I was thinking I could keep everything the same, but in a Das Trainer mode I'd get a color reference array of (black, white) rather than (black, white, color1, color2). That would make it not a very nice NESTris99 citizen, but it would work for my setup 🤔

Or maybe another way would be to lock onto a black and white region and interpolate color1 and color2 from the known color values for the current level? Hmm. I don't know how well that'd work. I might give it a try.

I'm just thinking about it for now; as usual, I'm very slow to actually do anything, so it might be a while till I have any results to share 😅

alex-ong commented 4 years ago

1) PIL is indeed very fast. I was thinking of refactoring to a single thread that captures the "fullscreen" (well, the biggest rectangle that contains everything) into a Queue, followed by multiple threads that crop out bits from it. That would result in no race conditions (currently you can get states where score + lines don't match because they were read in different game frames), and potentially better frame timing.

2) Known color values don't work with a fixed lookup table, because every single person who captures will have a different capture card + brightness/tints/hues/saturation. If you generate a lookup table (i.e. make them manually go through levels 0->9 at some point), then it would work. Using the previews is good because it skips building this lookup table.
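The single-capture-thread idea from (1) could be sketched roughly like this (all names, region boxes, and the `grab_fullscreen` callable are hypothetical, not NESTrisOCR's actual structure):

```python
import queue
from PIL import Image

frames = queue.Queue(maxsize=1)

# Hypothetical regions of interest, in full-capture coordinates.
REGIONS = {"score": (10, 10, 110, 30), "lines": (10, 40, 110, 60)}

def capture_loop(grab_fullscreen):
    # A single thread owns the (serialized) capture call; worker
    # threads never touch the capture API directly.
    while True:
        frames.put(grab_fullscreen())

def crop_all(frame):
    # Every crop comes from the same captured frame, so score and
    # lines can never disagree across game frames.
    return {name: frame.crop(box) for name, box in REGIONS.items()}
```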

timotheeg commented 4 years ago

> well biggest rectangle that contains everything

Yep, that's exactly what I did :) And yup, removing the race condition sounds great!

For 2, it wouldn't be a strict lookup table. The code you have is already doing "find nearest color to (white, black, color1, color2)", and I'm thinking that fuzzy logic can be leveraged. Knowing the local rendering values of black and white (these 2 would more or less capture the brightness/tints/hues/saturation settings), it might be possible to interpolate color1 and color2 from a reference table of colors by level. Hopefully, the computed values would still work in the "nearest color" match algorithm.
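That interpolation idea might look something like this (a sketch only; the per-channel linear map is an assumption about how the brightness/tint adjustment could be modelled, not code from NESTrisOCR):

```python
def interpolate_color(reference_color, local_black, local_white):
    # Per-channel linear map taking (0, 0, 0) -> local_black and
    # (255, 255, 255) -> local_white; a known palette color for the
    # current level is pushed through the same transform, so it picks
    # up the capture's local brightness/tint.
    return tuple(
        round(lb + (lw - lb) * (c / 255))
        for c, lb, lw in zip(reference_color, local_black, local_white)
    )
```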

Even in the current code, color1 and color2 are computed with antialiased scaling, and they produce washed-out colors that work fine. I was checking the values the other day: the computed antialiased colors are quite far from the board ones, but the fuzzy match on nearest color still works beautifully.
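The nearest-color match being described can be as simple as a squared-Euclidean argmin over the reference set (a sketch; the actual NESTrisOCR implementation may differ):

```python
def nearest_color(pixel, palette):
    # Fuzzy match: pick the palette entry with the smallest squared
    # Euclidean distance. Washed-out captures still land on the right
    # entry as long as the relative ordering of distances is preserved.
    return min(palette, key=lambda ref: sum((p - r) ** 2 for p, r in zip(pixel, ref)))
```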

I'll give it a go sometime and see what I get :)