Closed fwcd closed 3 months ago
cc @acolombier
This band-aid solution looks good to me. However we should fix the route cause. Instead of making the GPU swapping the bytes all the time, we should pass the original colors in the order of the GPU at one time. Can you file an issue for it? Maybe also a table that indicates when the issue happens.
This band-aid solution looks good to me. However we should fix the route cause. Instead of making the GPU swapping the bytes all the time, we should pass the original colors in the order of the GPU at one time. Can you file an issue for it? Maybe also a table that indicates when the issue happens.
Not sure I understand you. This is no issue here as such. Reverse the color will swap the order of the RGB components, while endianess will swap the order of the bits. This is due to the fact that some some devices (like the s4 mk3) needs both. Making the byte transformation in the transform function will take about 30ms vs <1ms when done in the GPU on my hardware
Reverse the color will swap the order of the RGB components, while endianess will swap the order of the bits.
This is the same, since each color has one byte.
I have not fully understand the issue, but my original idea was to have one version with swapped colors so that the wrong endianess will swap it back without extra time.
like the s4 mk3 needs both.
Interesting, where is border, how does the code deal with it?
This is the same, since each color has one byte.
No it doesn't. It have 5, 6 and 5 bits split over 2 bytes.
Ah OK, I have dived through the code and understand now. The only solution is see is to render the picture natively with ES and than use QImage to convert the pixel format.
We have anyway here two deep copies here:
We may consider to let the GPU do the mirroring and we also consider to use two QImages to get around the memory allocation in copy()
What do you think?
We may consider to let the GPU do the mirroring
Yes, definitely. I didn't know how to do so with OpenGL, thus why this nasty QImage hack, but 100% could use some help to do that properly.
use two QImages to get around the memory
The .copy()
is a bit tricky: when running with TSAN, there was some flagging about the const QImage&
being passed across thread (since the connection maybe across multiple threads). I'm not sure QImage is the best object to emit there, but also could think of a lightweight solution. Perhaps QPixmap
and its implicit shared data would be a better fit for purpose here, if we don't need any QImage manipulation here?
Even with another container the frame rate speed reallocation issue still applies. A clean workaround would be a two image solution. Than just pass a const the pointer the the latest version and use a mutex to prevent updating the new image while accessing it in another thread.
Than just pass a const the pointer the the latest version and use a mutex to prevent updating the new image while accessing it in another thread.
That's what I meant when I mentioned "no lightweight solution" :) There may be multiple threads connecting to the signal (e.g GUI+Controller), meaning you will need some kind of refcounter to unlock the mutex, which
Just for context as I'm not sure if you had a chance to check the original PR, but this was a large piece of work, and took me many months of daily involvement to get it over the line, so I didn't look into macro optimisations in order to contain the amount of change and potential issues.
Not saying there is no flaws here, it can definitely be better, but currently the frame rendering on my laptop is around 100us, well under the 15ms needed for 60fps, so it might be worth not putting the effort of making that refactor till we get user reporting issues with performance.
Some pixel formats are unavailable in OpenGL ES, so I patched them out and added corresponding errors. If there is a better way to handle this without disabling the entire controller renderer (which largely seems to use APIs that are available in OpenGL ES), let me know.