nrkno / sofie-atem-connection

Sofie ATEM Connection: A Part of the Sofie TV Studio Automation System
https://github.com/nrkno/Sofie-TV-automation/
MIT License
129 stars 36 forks source link

feat: native convertRGBAToYUV422 implementation #152

Closed Julusian closed 7 months ago

Julusian commented 1 year ago

feature

When uploading still images to the ATEM, they must be YUVA422 encoded. This convetsion is being done from RGBA in JS, which is suboptimal.

This provides an compiled implementation of this function.

On some test machines I get the following improvement for a 1080p image.

i7 8650u (linux)
js:         60-80
rust-simd:  20

ryzen 7950x (linux)
js:         21 
rust-simd:  7

m1 mini (macos)
js:         19
rust-simd:  4.5

There are some tests confirming that it is producing the exact same output as before. I think the js implementation is missing some rounding (instead doing a floor), but that is not critical to resolve right now.


While this being rust has a burden, it is of very limited scope as it is only doing some image processing. This area of the library is particularly stable, with the last code change being over 3 years ago being some reformatting, with the last functional change being some refactoring almost 4 years ago.

I don't think that introducing a native component of this library is a problem either, as we already depend on a native library which was added to draw multiviewer labels. So this will not reduce platform compatibility (other than possibly needing to add some more build targets). For optimal compatibility, the js implementation remains, so if this is used on an unsupported platform it will seamlessly fallback to js.

To avoid it being a burden for all developers/contributors to this library, it has been pulled out to be its own library and repository. This will allow 99% of development of this library to be done without needing a rust compiler, with the full setup needed solely to modify the image processing code.

This is based on some work I did for https://github.com/julusian/node-image-rs which did some alterations from the default napi-rs setup, to include all the prebuilds in the one npm package.

The rust code resides at https://github.com/Julusian/atem-connection-image-tools https://www.npmjs.com/package/@atem-connection/image-tools. This may want to be adopted into the nrkno organisation and sofie set of libraries to ensure it can be maintained properly.


TODO:

codecov-commenter commented 1 year ago

Codecov Report

Patch coverage has no change and project coverage change: -0.02% :warning:

Comparison is base (af3b0b2) 86.28% compared to head (4cb8d59) 86.27%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #152 +/- ## ========================================== - Coverage 86.28% 86.27% -0.02% ========================================== Files 180 180 Lines 5812 5813 +1 Branches 957 957 ========================================== Hits 5015 5015 - Misses 775 776 +1 Partials 22 22 ``` | [Files Changed](https://app.codecov.io/gh/nrkno/sofie-atem-connection/pull/152?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nrkno) | Coverage Δ | | |---|---|---| | [src/lib/atemUtil.ts](https://app.codecov.io/gh/nrkno/sofie-atem-connection/pull/152?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nrkno#diff-c3JjL2xpYi9hdGVtVXRpbC50cw==) | `62.58% <0.00%> (-0.43%)` | :arrow_down: |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Julusian commented 1 year ago

Reworking it to use simd intrinsic in rust brings the execution down to ~20ms. I feel like this could be optimised further, as I am having to do some type conversions at a couple of points that I suspect should be possible to avoid

Julusian commented 1 year ago

I also attempted writing this in c/c++ with simd, which brought the time down to 15ms.

For completeness, I ran the test on a couple of other machines (number in ms):

i7 8650u
js:         60-80
rust:       40
rust-simd:  20
cpp-simd:   15

7950x
js:         21 
rust:       9
rust-simd:  7
cpp-simd:   6

m1 mini
js:         19
rust:       5
rust-simd:  4.5
cpp-simd:   8.5 (non-native simd)

----
another day:

i7 8650u
js:              100-110
rust-wasm:       50-55
rust-wasm-simd:  45-50

The improvement is still there, but is less due to them being generally faster machines.

I think the m1 mini cpp-simd result is invalid, it was running the same x86_64 simd running through simde which I expect resulting in it doing some inefficient polyfilling.
This proves that making this code efficient on all platforms will be a large burden, which is avoidable with the rust.

So while the C/C++ performs better, I am an not confident that the increasingly small performance benefit is worth the maintenance/development cost of cross platform/architecture simd.
C/C++ does have the benefit of needing more standard tooling, but has the compatibility issues, as well as needing an external library unless we want to only support native simd. (x86_64 and arm64 would be enough, with a js fallback)

Rust has the benefit of being an easier ecosystem to setup, you simply need rustup installed, and everything else will be installed. It does require a nightly compiler, which could result in some needed changes in the future as the api gets stabilised. And it will only support a more limited set of platforms, based on what the compiler supports and what supports the simd it is compiled with. (this should be explored, to see if it can gracefully degrade to js instead of crashing). We don't particularly benefit from the memory safety of rust here, the slice bounds checking is probably what is hurting our performance here.


I think this is worth continuing with. Probably best to continue with it being an external library and repository, so that it has minimal/no impact on development of this library directly. That library should attempt to use a native version, and fallback to js (or wasm?) implementations for optimal compatibility.

I am still not 100% sure on rust vs c++ for this, but I think the ease of compatibility from rust is worth the slight performance penalty. Hopefully this will help keep needed maintenance to a minimum.

Julusian commented 7 months ago

Replaced by https://github.com/nrkno/sofie-atem-connection/pull/160