tvlabs / edge264

Simple H.264 decoder
BSD 3-Clause "New" or "Revised" License

Are there any plans to optimise for NEON arm64 instructions? #9

Open freedbrt opened 3 months ago

freedbrt commented 3 months ago

Is this project optimised only for SSE3 instructions? Are there any benchmarks for arm64 processors?

traffaillac commented 3 months ago

Yes! At the moment I am busy with job applications as an academic researcher; after that I'll be unemployed until September and will work on making edge264 as production-ready as can be.

The project is optimized for SSSE3, with a few key speedups for SSE4.2 and AVX2. Unlike other decoders, I do not maintain multiple versions of each SIMD routine (e.g. for SSE2, SSSE3, AVX2, AVX512). This avoids maintenance nightmares, and the extra versions would not bring much speedup anyway. So ARM support will certainly be limited to NEON rather than SVE, unless I can find a programming language that allows writing one version of the code and compiling it to perfect NEON and SVE assembly :)
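To illustrate the "one version per routine" idea, here is a minimal sketch of how a single routine could sit on top of a thin per-architecture wrapper layer. This is not code from edge264: the names `u8x16`, `load128`, `store128`, `avg_u8`, `HAVE_U8X16` and `avg_pixels` are hypothetical, and the rounded byte average is chosen only because it is a simple operation with a direct SSE2 (`_mm_avg_epu8`) and NEON (`vrhaddq_u8`) equivalent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portability layer: one vector type and three helpers,
 * mapped to NEON or SSE2 depending on what the compiler targets. */
#if defined(__ARM_NEON)
#include <arm_neon.h>
typedef uint8x16_t u8x16;
static inline u8x16 load128(const uint8_t *p) { return vld1q_u8(p); }
static inline void store128(uint8_t *p, u8x16 v) { vst1q_u8(p, v); }
static inline u8x16 avg_u8(u8x16 a, u8x16 b) { return vrhaddq_u8(a, b); }
#define HAVE_U8X16 1
#elif defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i u8x16;
static inline u8x16 load128(const uint8_t *p) { return _mm_loadu_si128((const __m128i *)p); }
static inline void store128(uint8_t *p, u8x16 v) { _mm_storeu_si128((__m128i *)p, v); }
static inline u8x16 avg_u8(u8x16 a, u8x16 b) { return _mm_avg_epu8(a, b); }
#define HAVE_U8X16 1
#else
#define HAVE_U8X16 0 /* no known SIMD target: scalar loop only */
#endif

/* Single routine shared by all targets: dst[i] = (a[i] + b[i] + 1) >> 1. */
void avg_pixels(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(dst && a && b);
    size_t i = 0;
#if HAVE_U8X16
    /* Process 16 bytes per iteration with unaligned loads and stores. */
    for (; i + 16 <= n; i += 16)
        store128(dst + i, avg_u8(load128(a + i), load128(b + i)));
#endif
    /* Scalar tail, which is also the whole loop when no SIMD is available. */
    for (; i < n; i++)
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
}
```

With this layout only the small wrapper section differs between x86 and ARM, and the routine itself is written once. Whether such a wrapper gives good enough code for every routine in a real decoder is an open question, and a vector-length-agnostic target such as SVE would not fit the fixed 128-bit type assumed here.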