dubhater / vapoursynth-mvtools

Motion compensation and stuff
aarch64 build fix and asm optimization (only autotools) #73

Closed Stefan-Olt closed 1 month ago

Stefan-Olt commented 1 month ago

This contains build fixes and asm (NEON) optimizations for aarch64, tested on Linux (Raspberry Pi 3) and macOS (Apple Silicon M1 Max). Like the x86 code, it uses NEON assembly taken from x264; in addition, the SSE2 intrinsics code in mvtools is converted to NEON using sse2neon. This may not be the best-performing solution, but it still gives a total speed-up of between 2x and 4x in real-world scenarios.
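
To illustrate the sse2neon approach described above, here is a minimal sketch of how SSE2 intrinsics code can be compiled unchanged on aarch64. It is not taken from this PR: the function name `sad16x16` and the `HAVE_SSE2_API` macro are hypothetical, and the sketch assumes `sse2neon.h` is on the include path when building for aarch64. If neither header is available, it falls back to plain C.

```c
#include <stdint.h>
#include <stdlib.h>

/* Pick the header that provides the _mm_* intrinsics.
 * On x86 this is the real SSE2 header; on aarch64 sse2neon.h
 * maps the same intrinsic names onto NEON. */
#if defined(__SSE2__) || defined(_M_X64)
#  include <emmintrin.h>
#  define HAVE_SSE2_API 1   /* hypothetical macro, for this sketch only */
#elif defined(__aarch64__) && defined(__has_include)
#  if __has_include("sse2neon.h")
#    include "sse2neon.h"
#    define HAVE_SSE2_API 1
#  endif
#endif

/* Hypothetical helper: sum of absolute differences over a 16x16
 * block of 8-bit pixels. Strides are in bytes. */
static unsigned sad16x16(const uint8_t *src, int src_stride,
                         const uint8_t *ref, int ref_stride)
{
#ifdef HAVE_SSE2_API
    __m128i acc = _mm_setzero_si128();
    for (int y = 0; y < 16; y++) {
        __m128i a = _mm_loadu_si128((const __m128i *)(src + y * src_stride));
        __m128i b = _mm_loadu_si128((const __m128i *)(ref + y * ref_stride));
        /* _mm_sad_epu8 yields two partial sums, one per 64-bit lane. */
        acc = _mm_add_epi64(acc, _mm_sad_epu8(a, b));
    }
    /* Fold the upper 64-bit lane into the lower one. */
    acc = _mm_add_epi64(acc, _mm_srli_si128(acc, 8));
    return (unsigned)_mm_cvtsi128_si32(acc);
#else
    /* Scalar fallback when no SSE2-style API is available. */
    unsigned sum = 0;
    for (int y = 0; y < 16; y++)
        for (int x = 0; x < 16; x++)
            sum += (unsigned)abs(src[y * src_stride + x] - ref[y * ref_stride + x]);
    return sum;
#endif
}
```

The point of the pattern is that the function body stays identical on both architectures; only the included header changes. Hand-written NEON can still be faster than the translated intrinsics, which matches the caveat above about this not necessarily being the best-performing solution.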