Added simd_utils to translate x86_64 SIMD intrinsics to ARM
Added Ruy as the GEMM implementation
Added dependencies: Ruy, simd_utils
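To illustrate the kind of translation the simd_utils item refers to, here is a minimal sketch of a compile-time shim that maps one small set of 4-wide float operations onto either NEON or SSE. The names `vec4f`, `load4`, `add4`, `store4`, `HAVE_VEC4` and `add_arrays` are hypothetical and are not taken from simd_utils or from this PR; only the underlying intrinsics (`vld1q_f32`, `vaddq_f32`, `vst1q_f32`, `_mm_loadu_ps`, `_mm_add_ps`, `_mm_storeu_ps`) are real.

```cpp
#include <cassert>
#include <cstddef>

// Pick a 4-wide float backend at compile time.
#if defined(__ARM_NEON) || defined(__aarch64__)
#include <arm_neon.h>
using vec4f = float32x4_t;  // hypothetical alias, not from simd_utils
static inline vec4f load4(const float* p) { return vld1q_f32(p); }
static inline vec4f add4(vec4f a, vec4f b) { return vaddq_f32(a, b); }
static inline void store4(float* p, vec4f v) { vst1q_f32(p, v); }
#define HAVE_VEC4 1
#elif defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
using vec4f = __m128;
static inline vec4f load4(const float* p) { return _mm_loadu_ps(p); }
static inline vec4f add4(vec4f a, vec4f b) { return _mm_add_ps(a, b); }
static inline void store4(float* p, vec4f v) { _mm_storeu_ps(p, v); }
#define HAVE_VEC4 1
#else
#define HAVE_VEC4 0  // no known SIMD backend: scalar loop only
#endif

// Element-wise out[i] = a[i] + b[i], using the 4-wide path where available.
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
  assert(n == 0 || (a && b && out));
  std::size_t i = 0;
#if HAVE_VEC4
  for (; i + 4 <= n; i += 4) {
    store4(out + i, add4(load4(a + i), load4(b + i)));
  }
#endif
  // Scalar tail (and full fallback when no SIMD backend was selected).
  for (; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}
```

A translation layer such as simd_utils takes the other route and keeps the x86 intrinsic names, so existing SSE code compiles unchanged; the sketch above only shows the guard-and-dispatch idea.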
How to test
Download a model and run it:
nik@nikserver2:~/enes$ ~/marian-dev/build/marian-decoder -m model1.npz --mini-batch 1 --maxi-batch 1 -v vocab.esen.spm vocab.esen.spm --quiet --quiet-translation
Hi there, I am running on an ARM PC.
Hola, estoy corriendo en un PC ARM.
TODO
Use Ruy for int8 inference on ARM
Check how to make it work on ARMv7 (currently hardcoded to ARMv8; might need a different SIMD-to-ARM library)
Maybe get rid of OpenBLAS? Ruy doesn't have full sgemm support, but we fake it. We need Ruy for 8-bit inference, and FAISS requires a lot of BLAS routines (hence OpenBLAS)
Add more proper architecture guards?
We are currently not vectorising as much as we could: ARM SIMD is only used for the SSE code path, although we could technically do the AVX one too. Native ARM intrinsics might also be faster.
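As a rough illustration of what "faking" sgemm on top of a narrower kernel can look like, the sketch below emulates `C = alpha * A * B + beta * C` using a backend that only computes a plain product. This is an assumption about the general approach, not the code in this PR: `plain_mul` is a naive stand-in for the backend (it is not Ruy's API), `fake_sgemm` is a hypothetical name, and transposed operands are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a backend kernel that only computes dst = lhs * rhs.
// Row-major; lhs is m x k, rhs is k x n, dst is m x n. Hypothetical.
static void plain_mul(const float* lhs, const float* rhs, float* dst,
                      std::size_t m, std::size_t n, std::size_t k) {
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (std::size_t p = 0; p < k; ++p) {
        acc += lhs[i * k + p] * rhs[p * n + j];
      }
      dst[i * n + j] = acc;
    }
  }
}

// Emulates C = alpha * A * B + beta * C with the plain kernel above.
void fake_sgemm(float alpha, const float* A, const float* B,
                float beta, float* C,
                std::size_t m, std::size_t n, std::size_t k) {
  assert(A && B && C);
  // Compute the raw product into a temporary so C's old values survive.
  std::vector<float> prod(m * n);
  plain_mul(A, B, prod.data(), m, n, k);
  for (std::size_t i = 0; i < m * n; ++i) {
    // BLAS convention: with beta == 0 the old contents of C are not read.
    const float old_part = (beta == 0.0f) ? 0.0f : beta * C[i];
    C[i] = alpha * prod[i] + old_part;
  }
}
```

The temporary buffer costs an extra m x n allocation per call; a real wrapper would likely skip it when `alpha == 1` and `beta == 0`, which is the common inference case.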
Description
RFC on the initial ARM port work.
Checklist