catid / gf256

GF256 - Fast 8-bit Galois Field Math in C
BSD 3-Clause "New" or "Revised" License

Various fixes for ARM/NEON (and some additions) #4

Closed Buanderie closed 6 years ago

Buanderie commented 6 years ago

First, with NEON enabled, gf256_muladd_mem didn't pass the self-tests. Fixed by replacing vshrq_n_u8 with vshrq_n_u64, since vshrq_n_u64 (not vshrq_n_u8) is the NEON equivalent of _mm_srli_epi64.
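
To illustrate the difference between the two shifts, here is a minimal scalar sketch that models one 64-bit half of the register. The helper names `shr_per_byte` and `shr_per_lane` are hypothetical and are not part of gf256; they only mimic what `vshrq_n_u8` and `vshrq_n_u64` / `_mm_srli_epi64` do to the bits.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of vshrq_n_u8 on 8 packed bytes: every byte is shifted
 * on its own, so no bits cross a byte boundary. (Hypothetical helper.) */
static uint64_t shr_per_byte(uint64_t x, unsigned n)
{
    assert(n >= 1 && n <= 7);
    uint64_t out = 0;
    for (unsigned i = 0; i < 8; ++i) {
        uint8_t b = (uint8_t)(x >> (8 * i));
        out |= (uint64_t)(uint8_t)(b >> n) << (8 * i);
    }
    return out;
}

/* Scalar model of vshrq_n_u64 / _mm_srli_epi64 on one 64-bit lane:
 * the low bits of each byte move into the byte below it. (Hypothetical helper.) */
static uint64_t shr_per_lane(uint64_t x, unsigned n)
{
    assert(n >= 1 && n <= 63);
    return x >> n;
}
```

For the input `0x1234` and a shift of 4, the per-byte model yields `0x0103` while the 64-bit lane model yields `0x0123`.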

I also needed to add uint8_t* casts everywhere vld1q_u8 or vst1q_u8 is called. Dirty, but it works and keeps my compiler happy. I'm not sure it's the right thing to do, though.

Last, vqtbl1q_u8 is only available if ASIMD is available, so I added a NEON equivalent. It should be slower than its ASIMD counterpart, of course, but it is still faster than no NEON optimization (more than a 2x speed increase on my Freescale SoC).
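
One common way to emulate a 16-byte table lookup on 32-bit NEON is to split the table and the indices into 64-bit halves and use `vtbl2_u8`. The sketch below shows that approach; it is not necessarily the code in this PR, and the names `tbl16_neon` and `tbl16_scalar` are hypothetical. The scalar version documents the expected semantics (out-of-range indices produce 0) and compiles on any platform.

```c
#include <stdint.h>

#if defined(__ARM_NEON) && !defined(__aarch64__)
#include <arm_neon.h>
/* Hypothetical stand-in for vqtbl1q_u8 on ARMv7 NEON:
 * two 8-byte lookups into a 16-byte table held as a register pair. */
static inline uint8x16_t tbl16_neon(uint8x16_t table, uint8x16_t idx)
{
    uint8x8x2_t t;
    t.val[0] = vget_low_u8(table);
    t.val[1] = vget_high_u8(table);
    return vcombine_u8(vtbl2_u8(t, vget_low_u8(idx)),
                       vtbl2_u8(t, vget_high_u8(idx)));
}
#endif

/* Scalar reference for the same lookup: out[i] = table[idx[i]],
 * or 0 when idx[i] is outside the 16-entry table. */
static void tbl16_scalar(const uint8_t table[16], const uint8_t idx[16],
                         uint8_t out[16])
{
    for (int i = 0; i < 16; ++i)
        out[i] = (idx[i] < 16) ? table[idx[i]] : 0;
}
```

The split costs two lookups plus a combine per 16 bytes, which is consistent with the expectation that it is slower than the single ASIMD instruction.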

Also added a runtime NEON support check for the LINUX_ARM platform (the macro name might be misleading; maybe find another one).
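
As a rough sketch of what such a runtime check can look like on 32-bit ARM Linux, the kernel exposes CPU features through the auxiliary vector, which glibc makes available via `getauxval`. The function name `cpu_has_neon` is hypothetical, and this is an assumption about one possible approach rather than the code in this PR (reading `/proc/cpuinfo` is another option).

```c
#include <stdbool.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP (glibc 2.16 or later) */
#include <asm/hwcap.h>  /* HWCAP_NEON */
#ifndef HWCAP_NEON
#define HWCAP_NEON (1 << 12)  /* value used by the 32-bit ARM Linux kernel */
#endif
#endif

/* Hypothetical helper: returns true when NEON can be used at runtime. */
static bool cpu_has_neon(void)
{
#if defined(__aarch64__)
    /* Assumption: AArch64 builds target CPUs with Advanced SIMD. */
    return true;
#elif defined(__linux__) && defined(__arm__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    /* Not an ARM Linux target: report no NEON. */
    return false;
#endif
}
```

A caller would typically run this once at initialization and select the NEON or the plain C code path based on the result.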

Since I could not test on Android or iOS, I'm not sure whether this breaks anything...

catid commented 6 years ago

LGTM. I'll try to run it on Android/iOS this weekend and get that working too (if any changes are needed).

catid commented 6 years ago

It sounds like a cool project with some low-power devices. So what are you working on if you don't mind me asking?

Buanderie commented 6 years ago

Long-range communication from a quadcopter. Real-time video streaming (simple Cauchy Reed-Solomon for now) and snapshot retrieval (LT fountain codes)... I'm still looking for better alternatives, especially for the video streaming part. I'm trying to get siamese to work for this purpose, but I don't have much time to look into it :) Is there any use for FEC at Facebook/Oculus, btw? Or are you coding this mostly for fun?

catid commented 6 years ago

Currently not using or developing any of my software at work. It's a hobby.

Buanderie commented 6 years ago

I see... Well, I hope my contributions have been useful :)

catid commented 6 years ago

Yeah, absolutely. I wasn't planning on testing the platform you're on, so that's a great contribution! You rock! =)