Closed Buanderie closed 6 years ago
LGTM. I'll try to run it on Android/iOS this weekend and get that working too (if any changes are needed).
It sounds like a cool project with some low-power devices. So what are you working on if you don't mind me asking?
Long-range communication from a quadcopter: real-time video streaming (simple Cauchy Reed-Solomon for now) and snapshot retrieval (LT fountain codes)... I'm still looking for better alternatives, especially for the video streaming part. I'm trying to get Siamese to work for this purpose. I don't have much time to look into it :) Is there any use for FEC at Facebook/Oculus, btw? Or are you coding this mostly for fun?
I'm currently not using or developing any of my software at work. It's a hobby.
I see... Well I hope my contributions have been useful :)
Yeah, absolutely. I wasn't planning on testing the platform you're on, so that's a great contribution! You rock! =)
First, with NEON enabled, gf256_muladd_mem didn't pass the self-tests. Fixed by replacing vshrq_n_u8 with vshrq_n_u64, since vshrq_n_u64 (not vshrq_n_u8) is the NEON equivalent of _mm_srli_epi64.
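To illustrate the difference between the two shifts, here is a portable scalar sketch (not code from this PR). The helper names `shr_bytes` and `shr_lane64` are hypothetical. The first models a per-byte shift such as vshrq_n_u8, the second models a 64-bit lane shift such as _mm_srli_epi64 or vshrq_n_u64, assuming a little-endian byte order within the lane.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: per-byte logical right shift (s < 8).
   Models what vshrq_n_u8 does to each 8-bit lane: no bits move between bytes. */
static void shr_bytes(uint8_t *dst, const uint8_t *src, size_t n, unsigned s)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = (uint8_t)(src[i] >> s);
}

/* Hypothetical helper: logical right shift of one 64-bit lane (s < 64).
   Models _mm_srli_epi64 / vshrq_n_u64 on a single little-endian lane:
   the low bits of byte i+1 slide into the high bits of byte i. */
static void shr_lane64(uint8_t dst[8], const uint8_t src[8], unsigned s)
{
    uint64_t lane = 0;

    /* Assemble the lane, byte 0 being the least significant. */
    for (int i = 7; i >= 0; --i)
        lane = (lane << 8) | src[i];

    lane >>= s;

    /* Store the shifted lane back, least significant byte first. */
    for (int i = 0; i < 8; ++i) {
        dst[i] = (uint8_t)(lane & 0xff);
        lane >>= 8;
    }
}
```

For the bytes 0x12, 0x34 shifted by 4, the per-byte version yields 0x01 in the first byte, while the 64-bit lane version yields 0x41 because the low nibble of 0x34 carries over. The two only agree once the result is masked with 0x0f.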
I also needed to add uint8_t* casts everywhere vld1q_u8 or vst1q_u8 is called. Dirty, but it works and keeps my compiler happy. I'm not sure it's the right thing to do, though.
Last, vqtbl1q_u8 is only available when ASIMD is available, so I added a NEON equivalent. It should be slower than its ASIMD counterpart, of course, but it is still faster than having no NEON optimization at all (more than a 2x speed increase on my Freescale SoC).
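One common way to build such an equivalent is to split the 16-byte lookup into two 8-byte vtbl2_u8 lookups. The sketch below is an assumption about how that can be done, not necessarily the code in this PR. The names `tbl16_neon` and `tbl16` are hypothetical, and a scalar fallback is included so the block also compiles on targets without NEON.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: 16-entry byte table lookup using only ARMv7 NEON.
   vtbl2_u8 looks up 8 indices in a 16-byte table held as two 64-bit halves,
   so the 128-bit lookup is done as two 64-bit lookups and recombined. */
static inline uint8x16_t tbl16_neon(uint8x16_t table, uint8x16_t idx)
{
    uint8x8x2_t t;
    t.val[0] = vget_low_u8(table);
    t.val[1] = vget_high_u8(table);
    return vcombine_u8(vtbl2_u8(t, vget_low_u8(idx)),
                       vtbl2_u8(t, vget_high_u8(idx)));
}
#endif

/* Hypothetical wrapper: out[i] = table[idx[i]], or 0 when idx[i] >= 16,
   which matches the out-of-range behavior of vqtbl1q_u8 and vtbl2_u8. */
static void tbl16(uint8_t out[16], const uint8_t table[16], const uint8_t idx[16])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    vst1q_u8(out, tbl16_neon(vld1q_u8(table), vld1q_u8(idx)));
#else
    /* Scalar fallback with the same semantics, for non-NEON builds. */
    for (int i = 0; i < 16; ++i)
        out[i] = idx[i] < 16 ? table[idx[i]] : 0;
#endif
}
```

The two-lookup form costs extra instructions compared with a single vqtbl1q_u8, which is consistent with the expectation above that it runs slower than the ASIMD version.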
Also added a runtime check for NEON support on the LINUX_ARM platform (the macro name might be misleading; maybe find another one).
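For reference, one way to do a runtime NEON check on 32-bit ARM Linux is to query the kernel's hardware capability bits with getauxval. This is a sketch under that assumption and may differ from what the PR actually does. The function name `cpu_has_neon` is hypothetical, and the AArch64 branch assumes Advanced SIMD is always present there.

```c
#include <stdbool.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP (glibc 2.16 or later) */
#include <asm/hwcap.h>  /* HWCAP_NEON on 32-bit ARM */
#ifndef HWCAP_NEON
#define HWCAP_NEON (1 << 12)  /* value used by the 32-bit ARM Linux kernel */
#endif
#endif

/* Hypothetical helper: reports whether NEON can be used at runtime. */
static bool cpu_has_neon(void)
{
#if defined(__aarch64__)
    /* Assumption: Advanced SIMD is available on AArch64 Linux targets. */
    return true;
#elif defined(__linux__) && defined(__arm__)
    /* Ask the kernel which CPU features it exposes to this process. */
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    /* Not an ARM Linux build: report no NEON. */
    return false;
#endif
}
```

A caller would typically evaluate this once at initialization and pick the NEON or the plain code path accordingly.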
Since I could not test on Android or iOS, I'm not sure whether this breaks anything there...