kalaluthien closed this 4 years ago
Below is a workaround for anyone hitting the same issue:
NEON_2_SSE.h (lines 769-773):
#if !defined(USE_SSE4)
_NEON2SSE_GLOBAL uint8x16_t vcleq_u8(uint8x16_t a, uint8x16_t b); // VCGE.U8 q0, q0, q0
_NEON2SSE_GLOBAL uint16x8_t vcleq_u16(uint16x8_t a, uint16x8_t b); // VCGE.U16 q0, q0, q0
_NEON2SSE_GLOBAL uint32x4_t vcleq_u32(uint32x4_t a, uint32x4_t b); // VCGE.U32 q0, q0, q0
#endif
Many thanks for reporting! I have reverted this commit; it will be fixed later on.
Sorry for introducing these errors, I forgot to test with SSE4 enabled. I have now created a new pull request (#44) that fixes this.
If anyone needs the patch to build TensorFlow, apply:
https://patch-diff.githubusercontent.com/raw/intel/ARM_NEON_2_x86_SSE/pull/44.patch
Here is the patch as of about 8 PM EDT 5/18/2020:
arm_neon_2_x86_sse-use_sse4.patch.gz
tensorflow/lite/tools/make/download_dependencies.sh
curl -sL https://patch-diff.githubusercontent.com/raw/intel/ARM_NEON_2_x86_SSE/pull/44.patch | patch -p1
The last commit (803a3d3c44b0ce81a1b5a312fa9d61879563dbb4) introduced a duplicate declaration error in the TensorFlow Lite runtime compilation (when USE_SSE4 is defined because __SSE4_2__ is defined). Because _NEON2SSESTORAGE == static, the static declaration follows a previous non-static (_NEON2SSE_GLOBAL) declaration. Can you check this? Sorry for not giving simple reproducible code...