JvanKatwijk / dab-cmdline

DAB decoding library with example of its use
GNU General Public License v2.0

viterbi: Add NEON SIMD using SSE2NEON #41

Closed athoik closed 6 years ago

athoik commented 6 years ago

It seems that SSE2NEON can successfully convert all SSE2 instructions to NEON.

This commit improves Viterbi decoding speed by more than 25%.

It was tested using verify_viterbi (infrastructure files from spiral.net).

Using scalar C the decoder speed was 2719.79 kbits/s. Using SSE2NEON with SSE 4-way the decoder speed is 3447.81 kbits/s.
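To illustrate the idea of mapping SSE2 intrinsics onto NEON ones, here is a minimal sketch. It is not the SSE2NEON.h source and not the decoder's code: the type `vec4i` and the functions `vec4i_zero`, `vec4i_load`, `vec4i_add`, `vec4i_store` and `add_metrics4` are hypothetical names chosen for this example. The only detail taken from this thread is that `_mm_setzero_si128` is implemented on top of `vdupq_n_s32(0)`. A scalar fallback is included so the sketch builds on targets with neither instruction set.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
/* NEON path: four 32-bit lanes in one 128-bit register. */
#include <arm_neon.h>
typedef int32x4_t vec4i;
static inline vec4i vec4i_zero(void)                 { return vdupq_n_s32(0); }
static inline vec4i vec4i_load(const int32_t *p)     { return vld1q_s32(p); }
static inline vec4i vec4i_add(vec4i a, vec4i b)      { return vaddq_s32(a, b); }
static inline void  vec4i_store(int32_t *p, vec4i v) { vst1q_s32(p, v); }
#elif defined(__SSE2__)
/* SSE2 path: the intrinsics a shim such as SSE2NEON has to emulate. */
#include <emmintrin.h>
typedef __m128i vec4i;
static inline vec4i vec4i_zero(void)                 { return _mm_setzero_si128(); }
static inline vec4i vec4i_load(const int32_t *p)     { return _mm_loadu_si128((const __m128i *)p); }
static inline vec4i vec4i_add(vec4i a, vec4i b)      { return _mm_add_epi32(a, b); }
static inline void  vec4i_store(int32_t *p, vec4i v) { _mm_storeu_si128((__m128i *)p, v); }
#else
/* Scalar fallback: same interface, one lane at a time. */
typedef struct { int32_t lane[4]; } vec4i;
static inline vec4i vec4i_zero(void)
{
    vec4i v = {{0, 0, 0, 0}};
    return v;
}
static inline vec4i vec4i_load(const int32_t *p)
{
    vec4i v;
    for (int i = 0; i < 4; i++) v.lane[i] = p[i];
    return v;
}
static inline vec4i vec4i_add(vec4i a, vec4i b)
{
    for (int i = 0; i < 4; i++) a.lane[i] += b.lane[i];
    return a;
}
static inline void vec4i_store(int32_t *p, vec4i v)
{
    for (int i = 0; i < 4; i++) p[i] = v.lane[i];
}
#endif

/* Adds four hypothetical branch metrics to four path metrics at once. */
static void add_metrics4(const int32_t path[4], const int32_t branch[4],
                         int32_t out[4])
{
    vec4i acc = vec4i_zero();
    acc = vec4i_add(acc, vec4i_load(path));
    acc = vec4i_add(acc, vec4i_load(branch));
    vec4i_store(out, acc);
}
```

The calling code is identical on all three paths, which is why a header-only translation layer can be dropped under existing SSE2 code without touching the algorithm.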

To use it, spiral-neon.h needs to be added to CMakeLists.txt. Also add the following definitions to CMakeLists.txt:

if(DEFINED NEON_AVAILABLE)
    add_definitions(-DNEON_AVAILABLE)
endif()

Finally, compile with the -DNEON_AVAILABLE flag.
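As a rough illustration of what such a definition does on the C side, the sketch below selects a backend at compile time. Only `NEON_AVAILABLE` comes from this pull request; the `SSE_AVAILABLE` macro and the function `viterbi_backend_name` are assumptions made for the example and may not match the project's sources.

```c
/* Reports which decoder backend this translation unit was built for.
 * NEON_AVAILABLE is the definition discussed above; SSE_AVAILABLE is an
 * assumed counterpart for x86 builds. */
static const char *viterbi_backend_name(void)
{
#if defined(NEON_AVAILABLE)
    return "neon";      /* built with -DNEON_AVAILABLE */
#elif defined(SSE_AVAILABLE)
    return "sse";       /* assumed x86 counterpart */
#else
    return "scalar";    /* plain C fallback */
#endif
}
```

Because the choice is made by the preprocessor, the definition has to reach the compiler for every source file that contains such a guard, which is what `add_definitions` in CMakeLists.txt takes care of.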

JvanKatwijk commented 6 years ago

It does not compile on Debian Stretch on an RPi 2.



-- Jan van Katwijk

+31 (0)15 3698980 +31 (0) 628260355

athoik commented 6 years ago

What parameters are you using?

I am using OpenEmbedded (cross-compiling to ARM) with the following GCC parameters:

arm-oe-linux-gnueabi-gcc -march=armv7-a -mfpu=neon -mfloat-abi=hard --sysroot=...

What error do you get?

JvanKatwijk commented 6 years ago

An issue with a macro expansion in SSE2NEON.h: some apparent inconsistency with a macro definition in arm_neon.h.


JvanKatwijk commented 6 years ago

In file included from SSE2NEON.h:123:0,
                 from spiral-neon.c:27:
SSE2NEON.h: In function ‘_mm_setzero_si128’:
/usr/lib/gcc/arm-linux-gnueabihf/6/include/arm_neon.h:5792:1: error: inlining failed in call to always_inline ‘vdupq_n_s32’: target specific option mismatch
 vdupq_n_s32 (int32_t __a)
 ^~~
In file included from spiral-neon.c:27:0:
SSE2NEON.h:312:33: note: called from here
  return vreinterpretq_m128i_s32(vdupq_n_s32(0));
SSE2NEON.h:230:3: note: in definition of macro ‘vreinterpretq_m128i_s32’
  (x)

That is the error. I really do not have a clue how to handle it.

athoik commented 6 years ago

Hi,

Did you enable the NEON GCC flags?

https://community.arm.com/tools/b/blog/posts/arm-cortex-a-processors-and-gcc-command-lines

JvanKatwijk commented 6 years ago

Oops, that was the trick
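For readers who hit the same "target specific option mismatch" error: GCC reports it when an always_inline NEON intrinsic is called from code that was compiled without NEON enabled. The sketch below shows one possible way to detect that situation from C, using the `__ARM_NEON` macro (and its older spelling `__ARM_NEON__`) that the compiler defines when NEON is enabled, for example via `-mfpu=neon` as in the command line quoted earlier. The function names are hypothetical and are not part of dab-cmdline or SSE2NEON.

```c
/* Returns 1 if this translation unit was compiled with NEON enabled,
 * 0 otherwise. */
static int neon_enabled_at_compile_time(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the build configuration is consistent: either the NEON
 * code path was not requested, or it was requested and the compiler
 * actually has NEON enabled. Returns 0 for the mismatch case, where
 * -DNEON_AVAILABLE was passed without the NEON compiler flags. */
static int neon_build_is_consistent(void)
{
#if defined(NEON_AVAILABLE)
    return neon_enabled_at_compile_time();
#else
    return 1;
#endif
}
```

In a real build the same condition could be turned into a preprocessor `#error`, so that a missing NEON flag fails with a readable message instead of an inlining error deep inside arm_neon.h.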
