Closed jg1uaa closed 1 year ago
Thanks @jg1uaa.
@tmiw - is this OK with you? Not sure how to test it myself on my machine.
I can't test right now, but in theory you should be able to build with cmake -DAVX=0 -DAVX2=0 to force SSE.
That said, based on the code it doesn't look like it'll hurt anything as we've been mandating AVX for 2020 modes anyway.
Thanks for the latest commit @jg1uaa. @tmiw - feel free to merge if/when you feel the PR is OK (no rush).
OK, currently testing. So far, it looks like the proposed command does work to force SSE:
Mooneer6MBP2461:build mooneer$ cmake -DAVX=0 -DAVX2=0 ..
-- The C compiler identification is AppleClang 14.0.3.14030022
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- LPCNet version: 0.5
-- freedv-gui current git hash: 97a0df1
-- Host system arch is: x86_64
-- Looking for available CPU optimizations on an OSX system...
-- sse processor flags found or enabled.
-- Compilation date = XX20230831XX
-- Configuring done (14.0s)
-- Generating done (0.4s)
-- Build files have been written to: /Users/mooneer/devel/LPCNet/build
Mooneer6MBP2461:build mooneer$
Currently building codec2 based on this PR and will report back when ctests are done.
100% tests passed, 0 tests failed out of 138. If curious, here are some timings from my system (2019 MBP) when forcing LPCNet to use SSE:
Start 36: test_OFDM_modem_2020_ldpc
36/138 Test #36: test_OFDM_modem_2020_ldpc ...................... Passed 0.23 sec
Start 43: test_OFDM_modem_2020B_AWGN
43/138 Test #43: test_OFDM_modem_2020B_AWGN ..................... Passed 0.08 sec
Start 76: test_freedv_api_2020_to_ofdm_demod
76/138 Test #76: test_freedv_api_2020_to_ofdm_demod ............. Passed 0.09 sec
Start 77: test_freedv_api_2020_from_ofdm_mod
77/138 Test #77: test_freedv_api_2020_from_ofdm_mod ............. Passed 0.06 sec
Start 78: test_freedv_api_2020_awgn
78/138 Test #78: test_freedv_api_2020_awgn ...................... Passed 0.53 sec
Start 79: test_freedv_api_2020B_mpp
79/138 Test #79: test_freedv_api_2020B_mpp ...................... Passed 2.86 sec
Start 98: test_freedv_reliable_text_ideal_2020
98/138 Test #98: test_freedv_reliable_text_ideal_2020 ........... Passed 22.61 sec
Start 99: test_freedv_reliable_text_awgn_2020
99/138 Test #99: test_freedv_reliable_text_awgn_2020 ............ Passed 23.28 sec
Start 100: test_freedv_reliable_text_fade_2020
100/138 Test #100: test_freedv_reliable_text_fade_2020 ............ Passed 24.76 sec
Anyway, merging now. Thanks @jg1uaa for the contribution!
vec_avx.h already has SSE (4.1) support, so there is no longer any need to use vec_sse.h.