WMXZ-EU closed 7 years ago
Could you please share the compiler flags used to compile the arm_fir_decimate_q15 routine?
arm-none-eabi-gcc -mcpu=cortex-m4 -march=armv7e-m -mthumb -mlittle-endian -mfloat-abi=hard -mfpu=fpv4-sp-d16 -O3 -ffunction-sections -fdata-sections -g -DMK66FX1M0 -DARM_MATH_CM4 -D__FPU_PRESENT -DUSB_SERIAL -DLAYOUT_US_ENGLISH -DTEENSYDUINO -DARDUINO=10600 -DF_CPU=240000000 -I../src -I"C:\Users\Walter\CMSIS\Core\Include" -I"C:\Users\Walter\CMSIS\DSP\Include" -I../uSD/src -I"C:\Users\Walter\Documents\GitHub\cores\teensy3" -I"C:\Users\Walter\Documents\Arduino\libraries\wmxzCore\src" -std=gnu11 -Wa,-adhlns="src/testFir.o.lst" -MMD -MP -MF"src/testFir.d" -MT"src/testFir.o" -c -o "src/testFir.o" "../src/testFir.c"
I compile it for an MK66FX1M0 (Teensy 3.6 by PJRC.com).
BTW, the library was downloaded on 25th October.
I copied the _q15, _q31, and _f32 versions into testFir.c. With a 240 MHz clock, 256 points of data, and a 129-point FIR with a decimation factor of 8, I got the following times in microseconds:
type   library (µs)   compiled (µs)
q15    129            66
q31    132            137
f32    101            109
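As a rough cross-check of the table above, the timings can be converted to cycles per multiply-accumulate (MAC). This is only a sketch, assuming the core really runs at the 240 MHz given by F_CPU and that each output sample costs one 129-term dot product. The function names are made up for illustration and are not part of CMSIS.

```c
#include <assert.h>

/* Workload from the post: 256 input samples, 129 taps, decimation by 8. */
enum { BLOCK_SIZE = 256, NUM_TAPS = 129, DECIMATION = 8 };

/* MACs per block: one NUM_TAPS-term dot product per output sample. */
unsigned macs_per_block(void) {
    return (BLOCK_SIZE / DECIMATION) * NUM_TAPS;
}

/* Cycles per MAC for a measured time in microseconds at clock_mhz.
   Microseconds multiplied by MHz gives CPU cycles. */
double cycles_per_mac(double time_us, double clock_mhz) {
    assert(time_us > 0.0 && clock_mhz > 0.0);
    return (time_us * clock_mhz) / macs_per_block();
}

/* How many times slower the library build is than the user build. */
double slowdown(double library_us, double compiled_us) {
    assert(compiled_us > 0.0);
    return library_us / compiled_us;
}
```

With these assumptions one block is 32 outputs times 129 taps, or 4128 MACs. The library q15 time of 129 µs works out to 7.5 cycles per MAC, the user-compiled 66 µs to roughly 3.8, and `slowdown(129.0, 66.0)` is a little under 2.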
What are the compiler flags for generating the libraries?
The compiler flags used to build the GCC M4lf DSP library are: -mcpu=cortex-m4 -mthumb -gdwarf-2 -MD -Wall -O3 -fno-strict-aliasing -ffunction-sections -fdata-sections -mfpu=fpv4-sp-d16 -mfloat-abi=hard -ffp-contract=off -DARMCM4_FP -DARM_MATH_CM4 -DARM_MATH_MATRIX_CHECK -DARM_MATH_ROUNDING -DUNALIGNED_SUPPORT_DISABLE -D__FPU_PRESENT="1U"
You can also check the uVision project to build the GCC DSP libraries.
The project is part of CMSIS: C:\Keil\ARM\PACK\ARM\CMSIS
After adding -gdwarf-2 -fno-strict-aliasing -ffp-contract=off to the compiler flags, the execution times of my build match the library, except for decimate_q15, where my build takes half the execution time of the library function. Consequently, there is no need for the library. THANKS for helping. I am closing the issue.
Using the arm_fir_decimate_q15 routine from libarm_cortexM4lf_math.a is about 2 times slower than the same routine compiled by the user. The other decimate routines (q31, f32) perform very similarly (the library routines are slightly faster than the application-compiled ones, most likely due to different compiler flags).
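For readers who want to see what kind of work is being timed, here is a plain, portable q15 FIR decimator. It is a textbook reference sketch, not the CMSIS source: the name fir_decimate_q15_ref, the zero initial history, and the choice of which input sample lines up with each output are assumptions, so its output phase may differ from arm_fir_decimate_q15.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturate a wide accumulator to the signed 16-bit (q15) range. */
int16_t sat_q15(int64_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Hypothetical reference decimator:
   y[n] = sum_k h[k] * x[n*m - k], with samples before x[0] taken as zero.
   Writes block_size / m outputs to y and returns that count. */
size_t fir_decimate_q15_ref(const int16_t *h, size_t num_taps,
                            const int16_t *x, size_t block_size,
                            size_t m, int16_t *y) {
    assert(m > 0 && block_size % m == 0);
    size_t n_out = block_size / m;
    for (size_t n = 0; n < n_out; n++) {
        int64_t acc = 0;          /* wide accumulator, no intermediate overflow */
        size_t newest = n * m;    /* index of the newest input for this output */
        for (size_t k = 0; k < num_taps && k <= newest; k++) {
            acc += (int32_t)h[k] * (int32_t)x[newest - k];
        }
        /* q15 * q15 is q30; shift back to q15 (assumes arithmetic right shift). */
        y[n] = sat_q15(acc >> 15);
    }
    return n_out;
}
```

Called with 129 taps, 256 input samples, and m = 8, it produces 32 outputs, which is the workload shape reported in this thread. Only one output in m is computed, so the inner loop dominates the run time.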