JuliaDSP / DSP.jl

Filter design, periodograms, window functions, and other digital signal processing functionality
https://docs.juliadsp.org/dev/
Other
374 stars 108 forks source link

add inbounds to src/Filters/design.jl #541

Closed Jafagervik closed 4 months ago

Jafagervik commented 5 months ago

Avoid unnecessary boundary checking

Fixes #540

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (4213e9a) 97.56% compared to head (a318e1d) 97.56%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #541 +/- ## ======================================= Coverage 97.56% 97.56% ======================================= Files 18 18 Lines 3125 3125 ======================================= Hits 3049 3049 Misses 76 76 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

ViralBShah commented 5 months ago

Is there a meaningful performance benefit?

Jafagervik commented 5 months ago

Yes. I can try to profile where the compiler turns of blundary checking, but this macro is indeed powerful, and since I'm already using my own version of DSP.jl for my master thesis on ai and signal processing in julia, I thought I'd might as well contribute! I will do some more benches as soon as possible, but I'm quite busy these days

ViralBShah commented 5 months ago

Thank you for the contribution, and good to know that you are already using this in your code.

A quick benchmark in this PR will help serve as documentation for the change. Generally, the preference is not to add @inbounds because people have used it incorrectly and aggressively in the past. But if the benefit is clear, it is definitely worthwhile to use it.

ViralBShah commented 5 months ago

@wheeheee The windows failures seem codecov upload problems. I think the change to fail on codecov errors is perhaps aggressive. Would reverting it fix the error here (which is not in the package at all)?

wheeheee commented 5 months ago

I think it's fine actually; I always expand the failed checks and prefer false positives to silent failures. Didn't comment before since I wanted to benchmark first, but that's for tomorrow...

It's weird, though, that Codecov says their server errors are fixed.

wheeheee commented 5 months ago

Did some benchmarks for Butterworth, Chebyshev1, Chebyshev2, and Elliptic, and digitalfilter. For these functions, I don't see any significant performance differences on Julia 1.10, and similarly on 1.6.

T = Float64, n = 5 ```julia julia> T = Float64; n = 5; julia> @benchmark Butterworth($T, $n) # PR BenchmarkTools.Trial: 10000 samples with 651 evaluations. Range (min … max): 189.401 ns … 3.459 μs ┊ GC (min … max): 0.00% … 91.17% Time (median): 194.009 ns ┊ GC (median): 0.00% Time (mean ± σ): 207.057 ns ± 164.544 ns ┊ GC (mean ± σ): 4.54% ± 5.34% ▃▆█▇▄▂ ▁ ███████▇▆▆▇▇▅▆▇▇▇▇▆▅▄▅▆▇▇██▇▆▆▇▆▆▆▅▆▆▄▄▅▄▄▅▅▅▄▂▄▅▄▅▄▅▄▄▄▄▂▄▃▃ █ 189 ns Histogram: log(frequency) by time 290 ns < Memory estimate: 240 bytes, allocs estimate: 3. julia> @benchmark Butterworth($T, $n) # master BenchmarkTools.Trial: 10000 samples with 651 evaluations. Range (min … max): 189.401 ns … 4.607 μs ┊ GC (min … max): 0.00% … 87.49% Time (median): 193.241 ns ┊ GC (median): 0.00% Time (mean ± σ): 207.237 ns ± 165.130 ns ┊ GC (mean ± σ): 4.50% ± 5.32% ▄██▅▃▂▁ ▁ ▁ ██████████▆▇███▇█▇▇▇▇▇▇▇█▆▆▆▇▅▆▅▅▅▅▅▅▅▅▄▄▄▆▅▅▅▄▄▅▅▄▅▄▄▄▄▄▄▃▃▄ █ 189 ns Histogram: log(frequency) by time 299 ns < Memory estimate: 240 bytes, allocs estimate: 3. julia> @benchmark Chebyshev1($T, $n, $2.1) # PR BenchmarkTools.Trial: 10000 samples with 259 evaluations. Range (min … max): 298.842 ns … 8.293 μs ┊ GC (min … max): 0.00% … 94.97% Time (median): 303.089 ns ┊ GC (median): 0.00% Time (mean ± σ): 320.355 ns ± 283.496 ns ┊ GC (mean ± σ): 3.24% ± 3.53% ▅█▇▃▁▁ ▁ ▁▂▂▂ ▂ ███████▇▆▇▆▇▇██████████▇▇▆▆▆▆▇▆▇▇▇▅▆▆▆▆▆▅▆▆▅▅▅▄▅▄▅▄▄▃▃▁▄▄▄▁▄▄ █ 299 ns Histogram: log(frequency) by time 450 ns < Memory estimate: 256 bytes, allocs estimate: 4. julia> @benchmark Chebyshev1($T, $n, $2.1) # master BenchmarkTools.Trial: 10000 samples with 264 evaluations. Range (min … max): 296.970 ns … 8.127 μs ┊ GC (min … max): 0.00% … 93.98% Time (median): 301.894 ns ┊ GC (median): 0.00% Time (mean ± σ): 317.789 ns ± 265.608 ns ┊ GC (mean ± σ): 3.06% ± 3.52% ▂▇█▆▃▁ ▁ ▁ ▁ ████████▇▅▅▄▄▅▃▅▄▇██▇███████▇▆▇▆▅▆▅▅▆▄▅▅▅▅▅▅▅▆▅▆▄▅▄▃▃▅▄▄▃▃▂▃▃ █ 297 ns Histogram: log(frequency) by time 416 ns < Memory estimate: 256 bytes, allocs estimate: 4. julia> @benchmark Chebyshev2($T, $n, $2.1) # PR BenchmarkTools.Trial: 10000 samples with 214 evaluations. Range (min … max): 351.402 ns … 5.741 μs ┊ GC (min … max): 0.00% … 90.91% Time (median): 356.542 ns ┊ GC (median): 0.00% Time (mean ± σ): 371.667 ns ± 192.131 ns ┊ GC (mean ± σ): 1.90% ± 3.49% ▅█▆▃▁ ▁▁ ▁ ███████▇▆▆▇▆▇█▆▆▆▆▅▆▆▆▇█████▇▆▆▅▆▅▆▆▅▆▅▄▅▃▄▄▄▄▄▄▄▁▃▄▄▃▃▄▅▅▄▅▅ █ 351 ns Histogram: log(frequency) by time 545 ns < Memory estimate: 320 bytes, allocs estimate: 4. julia> @benchmark Chebyshev2($T, $n, $2.1) # master BenchmarkTools.Trial: 10000 samples with 214 evaluations. Range (min … max): 352.804 ns … 5.445 μs ┊ GC (min … max): 0.00% … 90.65% Time (median): 357.477 ns ┊ GC (median): 0.00% Time (mean ± σ): 372.132 ns ± 185.588 ns ┊ GC (mean ± σ): 1.86% ± 3.49% ▅█▇▄▂▁▁▁ ▁▂ ▁ ██████████▆▆▅▆▅▅▆███▆▇▆▆▆▆▆▅▅▅▆▆▇██▇█▇▆▆▅▄▅▄▅▅▅▅▅▄▅▅▃▅▄▅▄▃▅▄▄ █ 353 ns Histogram: log(frequency) by time 493 ns < Memory estimate: 320 bytes, allocs estimate: 4. julia> @benchmark Elliptic($T, $n, $1, $2) # PR BenchmarkTools.Trial: 10000 samples with 10 evaluations. Range (min … max): 1.380 μs … 121.990 μs ┊ GC (min … max): 0.00% … 95.83% Time (median): 1.560 μs ┊ GC (median): 0.00% Time (mean ± σ): 1.658 μs ± 1.235 μs ┊ GC (mean ± σ): 0.71% ± 0.96% ▄ ▃█ █▅▂▁▂▅▂██▅▃▂▃▆▃▂▃▃▄▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂ 1.38 μs Histogram: frequency by time 2.57 μs < Memory estimate: 528 bytes, allocs estimate: 5. julia> @benchmark Elliptic($T, $n, $1, $2) # master BenchmarkTools.Trial: 10000 samples with 10 evaluations. Range (min … max): 1.370 μs … 130.560 μs ┊ GC (min … max): 0.00% … 96.74% Time (median): 1.410 μs ┊ GC (median): 0.00% Time (mean ± σ): 1.496 μs ± 1.302 μs ┊ GC (mean ± σ): 0.84% ± 0.97% ██ ███▄▂▂▁▂▄▆▄▂▁▁▃▂▃▄▃▂▁▁▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂ 1.37 μs Histogram: frequency by time 2.04 μs < Memory estimate: 528 bytes, allocs estimate: 5. ```
T = Float64, n = 20 ```julia julia> T = Float64; n = 20; julia> @benchmark Butterworth($T, $n) # PR BenchmarkTools.Trial: 10000 samples with 223 evaluations. Range (min … max): 332.287 ns … 5.996 μs ┊ GC (min … max): 0.00% … 91.17% Time (median): 337.220 ns ┊ GC (median): 0.00% Time (mean ± σ): 358.970 ns ± 249.445 ns ┊ GC (mean ± σ): 3.28% ± 4.45% ▆█▆▃▂▃▄▁ ▁▁ ▁ ████████▇▆▇████▇▆▆▇▆▆▅▆▆▄▆▆▅▅▆▆▅▄▅▅▅▆▆▆▆▃▅▅▄▅▂▄▄▃▅▆▇▆▇▆▄▄▄▃▄▄ █ 332 ns Histogram: log(frequency) by time 523 ns < Memory estimate: 496 bytes, allocs estimate: 3. julia> @benchmark Butterworth($T, $n) # master BenchmarkTools.Trial: 10000 samples with 235 evaluations. Range (min … max): 322.979 ns … 6.572 μs ┊ GC (min … max): 0.00% … 89.39% Time (median): 327.234 ns ┊ GC (median): 0.00% Time (mean ± σ): 348.547 ns ± 237.311 ns ┊ GC (mean ± σ): 3.26% ± 4.52% ▇█▅▁▁▂▃▁ ▁▂▁ ▁ ████████▇▇█████▇▇█▇▆▆▆▇▆▆▄▆▆▆▅▆▅▅▆▅▇█▇▆▆▆▅▆▅▄▅▄▄▃▄▄▃▃▄▃▄▃▃▅▄▃ █ 323 ns Histogram: log(frequency) by time 507 ns < Memory estimate: 496 bytes, allocs estimate: 3. julia> @benchmark Chebyshev1($T, $n, $2.1) # PR BenchmarkTools.Trial: 10000 samples with 199 evaluations. Range (min … max): 426.131 ns … 7.531 μs ┊ GC (min … max): 0.00% … 91.70% Time (median): 433.166 ns ┊ GC (median): 0.00% Time (mean ± σ): 458.110 ns ± 299.829 ns ┊ GC (mean ± σ): 2.96% ± 4.26% ▅█▆▄▃▂▁ ▁ ▁ ███████▇▇▇▆▇██▇▇▇▇▇▇▆▆▆▄▆▆▅▇██▇▇▅▆▅▄▃▄▅▄▄▄▂▂▃▄▃▃▃▃▄▄▃▄▂▄▄▃▃▃▅ █ 426 ns Histogram: log(frequency) by time 675 ns < Memory estimate: 512 bytes, allocs estimate: 4. julia> @benchmark Chebyshev1($T, $n, $2.1) # master BenchmarkTools.Trial: 10000 samples with 199 evaluations. Range (min … max): 432.663 ns … 7.446 μs ┊ GC (min … max): 0.00% … 90.91% Time (median): 439.698 ns ┊ GC (median): 0.00% Time (mean ± σ): 463.778 ns ± 284.656 ns ┊ GC (mean ± σ): 2.77% ± 4.23% ▅█▇▄▄▃▂▁ ▁▂ ▂▁ ▂ ████████████▇████▇▇▇▇▇▇▇▇▆▅▆▆▇█████▇▇▇▅▅▅▄▆▅▆▅▅▄▁▅▅▃▃▄▄▅▃▄▄▃▄ █ 433 ns Histogram: log(frequency) by time 656 ns < Memory estimate: 512 bytes, allocs estimate: 4. julia> @benchmark Chebyshev2($T, $n, $2.1) # PR BenchmarkTools.Trial: 10000 samples with 171 evaluations. Range (min … max): 628.655 ns … 5.420 μs ┊ GC (min … max): 0.00% … 81.78% Time (median): 638.596 ns ┊ GC (median): 0.00% Time (mean ± σ): 666.492 ns ± 216.780 ns ┊ GC (mean ± σ): 1.62% ± 4.45% ▆█▆▅▃▂▁▁ ▁ ▁▁ ▁ ██████████████▇▇▇▇▇▆▆▆▆▅▆▇▇▆▆▆▆▅▅▅▆▆▆▆▇▆▄▄▃▄▃▃▄▃▂▃▃▄▃▄▄▃▃▃▃▄▃ █ 629 ns Histogram: log(frequency) by time 987 ns < Memory estimate: 848 bytes, allocs estimate: 4. julia> @benchmark Chebyshev2($T, $n, $2.1) # master BenchmarkTools.Trial: 10000 samples with 176 evaluations. Range (min … max): 606.818 ns … 4.291 μs ┊ GC (min … max): 0.00% … 80.83% Time (median): 616.477 ns ┊ GC (median): 0.00% Time (mean ± σ): 637.933 ns ± 181.670 ns ┊ GC (mean ± σ): 1.46% ± 4.49% ▄█▇▄▅▂▁ ▁▁ ▂ ████████▇▇▆▆▆▆███▇▇▆▇▆▅▆▁▄▄▅▆▆▆▆▆▅▆▄▄▅▅▅▄▆▆▆▇█▆▅▄▅▄▄▅▄▄▁▅▃▃▅▄ █ 607 ns Histogram: log(frequency) by time 905 ns < Memory estimate: 848 bytes, allocs estimate: 4. julia> @benchmark Elliptic($T, $n, $1, $2) # PR BenchmarkTools.Trial: 10000 samples with 8 evaluations. Range (min … max): 3.362 μs … 127.925 μs ┊ GC (min … max): 0.00% … 93.63% Time (median): 3.500 μs ┊ GC (median): 0.00% Time (mean ± σ): 3.640 μs ± 1.300 μs ┊ GC (mean ± σ): 0.33% ± 0.94% ██▇▁▄▃ ▄▃▁▄▆▇▆▄▃▂▁▁ ▁ ▁ ▂ ██████▆▆██████████████▇█▇▇█▇▇▇▇▇█▇▇▇▇▇▆▄▅▅▆▅▅▄▅▆▄▅▅▃▃▅▄▃▄▄▅ █ 3.36 μs Histogram: log(frequency) by time 5.09 μs < Memory estimate: 1.03 KiB, allocs estimate: 5. julia> @benchmark Elliptic($T, $n, $1, $2) # master BenchmarkTools.Trial: 10000 samples with 8 evaluations. Range (min … max): 3.350 μs … 94.287 μs ┊ GC (min … max): 0.00% … 92.06% Time (median): 3.400 μs ┊ GC (median): 0.00% Time (mean ± σ): 3.502 μs ± 975.978 ns ┊ GC (mean ± σ): 0.25% ± 0.92% ▃▇██▄▃▁ ▂▁ ▂▂▃▃▃ ▃▃▃▂▃ ▂▁ ▂ ███████▆███▅▄█████▇▇██████████▇▇▆▆▇▄▆▄▄▆▄▅▄▄▅▆▅▅▆▄▅▄▅▅▄▅▄▂▄ █ 3.35 μs Histogram: log(frequency) by time 4.3 μs < Memory estimate: 1.03 KiB, allocs estimate: 5. ```
T = Float64, n = 10_000 ```julia julia> T = Float64; n = 10_000; julia> @benchmark Butterworth($T, $n) # PR BenchmarkTools.Trial: 10000 samples with 1 evaluation. Range (min … max): 91.600 μs … 3.293 ms ┊ GC (min … max): 0.00% … 94.99% Time (median): 94.700 μs ┊ GC (median): 0.00% Time (mean ± σ): 112.243 μs ± 82.190 μs ┊ GC (mean ± σ): 3.83% ± 5.13% ▂█▅ ▁▃ ▃▅▄▂▁ ▁▁ ▁ ███▇█▇██▇▅▅▅▅▆▅▄▄▅▅▄▆████████▇██▇▆▆▆▇▆▆▇▆▆▆▆██▆▆▄▅▅▆▅▅▄▄▅▅▄▃ █ 91.6 μs Histogram: log(frequency) by time 187 μs < Memory estimate: 156.41 KiB, allocs estimate: 5. julia> @benchmark Butterworth($T, $n) # master BenchmarkTools.Trial: 10000 samples with 1 evaluation. Range (min … max): 92.300 μs … 3.223 ms ┊ GC (min … max): 0.00% … 94.91% Time (median): 94.700 μs ┊ GC (median): 0.00% Time (mean ± σ): 105.481 μs ± 68.349 μs ┊ GC (mean ± σ): 3.03% ± 4.88% ▅█▃ ▂ ▂▃▂ ▁ █████████▅▆▆▄▅▄▃▅▄▅▄▄▅██████▇▆▇▆▇█▆▆▆▅▆▆▆▆▅▅▄▄▇▅▅▅▅▄▄▄▄▃▄▃▄▄ █ 92.3 μs Histogram: log(frequency) by time 185 μs < Memory estimate: 156.41 KiB, allocs estimate: 5. julia> @benchmark Chebyshev1($T, $n, $2.1) # PR BenchmarkTools.Trial: 10000 samples with 1 evaluation. Range (min … max): 98.300 μs … 5.296 ms ┊ GC (min … max): 0.00% … 95.52% Time (median): 100.600 μs ┊ GC (median): 0.00% Time (mean ± σ): 111.360 μs ± 74.718 μs ┊ GC (mean ± σ): 2.85% ± 4.82% ▆█▁▁▁ ▁ ▃▃▂▁▁ ▁▁ ▁ ████████▆▅▄▆▆▅▅▆▅▄▆███████▇███▆▆▆▆▇▇▆▆▆▆▅▆▅▅▅▅▅▅▅▅▃▁▄▄▄▅▇▆▆▆ █ 98.3 μs Histogram: log(frequency) by time 205 μs < Memory estimate: 156.42 KiB, allocs estimate: 6. julia> @benchmark Chebyshev1($T, $n, $2.1) # master BenchmarkTools.Trial: 10000 samples with 1 evaluation. Range (min … max): 98.200 μs … 3.129 ms ┊ GC (min … max): 0.00% … 94.34% Time (median): 100.900 μs ┊ GC (median): 0.00% Time (mean ± σ): 111.306 μs ± 67.067 μs ┊ GC (mean ± σ): 2.83% ± 4.84% ▂█▇ ▁ ▁▁ ▄▃▂ ▁ ██████████▇▅▆▅▅▅▄▅▄▅▅▃▄▄█████▇▇▇▇▇▆▆▆▆▅▅▄▅▄▄▆▆▆▅▃▄▄▅▅▇▇▄▅▄▃▅ █ 98.2 μs Histogram: log(frequency) by time 181 μs < Memory estimate: 156.42 KiB, allocs estimate: 6. julia> @benchmark Chebyshev2($T, $n, $2.1) # PR BenchmarkTools.Trial: 10000 samples with 1 evaluation. Range (min … max): 183.000 μs … 3.321 ms ┊ GC (min … max): 0.00% … 92.13% Time (median): 187.000 μs ┊ GC (median): 0.00% Time (mean ± σ): 210.085 μs ± 97.362 μs ┊ GC (mean ± σ): 3.16% ± 6.36% ▅█▂▂▁ ▂ ▄▄▁ ▁ █████▇██▇▇▆▆▇▇▆▆▅▄▄▃▅▅▇█████▇▇▆▇▇██▆▆▆▆▆▆▆▅▅▅▆▅▆▆▅▅▅▄▃▄▃▄▄▃▄ █ 183 μs Histogram: log(frequency) by time 361 μs < Memory estimate: 312.66 KiB, allocs estimate: 7. julia> @benchmark Chebyshev2($T, $n, $2.1) # master BenchmarkTools.Trial: 10000 samples with 1 evaluation. Range (min … max): 180.600 μs … 3.430 ms ┊ GC (min … max): 0.00% … 91.64% Time (median): 187.900 μs ┊ GC (median): 0.00% Time (mean ± σ): 218.443 μs ± 108.970 μs ┊ GC (mean ± σ): 3.50% ± 6.58% █▇▃▂ ▂▃▁ ▄▅▃▂ ▁▁ ▁ ▆█████████▇▇▆▇▆▇▅▆▂▆████████████▆▇█▇▇▆▇▆▄▆█▆▅▅▃▄▄▅▄▅▄▃▄▃▃▃▄▄▃ █ 181 μs Histogram: log(frequency) by time 393 μs < Memory estimate: 312.66 KiB, allocs estimate: 7. julia> @benchmark Elliptic($T, $n, $1, $2) # PR BenchmarkTools.Trial: 3613 samples with 1 evaluation. Range (min … max): 1.319 ms … 4.440 ms ┊ GC (min … max): 0.00% … 67.69% Time (median): 1.335 ms ┊ GC (median): 0.00% Time (mean ± σ): 1.381 ms ± 132.695 μs ┊ GC (mean ± σ): 0.54% ± 3.49% ▃▁█▄▂▁▁ ▄▃ ▁ ▂ ████████████▇██████▇▇▇▇█▇██▇▆▇▆▄▇▅▅▆▅▇█▆▄▄▆▅▅▄▅▂▅▂▃▂▂▄▃▂▂▂▃ █ 1.32 ms Histogram: log(frequency) by time 1.68 ms < Memory estimate: 312.86 KiB, allocs estimate: 8. julia> @benchmark Elliptic($T, $n, $1, $2) # master BenchmarkTools.Trial: 3644 samples with 1 evaluation. Range (min … max): 1.330 ms … 4.476 ms ┊ GC (min … max): 0.00% … 67.69% Time (median): 1.336 ms ┊ GC (median): 0.00% Time (mean ± σ): 1.369 ms ± 117.290 μs ┊ GC (mean ± σ): 0.56% ± 3.55% ▇█▄▂▂▂▁▁▁ ▁ ▄▄▂▂▁ ▁ ▁ █████████████▇█████████▆█▇█▇▇▅▇██▇▇▅▄▄▅▇▁▅▅▅▃▄▄▅▄▄▅▁▁▄▃▄▃▁▃ █ 1.33 ms Histogram: log(frequency) by time 1.6 ms < Memory estimate: 312.86 KiB, allocs estimate: 8. ```
digitalfilter, Butterworth(3) ```julia julia> hp = Highpass(0.5); bs = Bandstop(0.35, 0.4); bp = Bandpass(0.25, 0.75); julia> butt = Butterworth(3); julia> @benchmark digitalfilter($hp, $butt) # PR BenchmarkTools.Trial: 10000 samples with 451 evaluations. Range (min … max): 230.599 ns … 1.262 μs ┊ GC (min … max): 0.00% … 70.83% Time (median): 234.812 ns ┊ GC (median): 0.00% Time (mean ± σ): 247.525 ns ± 59.115 ns ┊ GC (mean ± σ): 1.25% ± 4.70% ▇█▅▃▂ ▃▂ ▁ ██████▇█████▇▇▇▇▇▇▆██▇▆▆▅▅▅▅▅▆▅▆▅▅▅▅▅▄▅▄▄▄▅▅▄▃▄▄▄▂▂▂▄▆███▇▆▅ █ 231 ns Histogram: log(frequency) by time 405 ns < Memory estimate: 448 bytes, allocs estimate: 4. julia> @benchmark digitalfilter($hp, $butt) # master BenchmarkTools.Trial: 10000 samples with 425 evaluations. Range (min … max): 236.235 ns … 1.230 μs ┊ GC (min … max): 0.00% … 66.54% Time (median): 241.176 ns ┊ GC (median): 0.00% Time (mean ± σ): 251.295 ns ± 54.503 ns ┊ GC (mean ± σ): 1.12% ± 4.49% ▆█▆▅▄▁ ▁▁ ▁ ███████▇▇██▆▆▆▆▅▆▅▅▆▆▆▇▆▅▄▅▄▄▅▅▄▄▄▃▅▃▃▃▄▃▅▄▃▇█▇▆▆▅▄▅▅▆▅▅▅▄▄▅ █ 236 ns Histogram: log(frequency) by time 406 ns < Memory estimate: 448 bytes, allocs estimate: 4. julia> @benchmark digitalfilter($bs, $butt) # PR BenchmarkTools.Trial: 10000 samples with 200 evaluations. Range (min … max): 408.500 ns … 2.172 μs ┊ GC (min … max): 0.00% … 74.48% Time (median): 417.500 ns ┊ GC (median): 0.00% Time (mean ± σ): 435.752 ns ± 90.772 ns ┊ GC (mean ± σ): 0.83% ± 3.73% ▁█▆▃ ▂ ▁ ▁ ████▇█▇▇▆███▆▇▆▄▆▅▅▅▅▅▅▄▄▅▆▆▅▅▆▅▃▄▄▅▆███▆▅▅▅▅▅▄▄▄▆▆▅▅▄▄▄▄▅▄▅ █ 408 ns Histogram: log(frequency) by time 695 ns < Memory estimate: 640 bytes, allocs estimate: 4. julia> @benchmark digitalfilter($bs, $butt) # master BenchmarkTools.Trial: 10000 samples with 199 evaluations. Range (min … max): 416.080 ns … 2.311 μs ┊ GC (min … max): 0.00% … 70.99% Time (median): 426.131 ns ┊ GC (median): 0.00% Time (mean ± σ): 444.156 ns ± 91.548 ns ┊ GC (mean ± σ): 0.82% ± 3.73% ██▅▁▁▁ ▁ ▁ ▁ ████████▆▆▆██▇▇▆▆▆▆▆▇▆▆▄▅▅▅▅▇▆▆▅▅▄▅▃▅▄▅▅▇███▇▆▅▄▂▄▄▅▃▇▆▆▄▃▃▅ █ 416 ns Histogram: log(frequency) by time 681 ns < Memory estimate: 640 bytes, allocs estimate: 4. julia> @benchmark digitalfilter($bp, $butt) # PR BenchmarkTools.Trial: 10000 samples with 222 evaluations. Range (min … max): 334.685 ns … 2.760 μs ┊ GC (min … max): 0.00% … 78.93% Time (median): 340.090 ns ┊ GC (median): 0.00% Time (mean ± σ): 355.924 ns ± 101.835 ns ┊ GC (mean ± σ): 1.31% ± 4.14% ▆█▆▃▁▁▂▁ ▂▂▁ ▁ ███████████████▇▇▇▆▆▆▆▆▆▆▆▆▆▆▅▅▅▅▅▇▇▇▆▆▅▅▄▅▄▄▅▂▄▆▇▇▇▅▄▅▄▄▄▄▂▄ █ 335 ns Histogram: log(frequency) by time 536 ns < Memory estimate: 592 bytes, allocs estimate: 4. julia> @benchmark digitalfilter($bp, $butt) # master BenchmarkTools.Trial: 10000 samples with 217 evaluations. Range (min … max): 336.866 ns … 2.486 μs ┊ GC (min … max): 0.00% … 60.49% Time (median): 345.161 ns ┊ GC (median): 0.00% Time (mean ± σ): 358.933 ns ± 97.166 ns ┊ GC (mean ± σ): 1.19% ± 4.06% ▃██▅▂▂▂ ▁ ▁ ███████▇▆▆██▇▇▆▆▆▅▅▆▆▅▅▅▅▅▅▅▅▅▅▅▅▄▄▅▄▄▄▅▇█▆▆▄▅▄▄▅▅▅▅▅▅▅▄▂▄▆▅ █ 337 ns Histogram: log(frequency) by time 576 ns < Memory estimate: 592 bytes, allocs estimate: 4. ```
digitalfilter, Butterworth(10_000) ```julia julia> butt = Butterworth(10_000); julia> @benchmark digitalfilter($hp, $butt) # PR BenchmarkTools.Trial: 10000 samples with 1 evaluation. Range (min … max): 412.000 μs … 4.995 ms ┊ GC (min … max): 0.00% … 86.93% Time (median): 419.000 μs ┊ GC (median): 0.00% Time (mean ± σ): 493.169 μs ± 195.996 μs ┊ GC (mean ± σ): 3.50% ± 7.83% █▃▃▁▁ ▁▆▄▄▃▂▁▁ ▁ ██████▇██████████▇▇▆▆▅▅▄▅▄▄▄▁▁▁▁▁▁▁▃▁▄▁▁▄▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▃▅▆▆▆ █ 412 μs Histogram: log(frequency) by time 1.45 ms < Memory estimate: 625.19 KiB, allocs estimate: 8. julia> @benchmark digitalfilter($hp, $butt) # master BenchmarkTools.Trial: 10000 samples with 1 evaluation. Range (min … max): 418.300 μs … 4.185 ms ┊ GC (min … max): 0.00% … 82.11% Time (median): 430.400 μs ┊ GC (median): 0.00% Time (mean ± σ): 476.807 μs ± 156.636 μs ┊ GC (mean ± σ): 2.67% ± 6.93% █▆▃▂▂▁▁ ▃▄▂▂▂▁ ▁ ███████████████████▇▆▇▆▇▅▅▅▆▆▆▅▄▁▃▄▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▃▄▁▅▃▆▆▆ █ 418 μs Histogram: log(frequency) by time 1.23 ms < Memory estimate: 625.19 KiB, allocs estimate: 8. julia> @benchmark digitalfilter($bs, $butt) # PR BenchmarkTools.Trial: 4535 samples with 1 evaluation. Range (min … max): 995.800 μs … 4.446 ms ┊ GC (min … max): 0.00% … 65.81% Time (median): 1.006 ms ┊ GC (median): 0.00% Time (mean ± σ): 1.097 ms ± 227.020 μs ┊ GC (mean ± σ): 1.97% ± 5.96% █▄▃▂▂▂▁ ▁▃▂▂ ▁ ████████▇▅▆▆▄▄▅██████▇▇██▇▆▆▇▆▅▅▅▅▄▄▅▄▄▃▃▁▄▁▁▁▃▄▁▃▃▄▁▅▄▆█▆▆▆▅ █ 996 μs Histogram: log(frequency) by time 2.05 ms < Memory estimate: 1.22 MiB, allocs estimate: 8. julia> @benchmark digitalfilter($bs, $butt) # master BenchmarkTools.Trial: 4463 samples with 1 evaluation. Range (min … max): 990.200 μs … 4.490 ms ┊ GC (min … max): 0.00% … 67.51% Time (median): 1.013 ms ┊ GC (median): 0.00% Time (mean ± σ): 1.118 ms ± 250.837 μs ┊ GC (mean ± σ): 2.27% ± 6.52% ██▄▃▃▂▁ ▃▅▃▃▁▁▁ ▁ ▁ ██████████▇▆▄▅██████████▇█▇▇▇▇▅▅▄▅▄▅▃▃▃▃▁▁▅▁▃▃▁▃▁▃▁▃▁▄▄▅▃▇██▇ █ 990 μs Histogram: log(frequency) by time 2.13 ms < Memory estimate: 1.22 MiB, allocs estimate: 8. julia> @benchmark digitalfilter($bp, $butt) # PR BenchmarkTools.Trial: 6012 samples with 1 evaluation. Range (min … max): 753.900 μs … 4.050 ms ┊ GC (min … max): 0.00% … 73.92% Time (median): 761.600 μs ┊ GC (median): 0.00% Time (mean ± σ): 827.660 μs ± 186.852 μs ┊ GC (mean ± σ): 1.99% ± 5.90% █▃▃▂▁▂▁ ▃▁▃▁▁ ▁ ████████▇█▇▇▆▅▅██████▇▇▇▇▇█▇▅▇▅▅▄▄▄▃▁▅▄▅▄▃▅▁▃▁▃▃▃▁▃▄▄▃▅▅▅▇█▆▅ █ 754 μs Histogram: log(frequency) by time 1.61 ms < Memory estimate: 1.07 MiB, allocs estimate: 8. julia> @benchmark digitalfilter($bp, $butt) # master BenchmarkTools.Trial: 6154 samples with 1 evaluation. Range (min … max): 756.500 μs … 4.047 ms ┊ GC (min … max): 0.00% … 73.83% Time (median): 776.400 μs ┊ GC (median): 0.00% Time (mean ± σ): 809.302 μs ± 132.723 μs ┊ GC (mean ± σ): 1.50% ± 5.52% ▃▇█▆▃▃▂▁▁▁▁ ▁▁▁ ▁ ███████████████▇▇▇▆▇▆▅▆▅▅▄▅▄▅▃▁▃▅███▇▇█▇▆▇▇▅▄▄▅▆▅▄▄▅▆▇▇▇▇▇▇▇▇ █ 756 μs Histogram: log(frequency) by time 1.2 ms < Memory estimate: 1.07 MiB, allocs estimate: 8. ```
martinholters commented 5 months ago

FWIW, the @inbounds annotations here do look correct to me, but if the benchmarks don't support that there is a performance improvement, I don't think we should be doing this. It will increase the mental load and the risk of introducing bugs in the future.

wheeheee commented 5 months ago

Actually, I think there's still a potential missed optimisation here. It would probably be fine to not zero out the Vectors in Butterworth etc., which could be helpful to @Jafagervik

Jafagervik commented 5 months ago

Actually, I think there's still a potential missed optimisation here. It would probably be fine to not zero out the Vectors in Butterworth etc., which could be helpful to @Jafagervik

This would help me a lot at least. I'm on julia 1.10.1 right now and I use Butterworth quite a lot

wheeheee commented 5 months ago

I also experimented with using sincospi, inlining cospi, sincospi, cde, and sne, and managed to squeeze out, for larger n, a nearly 100% increase in performance for Butterworth, Chebyshev1, and Chebyshev2, and about 50% for Elliptic, with more modest improvements for small n. However, as these functions are nearly always called with small numbers in the tests, I'm not sure how valuable these are.

julia> T = Float64; n = 10_000;

julia> @benchmark Butterworth($T, $n)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  42.800 μs …  2.819 ms  ┊ GC (min … max): 0.00% … 96.49%
 Time  (median):     43.600 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   58.370 μs ± 67.545 μs  ┊ GC (mean ± σ):  5.95% ±  5.27%

  █▃  ▂                         ▁    ▃▄▃▂▁▁                   ▁
  ██▆███▇▆▇█▇▇▅▄▆▅▆▆▅▄▄▅▄▅▄▅▃▃▄▇██▆▆▆█████████▇▇▇▆▅▆▅▆▄▆▆▆▅▄▄ █
  42.8 μs      Histogram: log(frequency) by time       110 μs <

 Memory estimate: 156.41 KiB, allocs estimate: 5.

julia> @benchmark Chebyshev1($T, $n, $2.1)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  50.200 μs …  2.944 ms  ┊ GC (min … max): 0.00% … 96.18%
 Time  (median):     51.100 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   58.743 μs ± 50.223 μs  ┊ GC (mean ± σ):  4.01% ±  5.00%

  █▇  ▁▂▄                           ▁▂▁      ▁                ▁
  ███▇████▇▇▇▇▇▆▆▄▅▆▆▅▅▆▄▃▃▃▆▅▆▃▅▄▂▇████▇▇▇▇████▇▇▇▇▆▆▅▆▅▅▄▃▄ █
  50.2 μs      Histogram: log(frequency) by time       108 μs <

 Memory estimate: 156.42 KiB, allocs estimate: 6.

julia> @benchmark Chebyshev2($T, $n, $2.1)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  103.800 μs …  3.278 ms  ┊ GC (min … max): 0.00% … 93.43%
 Time  (median):     105.400 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   124.029 μs ± 85.721 μs  ┊ GC (mean ± σ):  4.34% ±  6.55%

  █▃▂▂▂▁                   ▁   ▂▃▂▁                            ▁
  ████████▇█▇▇▆▆▆▆▇▇▆▆▅▅▅▆███▇▇█████▇▇▆▆▇▆▆▆▆▆▆▅▅▅▅▅▅▄▄▃▄▁▄▁▄▅ █
  104 μs        Histogram: log(frequency) by time       268 μs <

 Memory estimate: 312.66 KiB, allocs estimate: 7.

julia> @benchmark Elliptic($T, $n, $1, $2)
BenchmarkTools.Trial: 5771 samples with 1 evaluation.
 Range (min … max):  841.200 μs …  3.936 ms  ┊ GC (min … max): 0.00% … 75.26%
 Time  (median):     845.200 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   863.471 μs ± 83.093 μs  ┊ GC (mean ± σ):  0.60% ±  3.62%

  █▆▃▃▂▁▁    ▁    ▃▂                                           ▁
  █████████████▇▇▇████▇█▇▇▇▅▇▆▆▆▆▆▅▅▄▄▅▃▄▅▅▁▆▄▄▅▄▁▃▃▃▄▁▁▁▃▁▁▁▃ █
  841 μs        Histogram: log(frequency) by time       1.1 ms <

 Memory estimate: 312.86 KiB, allocs estimate: 8.

I pushed it to the cleanup PR, as it looks like a minor optimization. If it should be in its own PR, let me know.

ViralBShah commented 4 months ago

Close (if this is included in the cleanups)?

wheeheee commented 4 months ago

Yeah, if @Jafagervik comes back with benchmarks that show @inbounds is needed, it can be reopened.