Closed: Jafagervik closed this 4 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Comparison is base (4213e9a) 97.56% compared to head (a318e1d) 97.56%.
Is there a meaningful performance benefit?
Yes. I can try to profile where the compiler turns off bounds checking, but this macro is indeed powerful, and since I'm already using my own version of DSP.jl for my master's thesis on AI and signal processing in Julia, I thought I might as well contribute! I will do some more benchmarks as soon as possible, but I'm quite busy these days.
Thank you for the contribution, and good to know that you are already using this in your code.
A quick benchmark in this PR will help serve as documentation for the change. Generally, the preference is not to add `@inbounds` because people have used it incorrectly and aggressively in the past. But if the benefit is clear, it is definitely worthwhile to use it.
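To illustrate the trade-off being discussed: `@inbounds` removes array bounds checks inside the annotated expression, so the code must be provably in-bounds or the result is undefined behaviour. A minimal sketch (hypothetical function, not DSP.jl code):

```julia
# @inbounds skips bounds checks; eachindex guarantees every index is
# valid for v, which is what makes the annotation safe here.
function sum_inbounds(v::AbstractVector{<:Real})
    s = zero(eltype(v))
    @inbounds for i in eachindex(v)
        s += v[i]
    end
    return s
end

sum_inbounds([1.0, 2.0, 3.0])  # 6.0
```

If the index range came from anywhere other than `eachindex`/`axes`, the annotation would be exactly the kind of risk described above: an out-of-bounds write with no error raised.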
@wheeheee The Windows failures seem to be Codecov upload problems. I think the change to fail on Codecov errors is perhaps too aggressive. Would reverting it fix the error here (which is not in the package at all)?
I think it's fine actually; I always expand the failed checks and prefer false positives to silent failures. Didn't comment before since I wanted to benchmark first, but that's for tomorrow...
It's weird, though, that Codecov says their server errors are fixed.
Did some benchmarks for Butterworth, Chebyshev1, Chebyshev2, Elliptic, and digitalfilter. For these functions, I don't see any significant performance differences on Julia 1.10, and similarly on 1.6.
FWIW, the `@inbounds` annotations here do look correct to me, but if the benchmarks don't support that there is a performance improvement, I don't think we should be doing this. It will increase the mental load and the risk of introducing bugs in the future.
Actually, I think there's still a potential missed optimisation here. It would probably be fine to not zero out the `Vector`s in Butterworth etc., which could be helpful to @Jafagervik
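The suggestion above amounts to allocating with `undef` instead of `zeros` when every slot is overwritten anyway. A sketch under assumed names (the pole formula and function names here are illustrative, not the actual DSP.jl internals):

```julia
# Zero-initialising version: pays to fill the buffer with zeros,
# then immediately overwrites every element.
function poles_zeroed(n::Integer)
    p = zeros(ComplexF64, n)
    for i in 1:n
        p[i] = cispi((2i - 1) / (2n))  # exp(im*pi*x), Julia >= 1.6
    end
    return p
end

# undef version: skips the zero fill entirely. Safe only because the
# loop assigns every index before the vector is read.
function poles_undef(n::Integer)
    p = Vector{ComplexF64}(undef, n)
    for i in 1:n
        p[i] = cispi((2i - 1) / (2n))
    end
    return p
end
```

The two return identical results; the `undef` variant just avoids one pass over the memory, which is the kind of saving that only shows up for large `n`.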
This would help me a lot, at least. I'm on Julia 1.10.1 right now and I use Butterworth quite a lot.
I also experimented with using `sincospi`, inlining `cospi`, `sincospi`, `cde`, and `sne`, and managed to squeeze out, for larger `n`, a nearly 100% increase in performance for Butterworth, Chebyshev1, and Chebyshev2, and about 50% for Elliptic, with more modest improvements for small `n`. However, as these functions are nearly always called with small numbers in the tests, I'm not sure how valuable these are.
```julia
julia> T = Float64; n = 10_000;

julia> @benchmark Butterworth($T, $n)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  42.800 μs …  2.819 ms  ┊ GC (min … max): 0.00% … 96.49%
 Time  (median):     43.600 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   58.370 μs ± 67.545 μs  ┊ GC (mean ± σ):  5.95% ±  5.27%
  42.8 μs        Histogram: log(frequency) by time        110 μs <
 Memory estimate: 156.41 KiB, allocs estimate: 5.

julia> @benchmark Chebyshev1($T, $n, $2.1)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  50.200 μs …  2.944 ms  ┊ GC (min … max): 0.00% … 96.18%
 Time  (median):     51.100 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   58.743 μs ± 50.223 μs  ┊ GC (mean ± σ):  4.01% ±  5.00%
  50.2 μs        Histogram: log(frequency) by time        108 μs <
 Memory estimate: 156.42 KiB, allocs estimate: 6.

julia> @benchmark Chebyshev2($T, $n, $2.1)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  103.800 μs …  3.278 ms  ┊ GC (min … max): 0.00% … 93.43%
 Time  (median):     105.400 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   124.029 μs ± 85.721 μs  ┊ GC (mean ± σ):  4.34% ±  6.55%
  104 μs        Histogram: log(frequency) by time        268 μs <
 Memory estimate: 312.66 KiB, allocs estimate: 7.

julia> @benchmark Elliptic($T, $n, $1, $2)
BenchmarkTools.Trial: 5771 samples with 1 evaluation.
 Range (min … max):  841.200 μs …  3.936 ms  ┊ GC (min … max): 0.00% … 75.26%
 Time  (median):     845.200 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   863.471 μs ± 83.093 μs  ┊ GC (mean ± σ):  0.60% ±  3.62%
  841 μs        Histogram: log(frequency) by time        1.1 ms <
 Memory estimate: 312.86 KiB, allocs estimate: 8.
```
I pushed it to the cleanup PR, as it looks like a minor optimization. If it should be in its own PR, let me know.
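For readers unfamiliar with the `sincospi` trick mentioned above: when both `sinpi(x)` and `cospi(x)` of the same argument are needed, `sincospi(x)` (Julia ≥ 1.6) computes them together. A hedged sketch with illustrative names (this is not the PR's actual code):

```julia
# Two separate evaluations of the reduced argument.
function pole_angles_separate(n::Integer)
    [complex(-sinpi((2i - 1) / (2n)), cospi((2i - 1) / (2n))) for i in 1:n]
end

# Fused: one sincospi call yields both values for the same angle.
function pole_angles_fused(n::Integer)
    map(1:n) do i
        s, c = sincospi((2i - 1) / (2n))
        complex(-s, c)
    end
end
```

The benefit scales with how expensive the trig calls are relative to the rest of the loop body, which matches the observation that the gains are largest for big `n`.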
Close (if this is included in the cleanups)?
Yeah, if @Jafagervik comes back with benchmarks that show `@inbounds` is needed, it can be reopened.
Avoid unnecessary bounds checking
Fixes #540