Closed MilesCranmer closed 4 months ago

Benchmark | master | 15844adc4046fa... | t[master]/t[15844adc4046fa...] |
---|---|---|---|
eval/ComplexF32/evaluation | 7.42 ± 0.5 ms | 7.46 ± 0.47 ms | 0.995 |
eval/ComplexF64/evaluation | 9.68 ± 0.73 ms | 9.64 ± 0.69 ms | 1 |
eval/Float32/derivative | 10.8 ± 1.6 ms | 10.8 ± 1.6 ms | 1 |
eval/Float32/derivative_turbo | 10.9 ± 1.8 ms | 10.8 ± 1.6 ms | 1.01 |
eval/Float32/evaluation | 2.72 ± 0.22 ms | 2.74 ± 0.22 ms | 0.992 |
eval/Float32/evaluation_turbo | 0.713 ± 0.033 ms | 0.702 ± 0.03 ms | 1.02 |
eval/Float64/derivative | 14.4 ± 0.78 ms | 13.7 ± 0.57 ms | 1.05 |
eval/Float64/derivative_turbo | 14.8 ± 0.68 ms | 13.6 ± 0.59 ms | 1.09 |
eval/Float64/evaluation | 2.93 ± 0.24 ms | 2.89 ± 0.23 ms | 1.01 |
eval/Float64/evaluation_turbo | 1.2 ± 0.064 ms | 1.18 ± 0.058 ms | 1.02 |
utils/combine_operators/break_sharing | 0.0493 ± 0.0027 ms | 0.0498 ± 0.0029 ms | 0.988 |
utils/convert/break_sharing | 28.2 ± 1 μs | 28 ± 0.9 μs | 1 |
utils/convert/preserve_sharing | 0.13 ± 0.0044 ms | 0.127 ± 0.0029 ms | 1.02 |
utils/copy/break_sharing | 28.9 ± 1.1 μs | 28.5 ± 0.88 μs | 1.01 |
utils/copy/preserve_sharing | 0.128 ± 0.0034 ms | 0.127 ± 0.003 ms | 1.01 |
utils/count_constants/break_sharing | 10.9 ± 0.16 μs | 10.5 ± 0.15 μs | 1.03 |
utils/count_constants/preserve_sharing | 0.114 ± 0.003 ms | 0.112 ± 0.0023 ms | 1.02 |
utils/count_depth/break_sharing | 17.3 ± 0.38 μs | 17.9 ± 0.39 μs | 0.97 |
utils/count_nodes/break_sharing | 10.2 ± 0.15 μs | 10.2 ± 0.15 μs | 1 |
utils/count_nodes/preserve_sharing | 0.116 ± 0.0029 ms | 0.116 ± 0.0022 ms | 1.01 |
utils/get_set_constants!/break_sharing | 0.0524 ± 0.0008 ms | 0.0526 ± 0.0008 ms | 0.997 |
utils/get_set_constants!/preserve_sharing | 0.321 ± 0.0067 ms | 0.324 ± 0.0059 ms | 0.992 |
utils/has_constants/break_sharing | 4.42 ± 0.22 μs | 4.3 ± 0.22 μs | 1.03 |
utils/has_operators/break_sharing | 1.77 ± 0.022 μs | 1.77 ± 0.023 μs | 1 |
utils/hash/break_sharing | 30 ± 0.44 μs | 30 ± 0.43 μs | 1 |
utils/hash/preserve_sharing | 0.133 ± 0.0025 ms | 0.132 ± 0.0024 ms | 1.01 |
utils/index_constants/break_sharing | 27.9 ± 0.79 μs | 27.3 ± 0.71 μs | 1.02 |
utils/index_constants/preserve_sharing | 0.129 ± 0.0032 ms | 0.128 ± 0.0029 ms | 1.01 |
utils/is_constant/break_sharing | 4.88 ± 0.23 μs | 4.76 ± 0.23 μs | 1.02 |
utils/simplify_tree/break_sharing | 0.18 ± 0.016 ms | 0.173 ± 0.015 ms | 1.04 |
utils/simplify_tree/preserve_sharing | 0.295 ± 0.018 ms | 0.368 ± 0.017 ms | 0.801 |
utils/string_tree/break_sharing | 0.507 ± 0.017 ms | 0.495 ± 0.012 ms | 1.02 |
utils/string_tree/preserve_sharing | 0.661 ± 0.02 ms | 0.642 ± 0.017 ms | 1.03 |
eval/Float32/evaluation_turbo_bumper | 0.529 ± 0.013 ms | | |
eval/Float64/evaluation_bumper | 1.21 ± 0.044 ms | | |
eval/Float64/evaluation_turbo_bumper | 1.21 ± 0.041 ms | | |
eval/Float32/evaluation_bumper | 0.529 ± 0.012 ms | | |
time_to_load | 0.668 ± 0.017 s | 0.175 ± 0.0016 s | 3.81 |

Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
---|---|---|---|
src/EquationUtils.jl | 16 | 17 | 94.12% |
ext/DynamicExpressionsLoopVectorizationExt.jl | 139 | 144 | 96.53% |
<!-- | Total: | 232 | 238 | 97.48% | --> |

Totals | |
---|---|
Change from base Build 7688222919 | 0.2% |
Covered Lines | 1579 |
Relevant Lines | 1669 |

Expression evaluation is pretty heavy on the garbage collector, so I am exploring Bumper.jl. Initial tests show a speedup from this, even for simpler equations.
We can work out the total number of allocations in advance, so I wonder how much we could gain from using a fixed-size buffer here.
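
To make the fixed-size buffer idea concrete, here is a minimal sketch in plain Julia. It is not the DynamicExpressions.jl evaluator and it does not use Bumper.jl: `ToyNode`, `Feature`, `Constant`, `BinaryOp`, `FixedBuffer`, `count_buffers`, `next_column!`, `reset!`, and `eval_into!` are all hypothetical names invented for illustration, and it assumes a naive evaluator that needs one scratch vector per node. Under that assumption, one pass over the tree gives the number of scratch vectors, a single matrix is allocated up front, and evaluation hands out its columns in order.

```julia
# Toy expression tree. These are illustrative types, NOT DynamicExpressions.jl's Node.
abstract type ToyNode end

struct Feature <: ToyNode
    index::Int              # row of X to read
end

struct Constant <: ToyNode
    value::Float64
end

struct BinaryOp{F} <: ToyNode
    op::F
    left::ToyNode
    right::ToyNode
end

# Assumption: the evaluator needs exactly one scratch vector per node.
count_buffers(::Feature) = 1
count_buffers(::Constant) = 1
count_buffers(n::BinaryOp) = 1 + count_buffers(n.left) + count_buffers(n.right)

# One preallocated matrix; each column serves as one scratch vector.
mutable struct FixedBuffer
    data::Matrix{Float64}
    next::Int               # index of the next free column
end

make_buffer(len::Int, n::Int) = FixedBuffer(Matrix{Float64}(undef, len, n), 1)

# Hand out the next free column as a view (no new array is allocated).
function next_column!(buf::FixedBuffer)
    buf.next <= size(buf.data, 2) || error("buffer exhausted")
    col = view(buf.data, :, buf.next)
    buf.next += 1
    return col
end

# Mark every column as free again so the buffer can be reused.
reset!(buf::FixedBuffer) = (buf.next = 1; buf)

function eval_into!(buf::FixedBuffer, node::Feature, X::AbstractMatrix)
    out = next_column!(buf)
    out .= view(X, node.index, :)
    return out
end

function eval_into!(buf::FixedBuffer, node::Constant, X::AbstractMatrix)
    out = next_column!(buf)
    fill!(out, node.value)
    return out
end

function eval_into!(buf::FixedBuffer, node::BinaryOp, X::AbstractMatrix)
    l = eval_into!(buf, node.left, X)
    r = eval_into!(buf, node.right, X)
    out = next_column!(buf)
    out .= node.op.(l, r)
    return out
end

# Usage: X has shape (features, rows); the tree encodes x1 * x2 + 0.5.
X = [1.0 2.0 3.0;
     4.0 5.0 6.0]
tree = BinaryOp(+, BinaryOp(*, Feature(1), Feature(2)), Constant(0.5))

buf = make_buffer(size(X, 2), count_buffers(tree))
result = eval_into!(buf, tree, X)   # a view into buf.data; copy it to keep it

# Reuse the same memory for a second evaluation.
reset!(buf)
result_again = eval_into!(buf, tree, X)
```

The returned vector is a view into the buffer, so it is overwritten by the next evaluation after `reset!`; a caller that wants to keep it would copy it out first. A real evaluator could likely use fewer scratch vectors by writing operator results in place over a child's vector, which would only change `count_buffers`.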
TODO: