SymbolicML / DynamicExpressions.jl

Ridiculously fast symbolic expressions
https://symbolicml.org/DynamicExpressions.jl/dev
Apache License 2.0

Enforce `Vararg` specialization #60

Closed: MilesCranmer closed this 1 week ago

MilesCranmer commented 5 months ago

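I didn't realize this, but apparently Julia doesn't always specialize functions on a variable number of arguments (`Vararg`): as a heuristic, it can skip specialization when those arguments are only passed through to another function.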

There's a way to get around this, as described in the performance tips: https://docs.julialang.org/en/v1/manual/performance-tips/#Be-aware-of-when-Julia-avoids-specializing

This PR implements this forced specialization, which should hopefully give some speedups.
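
For reference, the workaround in the linked performance tip looks roughly like the following. This is a minimal sketch, not code from this PR: `apply_op` and `apply_op_specialized` are hypothetical names chosen for illustration. The idea is that giving the `Vararg` an explicit length parameter `N` (and the function argument its own type parameter `F`) forces Julia to compile a specialized method instance, even when the arguments are only passed through to another call.

```julia
# Hypothetical helpers for illustration; these names do not come from the package.

# Pass-through varargs: Julia's heuristic may skip specializing on `op` and `args`.
apply_op(op, args...) = op(args...)

# Forced specialization: the type parameters `F` and `N` appear in the signature,
# so a separate method instance is compiled per function type and argument count.
apply_op_specialized(op::F, args::Vararg{Any,N}) where {F,N} = op(args...)

# Both versions compute the same result; only the compiled code may differ.
result_plain = apply_op(+, 1.0, 2.0, 3.0)
result_forced = apply_op_specialized(+, 1.0, 2.0, 3.0)
```

Both definitions behave identically from the caller's side, so the change should only affect performance, not results.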

coveralls commented 5 months ago

Pull Request Test Coverage Report for Build 7394655453


Totals:
Change from base Build 7337169551: 0.0%
Covered Lines: 1269
Relevant Lines: 1346

github-actions[bot] commented 5 months ago

Benchmark Results

| Benchmark | master | a16288c604ab94... | t[master]/t[a16288c604ab94...] |
|---|---|---|---|
| eval/ComplexF32/evaluation | 7.38 ± 0.46 ms | 7.34 ± 0.46 ms | 1.01 |
| eval/ComplexF64/evaluation | 9.65 ± 0.7 ms | 9.58 ± 0.73 ms | 1.01 |
| eval/Float32/derivative | 10.5 ± 1.4 ms | 10.8 ± 1.8 ms | 0.977 |
| eval/Float32/derivative_turbo | 11.9 ± 1.4 ms | 12.2 ± 1.7 ms | 0.981 |
| eval/Float32/evaluation | 2.66 ± 0.22 ms | 2.69 ± 0.23 ms | 0.991 |
| eval/Float32/evaluation_turbo | 0.653 ± 0.034 ms | 0.62 ± 0.027 ms | 1.05 |
| eval/Float64/derivative | 13.8 ± 0.98 ms | 13.7 ± 0.57 ms | 1 |
| eval/Float64/derivative_turbo | 14.4 ± 0.67 ms | 14.3 ± 0.65 ms | 1 |
| eval/Float64/evaluation | 2.81 ± 0.24 ms | 2.83 ± 0.24 ms | 0.992 |
| eval/Float64/evaluation_turbo | 1.12 ± 0.061 ms | 1.11 ± 0.06 ms | 1.01 |
| utils/combine_operators/break_sharing | 0.0405 ± 0.0027 ms | 0.0409 ± 0.0027 ms | 0.989 |
| utils/convert/break_sharing | 28.5 ± 0.61 μs | 28.1 ± 0.61 μs | 1.01 |
| utils/convert/preserve_sharing | 0.13 ± 0.0035 ms | 0.128 ± 0.0028 ms | 1.02 |
| utils/copy/break_sharing | 29.8 ± 1.5 μs | 28.8 ± 0.62 μs | 1.04 |
| utils/copy/preserve_sharing | 0.129 ± 0.0032 ms | 0.129 ± 0.0027 ms | 0.999 |
| utils/count_constants/break_sharing | 10.3 ± 0.22 μs | 10.3 ± 0.16 μs | 1 |
| utils/count_constants/preserve_sharing | 0.114 ± 0.0026 ms | 0.112 ± 0.0025 ms | 1.02 |
| utils/count_depth/break_sharing | 17.4 ± 0.39 μs | 17.4 ± 0.37 μs | 1 |
| utils/count_nodes/break_sharing | 10.2 ± 0.16 μs | 10.2 ± 0.16 μs | 1 |
| utils/count_nodes/preserve_sharing | 0.118 ± 0.003 ms | 0.114 ± 0.0026 ms | 1.03 |
| utils/get_set_constants!/break_sharing | 0.0533 ± 0.00083 ms | 0.0529 ± 0.00077 ms | 1.01 |
| utils/get_set_constants!/preserve_sharing | 0.324 ± 0.0072 ms | 0.32 ± 0.0053 ms | 1.01 |
| utils/has_constants/break_sharing | 4.52 ± 0.21 μs | 4.49 ± 0.21 μs | 1.01 |
| utils/has_operators/break_sharing | 1.93 ± 0.018 μs | 1.94 ± 0.016 μs | 0.998 |
| utils/hash/break_sharing | 30.1 ± 0.46 μs | 30.2 ± 0.42 μs | 0.996 |
| utils/hash/preserve_sharing | 0.133 ± 0.0029 ms | 0.131 ± 0.0024 ms | 1.01 |
| utils/index_constants/break_sharing | 28.2 ± 0.64 μs | 27.7 ± 0.65 μs | 1.02 |
| utils/index_constants/preserve_sharing | 0.127 ± 0.0031 ms | 0.133 ± 0.0025 ms | 0.955 |
| utils/is_constant/break_sharing | 4.35 ± 0.21 μs | 4.36 ± 0.21 μs | 0.997 |
| utils/simplify_tree/break_sharing | 0.248 ± 0.02 ms | 0.17 ± 0.015 ms | 1.46 |
| utils/simplify_tree/preserve_sharing | 0.377 ± 0.023 ms | 0.286 ± 0.016 ms | 1.32 |
| utils/string_tree/break_sharing | 0.579 ± 0.015 ms | 0.569 ± 0.016 ms | 1.02 |
| utils/string_tree/preserve_sharing | 0.721 ± 0.022 ms | 0.706 ± 0.017 ms | 1.02 |
| time_to_load | 0.676 ± 0.014 s | 0.681 ± 0.0078 s | 0.993 |
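
As a reading aid for the ratio column: it is the master timing divided by the PR-branch timing, so values above 1 mean the PR branch is faster. The sketch below recomputes the ratio for the two `simplify_tree` rows, the only benchmarks that move noticeably. The timings are copied from the results above and the ± uncertainties are ignored; the variable names are illustrative.

```julia
# Timings in ms copied from the benchmark results: (master, PR branch).
timings = Dict(
    "utils/simplify_tree/break_sharing" => (0.248, 0.17),
    "utils/simplify_tree/preserve_sharing" => (0.377, 0.286),
)

# Ratio t[master] / t[PR branch], rounded to three significant digits.
ratios = Dict(
    name => round(t_master / t_pr; sigdigits=3)
    for (name, (t_master, t_pr)) in timings
)

for (name, ratio) in ratios
    println(name, ": ", ratio)
end
```

The recomputed values match the reported 1.46 and 1.32, while most other rows stay within a few percent of 1.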