SymbolicML / DynamicExpressions.jl

Ridiculously fast symbolic expressions
https://symbolicml.org/DynamicExpressions.jl/dev
Apache License 2.0

Faster NaN checks #34

Closed: MilesCranmer closed this pull request 1 month ago

MilesCranmer commented 1 year ago

Thanks to @mikmoore (https://discourse.julialang.org/t/fastest-way-to-check-for-inf-or-nan-in-an-array/76954/33?u=milescranmer)
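
For context, one common branch-free way to test an array for NaN or Inf is to multiply every element by zero and accumulate the results: finite values contribute 0, while NaN and ±Inf contribute NaN, so the sum is non-finite exactly when a bad value is present. The following is a minimal sketch of that general idea in Python, for illustration only. It is an assumption about the kind of technique discussed, not code taken from this PR or from the linked post, and the helper name is hypothetical (borrowed from the `is_bad_array` benchmark label).

```python
import math


def is_bad_array(values):
    """Return True if any element is NaN or +/-Inf (hypothetical helper)."""
    acc = 0.0
    for v in values:
        # finite * 0.0 -> 0.0 (or -0.0); NaN * 0.0 -> NaN; Inf * 0.0 -> NaN
        acc += v * 0.0
    # Once NaN enters the accumulator it stays NaN, so one check at the end suffices.
    return not math.isfinite(acc)
```

The loop body has no data-dependent branch, which is the property that makes this pattern attractive for vectorization in compiled languages; the Python version only shows the logic.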

coveralls commented 1 year ago

Pull Request Test Coverage Report for Build 5032237326

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

Files with Coverage Reduction    New Missed Lines    %
src/Utils.jl                     1                   75.51%

Totals (Coverage Status)
Change from base Build 4958507899: 5.6%
Covered Lines: 963
Relevant Lines: 1024

💛 - Coveralls

github-actions[bot] commented 1 year ago

Benchmark Results

Benchmark                                     master                643e7fdc38fff3...     t[master]/t[643e7fdc38fff3...]
eval/derivative/Float32/100/standard          2.35 ± 0.078 ms       2.35 ± 0.2 ms         0.998
eval/derivative/Float32/100/turbo             2.47 ± 0.23 ms        2.28 ± 0.19 ms        1.08
eval/derivative/Float32/1000/standard         17.2 ± 0.8 ms         16.9 ± 1.1 ms         1.02
eval/derivative/Float32/1000/turbo            18.7 ± 1.4 ms         16.7 ± 0.93 ms        1.12
eval/derivative/Float32/10000/standard        0.163 ± 0.0048 s      0.163 ± 0.0053 s      1
eval/derivative/Float32/10000/turbo           0.174 ± 0.0055 s      0.163 ± 0.0047 s      1.07
eval/evaluation/ComplexF32/1000/standard      10.1 ± 0.72 ms        9.77 ± 0.82 ms        1.04
eval/evaluation/ComplexF64/1000/standard      12.4 ± 0.9 ms         12.6 ± 0.92 ms        0.984
eval/evaluation/Float32/100/standard          0.539 ± 0.041 ms      0.539 ± 0.038 ms      0.999
eval/evaluation/Float32/100/turbo             0.31 ± 0.046 ms       0.333 ± 0.047 ms      0.929
eval/evaluation/Float32/1000/standard         3.46 ± 0.26 ms        3.64 ± 0.36 ms        0.951
eval/evaluation/Float32/1000/turbo            0.885 ± 0.091 ms      0.844 ± 0.073 ms      1.05
eval/evaluation/Float32/10000/standard        31.4 ± 2.7 ms         33.1 ± 3.1 ms         0.95
eval/evaluation/Float32/10000/turbo           5.63 ± 0.46 ms        5.12 ± 0.37 ms        1.1
eval/evaluation/Float64/1000/standard         4.23 ± 0.32 ms        4.4 ± 0.42 ms         0.962
eval/evaluation/Float64/1000/turbo            1.42 ± 0.12 ms        1.32 ± 0.11 ms        1.07
time_to_load                                  14.4 ± 0.018 s        14.6 ± 0.16 s         0.989
utils/extra/is_bad_array_x16/50               0.283 ± 0.0066 μs     0.13 ± 0.016 μs       2.18
utils/extra/is_bad_array_x16/500              1.11 ± 0.02 μs        0.745 ± 0.085 μs      1.49
utils/extra/is_bad_array_x16/5000             18.6 ± 2.7 μs         6.69 ± 0.96 μs        2.78
utils/trees/combine_operators/break_sharing   0.0619 ± 0.0047 ms    0.0616 ± 0.0048 ms    1
utils/trees/convert/break_sharing             0.0561 ± 0.0087 ms    0.0552 ± 0.0071 ms    1.02
utils/trees/convert/preserve_sharing          0.239 ± 0.013 ms      0.24 ± 0.014 ms       0.998
utils/trees/copy/break_sharing                0.0518 ± 0.014 ms     0.0523 ± 0.0086 ms    0.991
utils/trees/copy/preserve_sharing             0.236 ± 0.019 ms      0.239 ± 0.014 ms      0.987
utils/trees/count_constants/break_sharing     27.8 ± 1.9 μs         27.9 ± 1.4 μs         0.996
utils/trees/count_depth/break_sharing         27.4 ± 3.7 μs         27.7 ± 1.7 μs         0.989
utils/trees/count_nodes/break_sharing         26 ± 1.7 μs           25.6 ± 1.4 μs         1.02
utils/trees/get_set_constants!/break_sharing  0.105 ± 0.0048 ms     0.108 ± 0.0046 ms     0.977
utils/trees/has_constants/break_sharing       9.08 ± 0.77 μs        9.45 ± 0.78 μs        0.961
utils/trees/has_operators/break_sharing       2.36 ± 0.1 μs         2.33 ± 0.1 μs         1.01
utils/trees/index_constants/break_sharing     0.0827 ± 0.0045 ms    0.0835 ± 0.0047 ms    0.99
utils/trees/is_constant/break_sharing         8.65 ± 0.78 μs        8.45 ± 0.7 μs         1.02
utils/trees/simplify_tree/break_sharing       0.17 ± 0.019 ms       0.172 ± 0.018 ms      0.986
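
The last column is the ratio t[master]/t[643e7fdc38fff3...], so values above 1 mean the PR commit is faster. As a quick sanity check, the ratio can be recomputed from the rounded timings of the three `is_bad_array_x16` rows. This is a small illustrative sketch in Python; the variable and function names are hypothetical, and because the bot presumably divides unrounded timings, agreement is only expected to within rounding.

```python
# (master time, PR time) in microseconds, copied from the is_bad_array_x16 rows
timings_us = {
    50: (0.283, 0.13),
    500: (1.11, 0.745),
    5000: (18.6, 6.69),
}


def speedup(master_time, pr_time):
    """Ratio t[master]/t[PR]; values above 1 mean the PR is faster."""
    return master_time / pr_time


ratios = {n: speedup(*pair) for n, pair in timings_us.items()}
for n, r in ratios.items():
    print(f"n={n}: {r:.2f}x")
```

These three rows are the only ones where the ratio is well outside the reported measurement noise; the `eval/...` rows all stay close to 1.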

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR. Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

MilesCranmer commented 1 year ago

Hm, it's probably worth adding an evaluation benchmark with a larger n. The change doesn't seem to be noticeable at the current sizes.

MilesCranmer commented 1 year ago

Maybe I should create a case-switch over array size, and get the optimal unroll length accordingly.
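
A rough sketch of what such a size-based switch could look like, written in Python purely to show the control flow. The thresholds, unroll lengths, and function names below are all hypothetical placeholders; real values would have to come from benchmarking, and none of this is taken from the PR itself.

```python
import math


def pick_unroll(n):
    """Choose an unroll length from the array size (placeholder thresholds)."""
    if n < 64:
        return 4
    if n < 1024:
        return 8
    return 16


def is_bad_array_chunked(values):
    """Return True if any element is NaN or +/-Inf, using `unroll` accumulators."""
    n = len(values)
    unroll = pick_unroll(n)
    accs = [0.0] * unroll          # one independent accumulator per lane
    full = n - n % unroll          # largest multiple of `unroll` not exceeding n
    for start in range(0, full, unroll):
        for lane in range(unroll):
            accs[lane] += values[start + lane] * 0.0
    tail = 0.0
    for i in range(full, n):       # leftover elements that do not fill a chunk
        tail += values[i] * 0.0
    # NaN in any lane (or the tail) propagates through the final sum.
    return not math.isfinite(sum(accs) + tail)
```

In Python the chunking brings no speedup; the point is only the structure: independent accumulators per lane, a remainder loop, and a single finiteness check at the end.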

MilesCranmer commented 1 year ago

I seem to be getting LoopVectorization.jl segfaults on Windows. Should probably disable it there.

MilesCranmer commented 1 month ago

Closing as LoopVectorization is deprecated.