Faster NaN checks

MilesCranmer commented 1 year ago

Thanks to @mikmoore for the approach (https://discourse.julialang.org/t/fastest-way-to-check-for-inf-or-nan-in-an-array/76954/33?u=milescranmer)
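For readers unfamiliar with the idea, here is a minimal sketch of one common branch-free way to test an array for NaN or Inf. It is not the code from this PR; the function names `has_bad_naive` and `has_bad_sum` are hypothetical. The trick relies on IEEE arithmetic: multiplying by zero maps every finite value to zero and every NaN or ±Inf to NaN, so the sum is finite exactly when all elements are finite.

```julia
# Hypothetical helper names for illustration; not the implementation in this PR.

# Baseline: short-circuiting scan with one branch per element.
has_bad_naive(x::AbstractArray{<:AbstractFloat}) = any(!isfinite, x)

# Branch-free variant: v * 0 is 0 for finite v and NaN for NaN or ±Inf,
# so the reduction is finite exactly when every element is finite.
# Multiplying by zero also avoids overflow when summing large finite values.
has_bad_sum(x::AbstractArray{T}) where {T<:AbstractFloat} =
    !isfinite(sum(v -> v * zero(T), x; init=zero(T)))
```

The reduction never exits early, so it does more work than `any` when a bad value sits near the front of the array, but it has no per-element branch and may therefore vectorize more readily.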

coveralls commented 1 year ago

Pull Request Test Coverage Report for Build 5032237326

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

5 of 5 (100.0%) changed or added relevant lines in 1 file are covered.
1 unchanged line in 1 file lost coverage.
Overall coverage increased (+5.6%) to 94.043%

Files with Coverage Reduction	New Missed Lines	%
src/Utils.jl	1	75.51%
Total:	1	

Totals
Change from base Build 4958507899:	5.6%
Covered Lines:	963
Relevant Lines:	1024

💛 - Coveralls

github-actions[bot] commented 1 year ago

Benchmark Results

benchmark	master	643e7fdc38fff3...	t[master]/t[643e7fdc38fff3...]
eval/derivative/Float32/100/standard	2.35 ± 0.078 ms	2.35 ± 0.2 ms	0.998
eval/derivative/Float32/100/turbo	2.47 ± 0.23 ms	2.28 ± 0.19 ms	1.08
eval/derivative/Float32/1000/standard	17.2 ± 0.8 ms	16.9 ± 1.1 ms	1.02
eval/derivative/Float32/1000/turbo	18.7 ± 1.4 ms	16.7 ± 0.93 ms	1.12
eval/derivative/Float32/10000/standard	0.163 ± 0.0048 s	0.163 ± 0.0053 s	1
eval/derivative/Float32/10000/turbo	0.174 ± 0.0055 s	0.163 ± 0.0047 s	1.07
eval/evaluation/ComplexF32/1000/standard	10.1 ± 0.72 ms	9.77 ± 0.82 ms	1.04
eval/evaluation/ComplexF64/1000/standard	12.4 ± 0.9 ms	12.6 ± 0.92 ms	0.984
eval/evaluation/Float32/100/standard	0.539 ± 0.041 ms	0.539 ± 0.038 ms	0.999
eval/evaluation/Float32/100/turbo	0.31 ± 0.046 ms	0.333 ± 0.047 ms	0.929
eval/evaluation/Float32/1000/standard	3.46 ± 0.26 ms	3.64 ± 0.36 ms	0.951
eval/evaluation/Float32/1000/turbo	0.885 ± 0.091 ms	0.844 ± 0.073 ms	1.05
eval/evaluation/Float32/10000/standard	31.4 ± 2.7 ms	33.1 ± 3.1 ms	0.95
eval/evaluation/Float32/10000/turbo	5.63 ± 0.46 ms	5.12 ± 0.37 ms	1.1
eval/evaluation/Float64/1000/standard	4.23 ± 0.32 ms	4.4 ± 0.42 ms	0.962
eval/evaluation/Float64/1000/turbo	1.42 ± 0.12 ms	1.32 ± 0.11 ms	1.07
time_to_load	14.4 ± 0.018 s	14.6 ± 0.16 s	0.989
utils/extra/is_bad_array_x16/50	0.283 ± 0.0066 μs	0.13 ± 0.016 μs	2.18
utils/extra/is_bad_array_x16/500	1.11 ± 0.02 μs	0.745 ± 0.085 μs	1.49
utils/extra/is_bad_array_x16/5000	18.6 ± 2.7 μs	6.69 ± 0.96 μs	2.78
utils/trees/combine_operators/break_sharing	0.0619 ± 0.0047 ms	0.0616 ± 0.0048 ms	1
utils/trees/convert/break_sharing	0.0561 ± 0.0087 ms	0.0552 ± 0.0071 ms	1.02
utils/trees/convert/preserve_sharing	0.239 ± 0.013 ms	0.24 ± 0.014 ms	0.998
utils/trees/copy/break_sharing	0.0518 ± 0.014 ms	0.0523 ± 0.0086 ms	0.991
utils/trees/copy/preserve_sharing	0.236 ± 0.019 ms	0.239 ± 0.014 ms	0.987
utils/trees/count_constants/break_sharing	27.8 ± 1.9 μs	27.9 ± 1.4 μs	0.996
utils/trees/count_depth/break_sharing	27.4 ± 3.7 μs	27.7 ± 1.7 μs	0.989
utils/trees/count_nodes/break_sharing	26 ± 1.7 μs	25.6 ± 1.4 μs	1.02
utils/trees/get_set_constants!/break_sharing	0.105 ± 0.0048 ms	0.108 ± 0.0046 ms	0.977
utils/trees/has_constants/break_sharing	9.08 ± 0.77 μs	9.45 ± 0.78 μs	0.961
utils/trees/has_operators/break_sharing	2.36 ± 0.1 μs	2.33 ± 0.1 μs	1.01
utils/trees/index_constants/break_sharing	0.0827 ± 0.0045 ms	0.0835 ± 0.0047 ms	0.99
utils/trees/is_constant/break_sharing	8.65 ± 0.78 μs	8.45 ± 0.7 μs	1.02
utils/trees/simplify_tree/break_sharing	0.17 ± 0.019 ms	0.172 ± 0.018 ms	0.986

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR. Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

MilesCranmer commented 1 year ago

Hm, probably worth adding an evaluation benchmark with a larger n. The speedup doesn't seem to be noticeable in the evaluation benchmarks at the current sizes.

MilesCranmer commented 1 year ago

Maybe I should create a case-switch over array size, and get the optimal unroll length accordingly.
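A rough sketch of what such a size-based switch could look like, assuming a chunked check whose chunk length is passed as a `Val`. The names `has_bad_chunked` and `has_bad`, and the thresholds 64 and 1024, are placeholders chosen for illustration; they are not measured optima and not taken from this PR.

```julia
# Hypothetical sketch: chunked NaN/Inf check with a size-dependent chunk length.

# Accumulate N elements at a time without branching, then test once per chunk.
function has_bad_chunked(x::AbstractVector{T}, ::Val{N}) where {T<:AbstractFloat,N}
    i = firstindex(x)
    stop = lastindex(x) - N + 1
    @inbounds while i <= stop
        acc = zero(T)
        for j in 0:(N - 1)  # fixed trip count, which the compiler can unroll
            acc += x[i + j] * zero(T)  # 0 for finite values, NaN otherwise
        end
        isfinite(acc) || return true  # early exit between chunks
        i += N
    end
    # Remainder that does not fill a whole chunk.
    @inbounds while i <= lastindex(x)
        isfinite(x[i]) || return true
        i += 1
    end
    return false
end

# Pick the chunk length from the array size. Thresholds are assumptions.
function has_bad(x::AbstractVector{<:AbstractFloat})
    n = length(x)
    n < 64 && return has_bad_chunked(x, Val(4))
    n < 1024 && return has_bad_chunked(x, Val(16))
    return has_bad_chunked(x, Val(64))
end
```

Passing the chunk length as a type parameter keeps the inner loop bound a compile-time constant. The right thresholds would have to come from benchmarks on the target hardware.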

MilesCranmer commented 1 year ago

I seem to be getting LoopVectorization.jl segfaults on Windows. It should probably be disabled there.
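One way to gate a fast path by platform is a constant set from `Sys.iswindows()`, sketched below. The flag `USE_FAST_PATH` and the functions `check_fast`, `check_safe`, and `has_bad_portable` are hypothetical, and `check_fast` is only a plain-Julia stand-in for a LoopVectorization.jl-based kernel, not that kernel itself.

```julia
# Hypothetical names for illustration; not the code from this PR.

# Decided once at load time: skip the fast path on Windows.
const USE_FAST_PATH = !Sys.iswindows()

# Stand-in for a LoopVectorization.jl-based kernel (assumption).
check_fast(x) = !isfinite(sum(v -> v * zero(v), x; init=zero(eltype(x))))

# Plain fallback that uses only Base.
check_safe(x) = any(!isfinite, x)

has_bad_portable(x::AbstractArray{<:AbstractFloat}) =
    USE_FAST_PATH ? check_fast(x) : check_safe(x)
```

Because the flag is a `const`, the compiler can drop the unused branch on each platform.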

MilesCranmer commented 1 month ago

Closing as LoopVectorization is deprecated.

SymbolicML / DynamicExpressions.jl

Faster NaN checks #34
