JuliaAlgebra / StaticPolynomials.jl

Fast evaluation of multivariate polynomials
https://juliaalgebra.github.io/StaticPolynomials.jl/latest/

Faster gradient #10

Status: Closed (saschatimme closed this 6 years ago)

saschatimme commented 6 years ago

I noticed an inefficiency in the gradient routine: it could happen (and did) that we evaluated polynomials whose leading coefficient was 0.
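
To illustrate the kind of waste described above, here is a minimal univariate sketch written for Julia 1.x. It is not the package's generated code: `trim_leading_zeros` and `horner` are hypothetical names, and the real gradient routine is multivariate. The point is only that a zero leading coefficient costs a multiply-add per extra degree in a Horner scheme without changing the result.

```julia
# Hypothetical helper: drop zero coefficients at the high-degree end.
# Coefficients are ordered from lowest to highest degree.
function trim_leading_zeros(coeffs::AbstractVector)
    n = length(coeffs)
    while n > 1 && iszero(coeffs[n])
        n -= 1
    end
    return coeffs[1:n]
end

# Plain Horner evaluation: one multiply-add per coefficient after the first.
function horner(coeffs::AbstractVector, x)
    acc = coeffs[end] * one(x)
    for i in (length(coeffs) - 1):-1:1
        acc = acc * x + coeffs[i]
    end
    return acc
end

coeffs  = [1.0, 2.0, 3.0, 0.0, 0.0]    # 1 + 2x + 3x^2 + 0x^3 + 0x^4
trimmed = trim_leading_zeros(coeffs)   # [1.0, 2.0, 3.0]

# Same value, but the trimmed version does two fewer multiply-adds.
println(horner(coeffs, 2.0))    # 17.0
println(horner(trimmed, 2.0))   # 17.0
```

In generated code the same idea would apply at code-generation time, so the skipped terms never appear in the emitted expression at all.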

saschatimme commented 6 years ago

Here is a benchmark:

Benchmark Report for StaticPolynomials

Results

A ratio greater than 1.0 denotes a possible regression (marked with :x:), while a ratio less than 1.0 denotes a possible improvement (marked with :white_check_mark:). Only significant results - results that indicate possible regressions or improvements - are shown below (thus, an empty table means that all benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|----|------------|--------------|
| ["evaluate", "Float64", "chandra4"] | 1.18 (5%) :x: | 1.00 (1%) |
| ["evaluate", "Float64", "katsura5"] | 0.84 (5%) :white_check_mark: | 1.00 (1%) |
| ["evaluate", "Float64", "katsura7"] | 1.10 (5%) :x: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "chandra4"] | 0.94 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "chandra5"] | 0.92 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "cyclic5"] | 0.92 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "cyclic6"] | 0.90 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "cyclic7"] | 0.95 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "cyclic8"] | 0.91 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "katsura10"] | 0.69 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "katsura5"] | 0.81 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "katsura6"] | 0.76 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "katsura7"] | 0.72 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "katsura8"] | 0.69 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "katsura9"] | 0.69 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Complex{Float64}", "rps10"] | 0.74 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Float64", "chandra5"] | 0.87 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Float64", "cyclic7"] | 0.93 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Float64", "cyclic8"] | 0.89 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Float64", "fourbar"] | 0.91 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Float64", "katsura10"] | 0.85 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Float64", "katsura7"] | 0.90 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Float64", "katsura8"] | 0.93 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Float64", "katsura9"] | 0.85 (5%) :white_check_mark: | 1.00 (1%) |
| ["jacobian", "Float64", "rps10"] | 1.22 (5%) :x: | 1.00 (1%) |
| ["static jacobian", "Complex{Float64}", "chandra5"] | 0.94 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Complex{Float64}", "cyclic5"] | 0.95 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Complex{Float64}", "fourbar"] | 0.93 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Complex{Float64}", "katsura10"] | 0.75 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Complex{Float64}", "katsura5"] | 0.84 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Complex{Float64}", "katsura6"] | 0.81 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Complex{Float64}", "katsura7"] | 0.89 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Complex{Float64}", "katsura8"] | 0.92 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Complex{Float64}", "katsura9"] | 0.90 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Complex{Float64}", "rps10"] | 0.72 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Float64", "chandra5"] | 0.91 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Float64", "cyclic5"] | 0.92 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Float64", "cyclic7"] | 0.90 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Float64", "cyclic8"] | 0.88 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Float64", "fourbar"] | 1.07 (5%) :x: | 1.00 (1%) |
| ["static jacobian", "Float64", "katsura5"] | 0.93 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Float64", "katsura6"] | 0.85 (5%) :white_check_mark: | 1.00 (1%) |
| ["static jacobian", "Float64", "katsura9"] | 0.91 (5%) :white_check_mark: | 1.00 (1%) |
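
The marks in the time ratio column can be reproduced with a small sketch (Julia 1.x). It assumes that the percentage in parentheses is a symmetric tolerance around 1.0, which is consistent with the rows shown; `classify` is a hypothetical name, not a function of the benchmarking tool that produced the report.

```julia
# Hypothetical classifier for a time (or memory) ratio.
# `tolerance` is assumed to be the value shown in parentheses, e.g. 5% => 0.05.
function classify(ratio::Real; tolerance::Real = 0.05)
    if ratio > 1 + tolerance
        return :regression     # shown as :x:
    elseif ratio < 1 - tolerance
        return :improvement    # shown as :white_check_mark:
    else
        return :invariant      # not listed in the table
    end
end

println(classify(1.18))                    # regression
println(classify(0.69))                    # improvement
println(classify(1.00; tolerance = 0.01))  # invariant
```

Under this reading, every memory ratio of 1.00 (1%) is invariant, and a row appears in the table only because its time ratio falls outside the 5% band.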

Julia versioninfo

Target

Julia Version 0.6.1
Commit 0d7248e2ff (2017-10-24 22:15 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz
  WORD_SIZE: 64
           "openSUSE Leap 42.3"
  uname: Linux 4.4.120-45-default #1 SMP Wed Mar 14 20:51:49 UTC 2018 (623211f) x86_64 x86_64
Memory: 15.560203552246094 GB (283.09765625 MB free)
Uptime: 2.774538e6 sec
Load Avg:  1.22314453125  0.97900390625  0.74609375
Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz: 
       speed         user         nice          sys         idle          irq
#1  3638 MHz    3188261 s       5043 s     582893 s  272475714 s          0 s
#2  3500 MHz    2952696 s       3668 s     466830 s  273536119 s          0 s
#3  3699 MHz    3001340 s       6864 s     478103 s  273467661 s          0 s
#4  3601 MHz    2946078 s       6810 s     464624 s  273599895 s          0 s

  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)

saschatimme commented 6 years ago

The results are still quite noisy. On my machine, ["jacobian", "Float64", "rps10"] dropped from 629 ns to 535 ns rather than increasing.
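
For comparison with the report, the two timings quoted above can be turned into the same kind of ratio. This is only arithmetic on the numbers in the comment (Julia 1.x syntax); the variable names are illustrative.

```julia
baseline_ns = 629   # timing before the change, as quoted above
target_ns   = 535   # timing after the change, as quoted above

ratio = target_ns / baseline_ns
println(round(ratio; digits = 2))   # 0.85
```

A ratio of about 0.85 would count as an improvement, whereas the report lists 1.22 for the same benchmark.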

codecov-io commented 6 years ago

Codecov Report

Merging #10 into master will increase coverage by 0.86%. The diff coverage is 100%.

@@            Coverage Diff             @@
##           master      #10      +/-   ##
==========================================
+ Coverage   86.38%   87.25%   +0.86%     
==========================================
  Files          10       10              
  Lines         448      455       +7     
==========================================
+ Hits          387      397      +10     
+ Misses         61       58       -3

| Impacted Files | Coverage Δ |
|----------------|------------|
| src/codegen_helpers.jl | 86.44% <100%> (+3.9%) :arrow_up: |
| src/evaluate_codegen.jl | 100% <100%> (ø) :arrow_up: |
| src/gradient_codegen.jl | 100% <100%> (ø) :arrow_up: |
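
The percentages in the coverage diff follow directly from the hit and line counts it lists. A short check (Julia 1.x syntax; `coverage` is a hypothetical helper, not part of Codecov):

```julia
# Coverage percentage from covered lines (hits) and total tracked lines.
coverage(hits, lines) = 100 * hits / lines

before = coverage(387, 448)   # master
after  = coverage(397, 455)   # with #10 merged

println(round(before; digits = 2))   # 86.38
println(round(after; digits = 2))    # 87.25
```

The unrounded difference is just under 0.87 percentage points, which the report shows as +0.86%, presumably truncated rather than rounded.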

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 255a999...23f9a0d.