Kolaru closed this pull request 5 years ago.
Merging #97 into master will decrease coverage by 3.4%. The diff coverage is 58.74%.
```diff
@@            Coverage Diff            @@
##           master     #97     +/-   ##
==========================================
- Coverage   59.58%   56.17%    -3.41%
==========================================
  Files          11       12       +1
  Lines         532      623      +91
==========================================
+ Hits          317      350      +33
- Misses        215      273      +58
```
Impacted Files | Coverage Δ | |
---|---|---|
src/IntervalRootFinding.jl | 5.55% <ø> (ø) | :arrow_up: |
src/branch_and_bound.jl | 38.82% <38.82%> (ø) | |
src/roots.jl | 72.88% <85%> (-8.16%) | :arrow_down: |
src/contractors.jl | 95.55% <94.44%> (-1.95%) | :arrow_down: |
Continue to review the full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update becade4...864a337.
This looks great, thanks a lot!
I am a bit worried about the performance implications of using a tree like that. Have you done any benchmarks?
In general, I think we should try to keep the data structures as general as possible, so that you could specify whether to store the nodes in a tree, in a vector as before, etc. How feasible would that be?
So I quickly added some benchmarks using the Smiley and Chun examples introduced by @gwater and the PkgBenchmark.jl package (this PR is necessary to run its basic tasks though). Here are the results:
Master:
ID | time | GC time | memory | allocations |
---|---|---|---|---|
["Smiley", "Krawczyk", "Smiley and Chun (2001), Example 2.2"] | 12.457 ms (5%) | | 8.41 MiB (1%) | 250556 |
["Smiley", "Krawczyk", "Smiley and Chun (2001), Example 5.2"] | 215.253 ms (5%) | 24.078 ms | 115.33 MiB (1%) | 3433780 |
["Smiley", "Krawczyk", "Smiley and Chun (2001), Example 5.4"] | 9.902 ms (5%) | | 7.51 MiB (1%) | 219406 |
["Smiley", "Newton", "Smiley and Chun (2001), Example 2.2"] | 12.983 ms (5%) | | 8.76 MiB (1%) | 259155 |
["Smiley", "Newton", "Smiley and Chun (2001), Example 5.2"] | 255.685 ms (5%) | 28.522 ms | 136.83 MiB (1%) | 4044788 |
["Smiley", "Newton", "Smiley and Chun (2001), Example 5.4"] | 12.438 ms (5%) | | 9.32 MiB (1%) | 271491 |
This PR:
ID | time | GC time | memory | allocations |
---|---|---|---|---|
["Smiley", "Krawczyk", "Smiley and Chun (2001), Example 2.2"] | 12.540 ms (5%) | | 8.46 MiB (1%) | 252695 |
["Smiley", "Krawczyk", "Smiley and Chun (2001), Example 5.2"] | 226.340 ms (5%) | 27.452 ms | 116.05 MiB (1%) | 3450366 |
["Smiley", "Krawczyk", "Smiley and Chun (2001), Example 5.4"] | 9.968 ms (5%) | | 7.52 MiB (1%) | 220525 |
["Smiley", "Newton", "Smiley and Chun (2001), Example 2.2"] | 13.686 ms (5%) | | 8.81 MiB (1%) | 261324 |
["Smiley", "Newton", "Smiley and Chun (2001), Example 5.2"] | 285.322 ms (5%) | 37.126 ms | 137.49 MiB (1%) | 4064908 |
["Smiley", "Newton", "Smiley and Chun (2001), Example 5.4"] | 12.503 ms (5%) | | 9.35 MiB (1%) | 272998 |
It looks like this PR implies about a 5% to 10% penalty (PkgBenchmark comparison is not yet functional for Julia 0.7/1.0, so this is the best estimate I can easily make for now). It may be possible to remove this penalty completely, since the tree structure closely corresponds to a branch and bound algorithm.
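The correspondence mentioned above can be sketched roughly as follows. This is a generic, simplified one-dimensional sketch (using a sign-change test instead of interval contractors, and plain tuples instead of the PR's node type), not the PR's actual code:

```julia
# Generic branch-and-bound sketch (not this PR's actual code): each tree
# node is a region, and contracting or bisecting a node produces its
# children, so building the tree and running the search coincide.
function branch_and_bound(f, region; tol = 1e-10)
    working = [region]                 # leaf regions still to process
    found = eltype(working)[]          # accepted regions
    while !isempty(working)
        lo, hi = pop!(working)
        f(lo) * f(hi) > 0 && continue  # discard: no sign change inside
        if hi - lo < tol
            push!(found, (lo, hi))     # accept: region is small enough
        else
            mid = (lo + hi) / 2        # bisect: create two child nodes
            push!(working, (lo, mid), (mid, hi))
        end
    end
    return found
end

result = branch_and_bound(x -> x^2 - 2, (0.0, 2.0))  # brackets sqrt(2)
```

Each discard/accept/bisect decision corresponds to pruning a leaf, keeping it, or giving it two children, which is why storing the search as a tree need not cost much beyond the search itself.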
On the other hand I was expecting way more allocations with this PR. Good surprise I guess.
As for allowing the data to be stored in an arbitrary structure: while designing this PR I considered that the ability to easily convert the result to any data structure would be sufficient. It is also the simplest solution from the developer's side.
However, if the current penalty we pay to keep maximal information about the run of the algorithm is too high (it seems reasonable to me, but I essentially use the package to solve rather simple equations, so I may not be representative), it is possible to make it more generic (it may become tricky though).
Wow, that's great! Let me look over this in detail then.
Do you have a simple example to show how to use the new functionality / see the tree etc?
I have been thinking that `roots` should create a `RootsProblem` (or whatever) object that stores all the parameters used, all the intermediate results, etc.
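Something along these lines, perhaps. This is a hypothetical sketch: the type, field names, and constructor are invented for illustration and are not part of the package:

```julia
# Hypothetical sketch (names invented, not the package's API): bundle the
# search parameters together with the intermediate results, so a finished
# search can be inspected or re-run with the same settings.
struct RootsProblem{F,C,R}
    f::F                     # function whose roots are sought
    contractor::C            # e.g. a Newton or Krawczyk contractor
    initial_region::R        # starting search region
    tol::Float64             # stopping tolerance
    intermediate::Vector{R}  # regions visited during the search
end

RootsProblem(f, contractor, region; tol = 1e-10) =
    RootsProblem(f, contractor, region, tol, typeof(region)[])

prob = RootsProblem(x -> x^2 - 2, :newton, (0.0, 2.0))
```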
I updated the example in the `examples` dir and put it in a new file (`root_search_iterator.jl`). It is abundantly commented, so I hope everything is clear.
By default, the intermediate results are not stored in the tree, to spare memory. However, one of the examples shows how to do this by changing the `process` function.
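Schematically, customizing `process` to keep intermediate results might look like the following. Note this is only an illustration: `default_process` and `logging_process` are invented names, and the real `process` in this PR may have a different signature:

```julia
# Schematic only: the actual `process` in this PR may have a different
# signature. The point is that storing intermediate results amounts to
# wrapping the default decision function with one that also records.
default_process(region) = abs(region[2] - region[1]) < 1e-10 ? :store : :bisect

const visited = Tuple{Float64,Float64}[]   # intermediate regions we keep

function logging_process(region)
    push!(visited, region)          # record every region inspected...
    return default_process(region)  # ...then apply the default decision
end
```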
Also, the search object already stores all final parameters (final in the sense that the contractor is stored, but not input parameters such as the original function, or whether automatic differentiation is used, for example).
Sorry, this now has conflicts :/
I can attempt to rebase this if you would like.
@dpsanders No need to, thanks. I'll do it soon.
Below is a benchmark of the current state of this PR against master. For now we see slowdowns ranging from 5% to 20% in the worst cases. I'll try to identify whether there is a bottleneck and whether I can do something to speed things up.
The decrease in memory usage is a welcome surprise.
A ratio greater than 1.0 denotes a possible regression (marked with :x:), while a ratio less than 1.0 denotes a possible improvement (marked with :white_check_mark:). Only significant results (those that indicate possible regressions or improvements) are shown below; thus, an empty table means that all benchmark results remained invariant between builds.
ID | time ratio | memory ratio |
---|---|---|
["Dietmar-Ratz", "Dietmar-Ratz 1", "Krawczyk"] | 1.13 (5%) :x: | 0.21 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 1", "Newton"] | 1.11 (5%) :x: | 0.21 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 2", "Newton"] | 1.05 (5%) :x: | 1.01 (1%) |
["Dietmar-Ratz", "Dietmar-Ratz 3", "Krawczyk"] | 1.05 (5%) :x: | 1.03 (1%) :x: |
["Dietmar-Ratz", "Dietmar-Ratz 3", "Newton"] | 1.08 (5%) :x: | 1.04 (1%) :x: |
["Dietmar-Ratz", "Dietmar-Ratz 3", "Slope expansion"] | 1.06 (5%) :x: | 1.00 (1%) |
["Dietmar-Ratz", "Dietmar-Ratz 4", "Automatic differentiation"] | 1.05 (5%) :x: | 1.00 (1%) |
["Dietmar-Ratz", "Dietmar-Ratz 4", "Krawczyk"] | 1.05 (5%) :x: | 0.98 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 4", "Newton"] | 1.04 (5%) | 0.96 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 5", "Krawczyk"] | 1.18 (5%) :x: | 0.19 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 5", "Newton"] | 1.14 (5%) :x: | 0.19 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 6", "Krawczyk"] | 1.04 (5%) | 0.90 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 6", "Newton"] | 1.04 (5%) | 0.94 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 7", "Automatic differentiation"] | 1.06 (5%) :x: | 1.00 (1%) |
["Dietmar-Ratz", "Dietmar-Ratz 7", "Krawczyk"] | 1.12 (5%) :x: | 0.33 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 7", "Newton"] | 1.09 (5%) :x: | 0.33 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 9", "Krawczyk"] | 1.09 (5%) :x: | 0.18 (1%) :white_check_mark: |
["Dietmar-Ratz", "Dietmar-Ratz 9", "Newton"] | 1.10 (5%) :x: | 0.18 (1%) :white_check_mark: |
["Rastigrin stationary points", "Krawczyk"] | 1.06 (5%) :x: | 1.00 (1%) |
["Rastigrin stationary points", "Newton"] | 1.09 (5%) :x: | 1.01 (1%) |
["Smiley", "Smiley and Chun (2001), Example 5.2", "Krawczyk"] | 1.07 (5%) :x: | 1.01 (1%) |
["Smiley", "Smiley and Chun (2001), Example 5.2", "Newton"] | 1.09 (5%) :x: | 1.01 (1%) |
["Smiley", "Smiley and Chun (2001), Example 5.4", "Newton"] | 1.06 (5%) :x: | 1.01 (1%) |
Here's a list of all the benchmark groups executed by this job:
["Dietmar-Ratz", "Dietmar-Ratz 1"]
["Dietmar-Ratz", "Dietmar-Ratz 2"]
["Dietmar-Ratz", "Dietmar-Ratz 3"]
["Dietmar-Ratz", "Dietmar-Ratz 4"]
["Dietmar-Ratz", "Dietmar-Ratz 5"]
["Dietmar-Ratz", "Dietmar-Ratz 6"]
["Dietmar-Ratz", "Dietmar-Ratz 7"]
["Dietmar-Ratz", "Dietmar-Ratz 8"]
["Dietmar-Ratz", "Dietmar-Ratz 9"]
["Linear equations", "n = 10"]
["Linear equations", "n = 2"]
["Linear equations", "n = 5"]
["Rastigrin stationary points"]
["Smiley", "Smiley and Chun (2001), Example 2.2"]
["Smiley", "Smiley and Chun (2001), Example 5.2"]
["Smiley", "Smiley and Chun (2001), Example 5.4"]
```
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  Microsoft Windows [version 10.0.17763.316]
  CPU: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz:
      speed         user         nice          sys        idle         irq
  #1  3504 MHz  41728312     0  24420343  250501062  4959093 ticks
  #2  3504 MHz  42282593     0  17158515  257208390   409375 ticks
  #3  3504 MHz  61348546     0  15767125  239533843   290093 ticks
  #4  3504 MHz  47695703     0  16460234  252493578   311796 ticks
  Memory: 15.95343017578125 GB (8507.94140625 MB free)
  Uptime: 622735.0 sec
  Load Avg: 0.0 0.0 0.0
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
```

```
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  Microsoft Windows [version 10.0.17763.316]
  CPU: Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz:
      speed         user         nice          sys        idle         irq
  #1  3504 MHz  41786609     0  24435312  250581875  4961656 ticks
  #2  3504 MHz  42332593     0  17169453  257301531   409421 ticks
  #3  3504 MHz  61406390     0  15776750  239620453   290281 ticks
  #4  3504 MHz  47762937     0  16471140  252569515   311968 ticks
  Memory: 15.95343017578125 GB (8257.20703125 MB free)
  Uptime: 622889.0 sec
  Load Avg: 0.0 0.0 0.0
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
```
It seems like most of the overhead comes from the creation of the tree and the associated dictionaries. However, due to the profiler sometimes crashing on Windows (https://github.com/JuliaLang/julia/issues/30902), I have not yet been able to do extensive profiling.
Also note that the build failures only concern Julia 0.7.
I would say that we should just remove support for Julia 0.7 and merge this.
Since the implementation is accepted, I can now do the tedious but necessary secondary work (add tests, check the examples are still valid and write some beautiful documentation). Hopefully I will be able to do it soon and then it will be ready for merging.
@dpsanders This is now ready.
Could you please update REQUIRE to put Julia 1.0 as the lower bound? Or change the settings on your fork to allow me to modify your PR.
@dpsanders Done. I've also removed the 0.7 builds from Travis and AppVeyor.
This looks really great, thanks!
I think the best thing to do is merge once green and then iterate later if necessary.
I think some of the printing etc. could probably be simplified using https://github.com/Keno/AbstractTrees.jl
I'm not sure if there are any ready-made packages for trees that could simplify some of the tree logic itself.
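For reference, the AbstractTrees.jl interface is quite light: implementing `children` for a node type is enough to get the package's tree printing and iteration. A minimal sketch, with an invented node type standing in for the PR's actual one:

```julia
using AbstractTrees

# Minimal sketch of the AbstractTrees.jl interface (the node type here is
# invented, not the one from this PR): implementing `children` is enough
# to get `print_tree`, `Leaves`, the tree iterators, etc. for free.
struct SearchNode
    region::Tuple{Float64,Float64}
    children::Vector{SearchNode}
end

AbstractTrees.children(n::SearchNode) = n.children
AbstractTrees.printnode(io::IO, n::SearchNode) = print(io, n.region)

root = SearchNode((0.0, 2.0), [SearchNode((0.0, 1.0), SearchNode[]),
                               SearchNode((1.0, 2.0), SearchNode[])])
print_tree(root)   # prints the root region with its two children indented
```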
OK I'm going to merge for now.
Thanks a lot!
@dpsanders When I started working on this PR, I checked for such packages. The closest available option seemed to be graphs from packages such as LightGraphs.jl. However, I preferred to implement my own tree type, as graphs are more general and I was a bit worried about the performance implications of that.
I'll look into simplifying some parts of this with AbstractTrees.jl once I've dealt with my other pending PR.
Yes, true, LightGraphs.jl has a lot of relevant functionality. There's no rush in any case!
I finally found the courage to update to 0.7/1.0 and to implement the idea proposed by @dpsanders in #72 to separate the logic of the search from the rest of the package. Therefore this PR supersedes #72 (I do not update it though, as the scope is quite different).
Changes in this PR:
- The `roots` function still returns a simple list of intervals.
- The search logic is in the `rootsearch_iterator.jl` file, so the transition to the code in another package can be done easily.
- A `Root` object is used during the search rather than an interval, which is forced by the genericity of the branch and bound algorithm (the status of roots does not exactly match the status of the elements in a search).
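The distinction in the last point can be illustrated roughly as follows. The field and status names below are a sketch and may not match the package exactly:

```julia
# Rough sketch (names may not match the package exactly): a `Root` pairs
# a region with a root status, while the search itself also needs
# element-level states (still being processed, discarded, ...) that do
# not map one-to-one onto root statuses.
struct Root{T}
    interval::T
    status::Symbol   # e.g. :unique (proven unique root) or :unknown
end

isunique(r::Root) = r.status == :unique

r = Root((1.4142, 1.4143), :unique)
isunique(r)   # true
```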