ericagol / NbodyGradient.jl

N-body integrator computes derivatives with respect to initial conditions for TTVs, RV, Photodynamics & more
MIT License
20 stars 9 forks source link

TTV computation comparisons #8

Open ericagol opened 6 years ago

ericagol commented 6 years ago
ericagol commented 3 years ago

We compared the integrator with the C implementation of Hernandez & Bertschinger. The Julia version is ~5-10 times slower. This is pretty surprising - in my experience - and based on looking at discussions in discourse - it's possible to get Julia code to run at C speeds.

@langfzac @dmhernan I think we should: 1). Make a comparison of the universal.c to the julia implementation, and see whether this is where most of the time difference originates. 2). Ask for help on julia discourse with an example.

giordano commented 3 years ago

Without having seen the code, I can tell that usually making the function type-stable and reducing useless memory allocations is most of the work.

dmhernan commented 3 years ago

@ericagol @langfzac Sure, fine by me. To isolate the Kepler solver in a problem, all pairs could be placed in Kepler group. The drifts should have negligible compute cost relative to the Kepler solver. To isolate the Kepler solver even further, it can be used to integrate a two-body problem over long times. I will send code demonstrating this.

ericagol commented 3 years ago

@giordano Yes, we've done that - but maybe we missed something, or could possibly could do a better job!

ericagol commented 3 years ago

@dmhernan Thanks!

ericagol commented 3 years ago

The comparison with universal.c was useful - thanks to a few small changes, we can now match the C speed. See this pull request (an older version of NbodyGradient was used in the TRAPPIST1_Spitzer repository since the tests were done in Julia v1.0.1, which the current repository is no longer compatible with):

https://github.com/ericagol/TRAPPIST1_Spitzer/commit/0a3c35c91801e845958ad28def71a0069e291954

outer_ss$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.1 (2018-09-29)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> include("test.jl")
test_kepler_step (generic function with 1 method)

julia> test_kepler_step(1)
Relative energy error after step 0.01 is dE/E= 3.5527136788004e-15
Final state: [0.0, -0.0480049, 0.047346, 0.0, -4.97779, 1.97086, 0.0, 0.0674248, 4.92801, 1.0, 0.34227, -1.57501e-12]

julia> @time test_kepler_step(10000000)
Relative energy error after step 0.01 is dE/E= -5.443423489736988e-13
Final state: [0.0, -1.98984, 0.00253468, 0.0, -0.0090298, -0.0708824, 0.0, 1.98984, 0.0089395, 1.0, 0.00502565, -3.73126e-16]
  1.480160 seconds (151 allocations: 4.766 KiB)

julia>
outer_ss$ time ./test2
Relative energy error after step 0.01 is dE/E=1.83624e-11
Final state: -1.98984 0.00253463 0 -0.00902962 -0.0708824 0

real    0m1.539s
user    0m1.534s
sys     0m0.003s
ericagol commented 3 years ago

(These two codes are currently not in the repository - please let me know if you would like to see them. The version of universal.c which was used for this test is outdated, hence the poorer conservation of energy and slight disagreement between the final states.)

ericagol commented 3 years ago

So, to improve the speed of the ah18/ttv code, we need to avoid array allocations (allocation of x0 & v0 in kepler_step was causing the biggest slow down in the above test, even though these are only 3 elements each). What would be the best way to do this? Create a set of cached arrays for use within the module? Hold them in a structure which is passed through functions at different levels of the algorithm? It would be nice if there were some way in Julia to pre-allocate some arrays for a function before it is called which will avoid the allocation and garbage collection each time the function is called.

langfzac commented 3 years ago

@ericagol I've already done some of these things in my fork. I use the State type to hold preallocated arrays for the compensated summation. There's also PreAllocArrays types for the derivative arrays. This just needs to be extended to the kepler_step/drift functions (on my to-do list..). A straight forward (but perhaps less elegant) way to do this would be to just append the relevant arrays to State, since it's already passed through the integrator.

ericagol commented 3 years ago

@giordano Thanks for the suggestion, Mosé! It did turn out that pre-allocation was causing most of the slow down. Also found some other improvements which match C speed in this test of kepler_step using the DH17 algorithm. Now @langfzac is trying to remove the pre-allocations for the full integrator with the updated algorithm (right now called AH18, but still yet to be published!).

giordano commented 3 years ago

I'm glad you worked this out! 😃

dmhernan commented 3 years ago

Wow, interesting feature of Julia.

ericagol commented 3 years ago

@dmhernan Yes, and not a feature I am fond of...