cgyurgyik opened 4 years ago
To give you an idea, here's an attempt to modernize some of the C code in `KeplerSTM_C`. I am a big fan of using the algorithms provided by the C++ standard library (`<algorithm>`, `<numeric>`) unless benchmarks say otherwise. Using `const` can help the compiler and, more importantly, ensures you don't change variables you didn't intend to change; `std::array` is safer than raw C arrays.
```c++
// NOTE: NOT VERIFIED FOR CORRECTNESS, NOR HAS IT BEEN TESTED.
#include <array>
#include <cmath>    // std::sqrt, std::pow, std::floor; M_PI (POSIX, may need _USE_MATH_DEFINES on MSVC)
#include <numeric>  // std::inner_product

int KeplerSTM_C(const std::array<double, 6>& x0, double dt, double mu,
                std::array<double, 6>& x1, double epsmult) {
    /* Initialize orbit values */
    const std::array<double, 3> r0 = {x0[0], x0[1], x0[2]};
    const std::array<double, 3> v0 = {x0[3], x0[4], x0[5]};

    // std::inner_product takes (first1, last1, first2, init).
    const double dot_r0 = std::inner_product(std::cbegin(r0), std::cend(r0), std::cbegin(r0), 0.0);
    const double r0norm = std::sqrt(dot_r0);
    const double nu0    = std::inner_product(std::cbegin(r0), std::cend(r0), std::cbegin(v0), 0.0);
    const double dot_v0 = std::inner_product(std::cbegin(v0), std::cend(v0), std::cbegin(v0), 0.0);
    const double beta   = 2.0 * mu / r0norm - dot_v0;

    /* For elliptic orbits, account for period effects */
    double deltaU = 0.0;
    if (beta > 0) {
        const double P    = 2.0 * M_PI * mu * std::pow(beta, -1.5);
        const double norb = std::floor((dt + P / 2.0 - 2.0 * nu0 / beta) / P);
        deltaU = 2.0 * M_PI * norb * std::pow(beta, -2.5);
    }
    /* ... */
```
If you give me some reasonable tests or function parameters for `KeplerSTM_C` that hit all use cases, I would be happy to run some benchmarks as well.
You are welcome to submit a pull request with an associated benchmark showing the performance improvement.
Sure. I don't think it would be sufficient to benchmark on random data. Do you have any provided test data for these functions?
See propag_system in Prototypes/SimulatedUniverse. Relevant scales are up to 1 million propagations at a time.
* We only propagate 1 system at a time.
* Propagation only occurs when making a revisit of a target star. We currently make at most 1,400 single-visit observations in a mission simulation.
* `propag_system` is only used in ~10 places in EXOSIMS.
* This could improve dynamic completeness computation time, I think, and be generally helpful for revisits (very helpful for this reason alone. Do it!).
* @CoreySpohn had some dynamic completeness improvements as well.
I am not sure what "dynamic completeness" entails. Here are initial benchmarks using some more modern C++ and other quick fixes. The code still has a long way to go in terms of readability and removing code smells, and there are probably other fixes that could improve the speed.
NOTE: This is using arbitrary numbers:
```c++
std::array<double, 6> x0 = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
const double dt = 3.0;
const double mu = 5.0;
std::array<double, 6> x1 = {2.0, 2.0, 2.0, 2.0, 2.0, 2.0};
const double epsmult = 4.0;
const int convergence_limit = 1000;
```
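For reference, here is a minimal Google Benchmark harness along the lines of what produced the numbers below. This is a sketch, not the actual benchmark code: the call signature is taken from the modernized snippet above, and the `Old_KeplerSTM` case would wrap the original C implementation in the same way.

```c++
// Minimal Google Benchmark sketch (assumed harness, not the actual benchmark code).
#include <array>
#include <benchmark/benchmark.h>

// Declaration of the modernized routine sketched earlier in this thread.
int KeplerSTM_C(const std::array<double, 6>& x0, double dt, double mu,
                std::array<double, 6>& x1, double epsmult);

static void New_KeplerSTM(benchmark::State& state) {
    const std::array<double, 6> x0 = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
    std::array<double, 6> x1 = {2.0, 2.0, 2.0, 2.0, 2.0, 2.0};
    for (auto _ : state) {
        KeplerSTM_C(x0, /*dt=*/3.0, /*mu=*/5.0, x1, /*epsmult=*/4.0);
        benchmark::DoNotOptimize(x1.data());  // keep the result from being optimized away
    }
}
BENCHMARK(New_KeplerSTM);
BENCHMARK_MAIN();
```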
```
2020-01-26 18:57:54
Run on (4 X 1600 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x2)
  L1 Instruction 32 KiB (x2)
  L2 Unified 256 KiB (x2)
  L3 Unified 3072 KiB (x1)
Load Average: 4.15, 3.13, 3.06
***WARNING*** Library was built as DEBUG. Timings may be affected.
--------------------------------------------------------
Benchmark           Time             CPU     Iterations
--------------------------------------------------------
Old_KeplerSTM     9422 ns         8759 ns        74960
New_KeplerSTM     7612 ns         7298 ns        84318
```
The next step is to see what Cython has to offer in terms of modern C++ features, and then apply it to the current codebase. I also intend to contribute to Cython, since it lacks many of the powerful algorithms that C++ provides.
Fold into #333
Looking at `EXOSIMS/EXOSIMS/util/KeplerSTM_C/KeplerSTM_C.c`, I saw:

```c
double r0norm = sqrt(pow(r0[0], 2)+pow(r0[1], 2)+pow(r0[2], 2));
```

and similar cases where `std::pow(x, n)` is used and `n` is small. Is there a specific reason for this? See "When to Use std::pow" for reasons not to use `std::pow` when `n` is small.
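For illustration, a minimal sketch of the kind of rewrite meant here; the helper name `norm3` is mine, not from the codebase:

```c++
#include <array>
#include <cmath>

// Squared terms written as plain multiplications instead of pow(x, 2).
double norm3(const std::array<double, 3>& r) {
    return std::sqrt(r[0] * r[0] + r[1] * r[1] + r[2] * r[2]);
    // Alternative (C++17): std::hypot(r[0], r[1], r[2]) is more robust
    // against overflow/underflow, but is typically slower.
}
```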
There are also a few other updates that could be made using C++ algorithms/idioms, as well as making the code "safer" and more readable. Not sure if that interests you at all.