FloatingArrayDesign / MoorDyn

a lumped-mass mooring line model intended for coupling with floating structure codes
https://moordyn.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
State and Line Performance Improvements #269

Open AlexWKinley opened 2 weeks ago

AlexWKinley commented 2 weeks ago

This PR contains changes to a few time integrators and to lines, with the goal of not changing the simulation in any meaningful way while providing performance improvements.

Performance Results

Overall, for models of lines, there is roughly a 2x performance improvement. Currently, the integrator performance improvements are mostly specific to RK2 and RK4.

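RK2.png: https://github.com/user-attachments/assets/108a05f6-7953-4a8e-944e-a1224a32f096
RK4.png: https://github.com/user-attachments/assets/9a9b8380-648b-462b-9fe6-27a738f42479
line_performance_plots.png: https://github.com/user-attachments/assets/eb3841dd-ce01-42f1-a011-5ce9780941b3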

Integrator / State Changes

The primary changes to the integrator and state code are to avoid memory allocations.

The state code that allows writing

r[1] = r[0] + rd[0] * (0.5 * dt);

is very convenient, but its current implementation means that every operation on a state allocates memory to create an entire copy of that state. That basic $y_{n+1} = y_n + dy \Delta t$ line allocates memory to store the result of $dy \Delta t$, and then the result of the addition, and even an additional allocation for the assignment.
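To make the cost concrete, here is a minimal sketch that counts heap allocations for an expression of this shape. ToyState, CountingAlloc and the two operators are hypothetical stand-ins written for this illustration; they are not MoorDyn's StateVar code, and the exact count in MoorDyn depends on its own operator and assignment definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Global counter of heap allocations made through CountingAlloc.
inline std::size_t& alloc_count() {
    static std::size_t n = 0;
    return n;
}

// Minimal allocator that counts calls to allocate(); illustration only.
template <class T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() = default;
    template <class U>
    CountingAlloc(const CountingAlloc<U>&) {}
    T* allocate(std::size_t n) {
        ++alloc_count();
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) { std::allocator<T>{}.deallocate(p, n); }
};
template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// Hypothetical stand-in for a state (assumption, not MoorDyn's StateVar).
struct ToyState {
    std::vector<double, CountingAlloc<double>> v;
};

// Value-returning operators: each call builds a fresh copy of its left operand.
inline ToyState operator*(const ToyState& a, double s) {
    ToyState out{ a.v };             // allocates a full copy
    for (double& x : out.v) x *= s;
    return out;
}
inline ToyState operator+(const ToyState& a, const ToyState& b) {
    assert(a.v.size() == b.v.size());
    ToyState out{ a.v };             // allocates another full copy
    for (std::size_t i = 0; i < out.v.size(); ++i) out.v[i] += b.v[i];
    return out;
}

// Returns how many allocations `out = y + dy * h` performed.
inline std::size_t allocations_for_expression(ToyState& out, const ToyState& y,
                                              const ToyState& dy, double h) {
    const std::size_t before = alloc_count();
    out = y + dy * h;
    return alloc_count() - before;
}
```

In this toy version the scaled derivative and the sum each cost one allocation per evaluation, even though `out` already owns a buffer of the right size.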

My current solution to this is the new butcher_row function, which can do these sorts of computations in place without allocating any additional memory. It is named as such because it does the math for a single row in a Butcher tableau (https://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods#Explicit_Runge.E2.80.93Kutta_methods). With this function we get

// r[1] = r[0] + rd[0] * (0.5 * dt);
butcher_row<1>(r[1], r[0], { 0.5 * dt }, { &rd[0] });

Definitely not as clear and concise as the more mathematical notation, but with significant performance improvements. Because of the additional complexity, I've only switched the RK2 and RK4 integrators to use this function, but I can certainly expand its usage if desired.
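For readers who want to see the idea without opening the diff, the following is a rough sketch of what a function with this call shape could look like. It is an assumption based only on the call shown above: ToyVec and butcher_row_sketch are hypothetical names, and the real butcher_row operates on MoorDyn's state types rather than on a plain vector of doubles.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a state or state derivative (assumption).
using ToyVec = std::vector<double>;

// Computes out = start + sum_k scales[k] * (*derivs[k]) element by element,
// writing into storage that `out` already owns, so nothing is allocated here.
template <std::size_t N>
void butcher_row_sketch(ToyVec& out, const ToyVec& start,
                        const std::array<double, N>& scales,
                        const std::array<const ToyVec*, N>& derivs) {
    assert(out.size() == start.size());
    for (std::size_t k = 0; k < N; ++k) {
        assert(derivs[k] != nullptr && derivs[k]->size() == start.size());
    }
    for (std::size_t i = 0; i < out.size(); ++i) {
        double acc = start[i];
        for (std::size_t k = 0; k < N; ++k) {
            acc += scales[k] * (*derivs[k])[i];
        }
        out[i] = acc; // safe even if out and start are the same object
    }
}
```

With a signature like this, the final RK4 combination could be written as a single call such as `butcher_row_sketch<4>(y_next, y, { dt / 6, dt / 3, dt / 3, dt / 6 }, { &k1, &k2, &k3, &k4 });`, where y_next, y and k1 to k4 are placeholder names.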

The other change to states I've made is to remove the empty constructor and destructor from StateVar and StateVarDeriv. Because of the rule of three/five (https://en.cppreference.com/w/cpp/language/rule_of_three), defining a destructor (even an empty one) suppresses the implicitly generated move operations, which causes an expression like r[1] = r[0] + rd[0] to allocate twice, even though the result of r[0] + rd[0] can be directly assigned to r[1]. Removing the empty destructor allows the compiler to avoid that second allocation.

This should help other integrators even if they're not using butcher_row, but not as significantly as using butcher_row.
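The language rule behind this can be checked in isolation. The sketch below uses hypothetical types (Probe, WithDtor, WithoutDtor) that are not part of MoorDyn: a member records whether it was copy-assigned or move-assigned when a temporary is assigned to an existing object.

```cpp
#include <cassert>

// Member that counts how it gets assigned; illustration only.
struct Probe {
    static int& copy_assigns() { static int n = 0; return n; }
    static int& move_assigns() { static int n = 0; return n; }
    Probe() = default;
    Probe(const Probe&) = default;
    Probe(Probe&&) = default;
    Probe& operator=(const Probe&) { ++copy_assigns(); return *this; }
    Probe& operator=(Probe&&) noexcept { ++move_assigns(); return *this; }
};

// User-provided (empty) destructor: no implicit move constructor or move
// assignment is generated, so rvalues fall back to the copy operations.
struct WithDtor {
    Probe p;
    ~WithDtor() {}
};

// Rule of zero: the compiler generates the move operations.
struct WithoutDtor {
    Probe p;
};

template <class S>
S make() { return S{}; }

// Assigns a temporary to an existing object, like r[1] = r[0] + rd[0].
template <class S>
void assign_from_temporary(S& target) { target = make<S>(); }
```

For a type that owns a heap buffer, the copy-assignment path is where the extra allocation or element-wise copy comes from, while the move-assignment path can simply take over the temporary's buffer.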

Line Changes

The first type of change to lines is again to avoid memory allocations.

Changing line->getStateDeriv() to take references to the node velocities and accelerations avoids having to allocate memory for those results every time. Also, changing Line::setState(const std::vector<vec>& pos, const std::vector<vec>& vel) to take vector references avoids allocating copies of the position and velocity vectors.
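The pattern can be sketched as follows. ToyLine and Vec3 are hypothetical stand-ins, and the acceleration written here is a made-up placeholder value; only the parameter-passing style mirrors the change described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; }; // hypothetical stand-in for MoorDyn's vec

class ToyLine {
public:
    explicit ToyLine(std::size_t nodes) : r_(nodes), rd_(nodes) {}

    // const references: the caller's vectors are read in place instead of
    // being copied into by-value parameters on every call.
    void setState(const std::vector<Vec3>& pos, const std::vector<Vec3>& vel) {
        assert(pos.size() == r_.size() && vel.size() == rd_.size());
        r_ = pos;
        rd_ = vel;
    }

    // Output parameters: results are written into buffers the caller owns
    // and can reuse across time steps, instead of returning new vectors.
    void getStateDeriv(std::vector<Vec3>& vel, std::vector<Vec3>& acc) const {
        vel.resize(rd_.size()); // no reallocation when already the right size
        acc.resize(rd_.size());
        for (std::size_t i = 0; i < rd_.size(); ++i) {
            vel[i] = rd_[i];
            acc[i] = Vec3{ 0.0, 0.0, -9.81 }; // placeholder, not real physics
        }
    }

private:
    std::vector<Vec3> r_;  // node positions
    std::vector<Vec3> rd_; // node velocities
};
```

An integrator using this style would allocate the velocity and acceleration buffers once up front and pass the same two vectors to every getStateDeriv call.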

The other kind of change to lines is my attempt at simplifying the code while improving performance. The biggest thing is getting rid of many of the

if (i == 0)
    ....
else if (i == N)
    ....
else
    ....

by calculating the values that depend on whether it's an internal or external node once, allowing for them to be reused, and making the code nicer to read.
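One way to picture this refactoring, as a sketch on a made-up quantity rather than the actual Line code, is to resolve the end-node cases into clamped neighbour indices once per node and then let every per-node computation reuse them. NodeEnds, node_ends and the 1D slope below are hypothetical and chosen only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Branching form: finite-difference slope at node i of a line with nodes 0..N.
inline double slope_branching(const std::vector<double>& x, std::size_t i) {
    assert(x.size() >= 2 && i < x.size());
    const std::size_t N = x.size() - 1;
    if (i == 0)
        return x[1] - x[0];
    else if (i == N)
        return x[N] - x[N - 1];
    else
        return (x[i + 1] - x[i - 1]) / 2.0;
}

// Neighbour indices with the end-node cases already resolved.
struct NodeEnds {
    std::size_t lo; // node below i, or i itself at the first node
    std::size_t hi; // node above i, or i itself at the last node
};

// The only place that needs to know whether i is an end node.
inline NodeEnds node_ends(std::size_t i, std::size_t N) {
    return NodeEnds{ i == 0 ? 0 : i - 1, i == N ? N : i + 1 };
}

// Branch-free use of the precomputed indices; any other per-node quantity
// can reuse the same NodeEnds value.
inline double slope_hoisted(const std::vector<double>& x, const NodeEnds& e) {
    assert(e.hi > e.lo && e.hi < x.size());
    return (x[e.hi] - x[e.lo]) / static_cast<double>(e.hi - e.lo);
}
```

Both forms give the same result at end and interior nodes; the second keeps the special-casing in a single helper.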

Misc

To improve consistency of the benchmarks, and avoid filling up the console with simulation times, I added MoorDyn::SetDisableOutput to disable some of the console and file output.

Logistical Notes

Feel free to share any thoughts or questions you have. I know that those at NREL are doing some work on their side that could potentially conflict with some of these changes. I'm happy to delay merging this and to rebase it myself on top of whatever those changes may be, if that would be easiest.

sanguinariojoe commented 1 week ago

Hey!

I am out of the office. I can give feedback by the end of the month.
