headmyshoulder / odeint-v2

odeint - solving ordinary differential equations in c++ v2
http://headmyshoulder.github.com/odeint-v2/

Speed using complex system -vs- manually coding real/imaginary parts #91

Closed Rodneyh303 closed 9 years ago

Rodneyh303 commented 11 years ago

Great tool overall!

I'm implementing a 16-dimensional first order linear system of complex numbers, using runge_kutta_dopri5.

I've tried programming the state x and step dxdt using:

  • a 16-d vector of std::complex
  • a 32-d vector of double.

Using the complex state type takes about 7 times longer to execute than manually programming the real and imaginary parts.

Profiling shows a lot of time in do_step_impl( system , in , dxdt_in , t , out , dxdt_out , dt ); when using the complex type, compared to nearly all the time being spent in the operator()( const state_type &x , state_type &dxdt , double t ) of the Functor when manually programming real and imaginary parts.

Have you done any benchmarking of this type? Any thoughts how to speed up the use of a linear system of complex state types?
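For readers following along, the two state layouts being compared could look roughly like the sketch below. It is only an illustration: the names complex_rhs, split_rhs and max_rhs_difference are hypothetical, the interleaved re/im storage is an assumption, and the system is reduced to a diagonal one (dx_k/dt = d_k x_k) rather than the poster's actual equations. Both functors use the operator() signature quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Layout 1: N complex components.
typedef std::vector< std::complex< double > > complex_state;
// Layout 2: 2N reals stored as re0, im0, re1, im1, ... (interleaving is an assumption).
typedef std::vector< double > real_state;

// Hypothetical system dx_k/dt = d_k * x_k with complex coefficients d_k.
struct complex_rhs
{
    complex_state d;
    void operator()( const complex_state &x , complex_state &dxdt , double /* t */ ) const
    {
        for( std::size_t k = 0 ; k < x.size() ; ++k )
            dxdt[k] = d[k] * x[k];
    }
};

// The same system with the complex product written out by hand:
// (a + ib)(u + iv) = (au - bv) + i(av + bu)
struct split_rhs
{
    real_state d;   // interleaved re/im parts of the coefficients
    void operator()( const real_state &x , real_state &dxdt , double /* t */ ) const
    {
        for( std::size_t k = 0 ; k < x.size() ; k += 2 )
        {
            dxdt[k]   = d[k] * x[k]   - d[k+1] * x[k+1];
            dxdt[k+1] = d[k] * x[k+1] + d[k+1] * x[k];
        }
    }
};

// Evaluates both right-hand sides on one small input and returns the
// largest absolute difference between them.
double max_rhs_difference()
{
    typedef std::complex< double > cplx;
    complex_rhs c;
    split_rhs s;
    const double d[4] = { -1.0 , 2.0 , 0.5 , -3.0 };   // re0, im0, re1, im1
    const double x[4] = {  1.0 , 1.0 , 2.0 , -1.0 };
    complex_state xc , dc( 2 );
    real_state xs( x , x + 4 ) , ds( 4 );
    s.d.assign( d , d + 4 );
    for( std::size_t k = 0 ; k < 2 ; ++k )
    {
        c.d.push_back( cplx( d[2*k] , d[2*k+1] ) );
        xc.push_back( cplx( x[2*k] , x[2*k+1] ) );
    }
    c( xc , dc , 0.0 );
    s( xs , ds , 0.0 );
    double diff = 0.0;
    for( std::size_t k = 0 ; k < 2 ; ++k )
        diff = std::max( diff , std::abs( dc[k] - cplx( ds[2*k] , ds[2*k+1] ) ) );
    return diff;
}
```

Both functors describe the same ODE, so any timing difference comes from how the arithmetic is expressed and compiled, not from the mathematics.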

headmyshoulder commented 11 years ago

We did some benchmarking of similar tools and did not see such large differences between the two methods. Can you give us more insight into your ODE?

Do you compile with all optimizations turned on?

andre-bergner commented 11 years ago

Hi,

My experience with std::complex has been that it should not be used for numerical simulation where performance is crucial. I usually hand-craft the real/imag equations when speed is important. Currently I'm working on a std::complex replacement based on expression templates. This type outperforms std::complex even for simple equations.

Rodneyh303 commented 11 years ago

Thanks for replying so quickly.

The ODE is just a 16-dimensional linear first order system.

The matrix of the system is basically a symmetric diagonal band matrix.

The numbers on the leading diagonal are all complex.

On each row, there are 4 more nonzero elements which are all real.

I compiled the 32-d manually managed real/imaginary system and the 16-d system of complex using the same optimisations.

(Windows 7 64, Intel C++ 2013, Parallel Studio, release mode, all optimisations).

In my update step, I only add up the nonzero elements, for speed.
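A right-hand side of the kind described here might be sketched as follows, in the split real/imaginary form. Everything specific is an assumption made for illustration: the thread does not say where the four real off-diagonal entries sit, so offsets of ±1 and ±2 with uniform couplings c1 and c2 are used, boundary rows simply have fewer neighbours, and the names banded_split_rhs and banded_vs_dense_error are hypothetical. Because the couplings are real, they scale the real and imaginary parts independently; only the diagonal needs a full complex product.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::vector< double > real_state;   // re0, im0, re1, im1, ... (assumed layout)

// Hypothetical banded system: complex diagonal, real couplings c1 and c2 at
// offsets +-1 and +-2 (the actual offsets and values are not given in the thread).
struct banded_split_rhs
{
    std::vector< double > diag_re , diag_im;
    double c1 , c2;

    // Adds c * x_j to the accumulators; a real factor scales both parts independently.
    static void couple( double c , const real_state &x , std::size_t j , double &re , double &im )
    {
        re += c * x[2*j];
        im += c * x[2*j+1];
    }

    void operator()( const real_state &x , real_state &dxdt , double /* t */ ) const
    {
        const std::size_t n = diag_re.size();
        for( std::size_t k = 0 ; k < n ; ++k )
        {
            // complex diagonal term: (a + ib)(u + iv) = (au - bv) + i(av + bu)
            double re = diag_re[k] * x[2*k]   - diag_im[k] * x[2*k+1];
            double im = diag_re[k] * x[2*k+1] + diag_im[k] * x[2*k];
            // only the nonzero off-diagonal entries are visited
            if( k >= 1 )    couple( c1 , x , k - 1 , re , im );
            if( k + 1 < n ) couple( c1 , x , k + 1 , re , im );
            if( k >= 2 )    couple( c2 , x , k - 2 , re , im );
            if( k + 2 < n ) couple( c2 , x , k + 2 , re , im );
            dxdt[2*k]   = re;
            dxdt[2*k+1] = im;
        }
    }
};

// Compares the banded split form against a dense complex matrix-vector
// product and returns the largest absolute difference.
double banded_vs_dense_error()
{
    typedef std::complex< double > cplx;
    const std::size_t n = 16;
    banded_split_rhs rhs;
    rhs.c1 = 0.3;
    rhs.c2 = -0.7;
    real_state x( 2 * n ) , dxdt( 2 * n );
    std::vector< std::vector< cplx > > A( n , std::vector< cplx >( n ) );
    for( std::size_t k = 0 ; k < n ; ++k )
    {
        rhs.diag_re.push_back( -0.1 * double( k + 1 ) );
        rhs.diag_im.push_back( 0.5 * double( k ) );
        x[2*k]   = 1.0 + double( k );
        x[2*k+1] = 0.5 - double( k );
        A[k][k] = cplx( rhs.diag_re[k] , rhs.diag_im[k] );
        if( k + 1 < n ) A[k][k+1] = A[k+1][k] = cplx( rhs.c1 , 0.0 );
        if( k + 2 < n ) A[k][k+2] = A[k+2][k] = cplx( rhs.c2 , 0.0 );
    }
    rhs( x , dxdt , 0.0 );

    double err = 0.0;
    for( std::size_t k = 0 ; k < n ; ++k )
    {
        cplx sum( 0.0 , 0.0 );
        for( std::size_t j = 0 ; j < n ; ++j )
            sum += A[k][j] * cplx( x[2*j] , x[2*j+1] );
        err = std::max( err , std::abs( sum - cplx( dxdt[2*k] , dxdt[2*k+1] ) ) );
    }
    return err;
}
```

The dense product is there only to check the banded form; it is not something one would use in the integration loop.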

I’m using OpenMP to spread the work across 4 cores. This gives a nice 3.5 times speedup in each case.

But the relative difference remains with or without this parallelisation.

Manual management of the complex numbers is about 10 times faster than using std::complex.

I can provide more details privately if it helps, I’m using ODEINT in my PhD research.

I need to evaluate the system thousands of times with different inputs.

Rodneyh303 commented 11 years ago

Thanks for this advice. I’m certainly finding manual crafting of real/imag to be about 10 times faster with everything else the same (optimisations, number of threads).

I need to evaluate the system thousands of times. I thought it would be neater to use the built-in complex numbers, as it makes the intent of the equations clearer.

Would be interested to hear about the relative performance of your replacement when you’ve done it.

Regards

Rodney

headmyshoulder commented 11 years ago

Hmm, then it seems to me that your ODE is already quite fast and all the time is spent in the operations. If you are really brave, you could try writing your own operations class for complex-valued containers. It should be quite easy if you use default_operations as a basis and split each operation into its real and imaginary parts. But it might also be that special processor optimizations can be turned on more easily when using a real-valued ODE. Nevertheless, it would be interesting to know whether operations for complex-valued containers would be beneficial.
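As a rough idea of what such an operation could look like, here is a standalone sketch of a single functor modelled on the shape of odeint's default_operations::scale_sum2, which computes t1 = a1 * t2 + a2 * t3 for each element. The name complex_scale_sum2 is hypothetical and the code does not include or depend on odeint. A usable operations class would presumably need the whole scale_sum family and the error-related functors as well, and whether the split form is actually faster is exactly the open question in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical functor: same interface shape as a scale_sum2 operation,
// but the real and imaginary parts are updated separately with real factors.
template< class Fac1 , class Fac2 = Fac1 >
struct complex_scale_sum2
{
    const Fac1 m_alpha1;
    const Fac2 m_alpha2;

    complex_scale_sum2( Fac1 alpha1 , Fac2 alpha2 )
        : m_alpha1( alpha1 ) , m_alpha2( alpha2 ) { }

    // t1 = alpha1 * t2 + alpha2 * t3, written out per component.
    template< class T >
    void operator()( std::complex< T > &t1 ,
                     const std::complex< T > &t2 ,
                     const std::complex< T > &t3 ) const
    {
        t1 = std::complex< T >( m_alpha1 * t2.real() + m_alpha2 * t3.real() ,
                                m_alpha1 * t2.imag() + m_alpha2 * t3.imag() );
    }

    typedef void result_type;
};

// Applies the functor to one pair of values and returns the distance to the
// result of the plain std::complex expression.
double scale_sum2_error()
{
    typedef std::complex< double > cplx;
    const double a1 = 0.5 , a2 = -2.0;
    const cplx t2( 1.0 , 2.0 ) , t3( 3.0 , -4.0 );
    cplx out;
    complex_scale_sum2< double > op( a1 , a2 );
    op( out , t2 , t3 );
    const cplx reference = a1 * t2 + a2 * t3;
    return std::abs( out - reference );
}
```

odeint's steppers accept the algebra and operations classes as template parameters, so a complete class built from functors like this one could be passed to runge_kutta_dopri5 in place of the default.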

Do you use step size control?

Rodneyh303 commented 11 years ago

Yes, I'm using step size control:

    typedef runge_kutta_dopri5< state_type > error_stepper_type;
    ..
    integrate_adaptive( make_controlled< error_stepper_type >( 1.0e-6 , 1.0e-6 ) , std::ref( *odestepper ) , ode_template , 0.0 , Time , 0.1 );

When I manually program the real/imag, nearly all the time is spent in my odestepper update function and it seems highly optimised. When I use std::complex type, a lot of overhead time is spent in do_step_impl( system , in , dxdt_in , t , out , dxdt_out , dt );

I haven't yet looked in detail at the custom algebras and operations, but it sounds like it should be possible to implement a faster container of complex as you suggest.