ma-sadeghi opened 2 years ago
Hey @mkaguer! Thanks for the thorough analysis. I think we could do better in problems where the A matrix is not changing (those without a nonlinear source term for instance). Currently, we're rebuilding A at every time step, but we could use a cached version instead. Could you maybe look into it and see if you get significant speed-ups this way?
Hi @ma-sadeghi. Sure, I gave it a try. The problem I was testing previously was a multiphysics problem, which required A NOT to be cached since the conductance models needed to be updated. So, for the purpose of comparing cached versus not cached, I wrote a new script for a transient Fickian diffusion problem only. Geometry was kept the same, 10,000 pores. These are the results:
Cached (atol = rtol = 1e-3):
- scipy time: 1.082 s
- custom time: 1.334 s

NOT cached (atol = rtol = 1e-3):
- scipy time: 1.525 s
- custom time: 1.652 s
That's a speed-up of roughly 20 to 30% when A and b are cached.
This is much faster than the previous multiphysics problem. So, for interest's sake, I increased the domain size to 20,000 pores, which gives the same number of unknowns as the multiphysics problem. Result:
Cached (atol = rtol = 1e-3):
- scipy time: 2.567 s
- custom time: 2.523 s

NOT cached (atol = rtol = 1e-3):
- scipy time: 3.173 s
- custom time: 3.273 s
Wow, still a lot faster than before. So whatever is making the solver slow must be the multiphysics coupling!
Interesting, although I was expecting a much larger speed-up. How do you cache the A matrix? Could you post a little snippet of the code you modified?
I just changed the algorithm settings before running.
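To illustrate what the caching buys, here is a generic SciPy sketch of the idea, not the actual OpenPNM code path (the exact settings key names differ across OpenPNM versions): the "not cached" path rebuilds A from COO triplets and reconverts every step, while the cached path builds and converts once and reuses the matrix.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import spsolve

n = 1000
rng = np.random.default_rng(0)

def build_A():
    # Rebuild a Laplacian-like system matrix from COO triplets from scratch
    # (this mimics the "not cached" path, done at every time step)
    main = 2.0 * np.ones(n)
    off = -1.0 * np.ones(n - 1)
    rows = np.concatenate([np.arange(n), np.arange(n - 1), np.arange(1, n)])
    cols = np.concatenate([np.arange(n), np.arange(1, n), np.arange(n - 1)])
    vals = np.concatenate([main, off, off])
    return coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsc()

# Cached path: build and convert once, then reuse the same CSC matrix every step
A_cached = build_A()
b = rng.random(n)
x_cached = spsolve(A_cached, b)

# Not-cached path: rebuild + reconvert before each solve
x_rebuilt = spsolve(build_A(), b)

# Both paths give the same answer; caching only skips the rebuild cost
assert np.allclose(x_cached, x_rebuilt)
```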
I see. Could you time the COO-to-CSC conversion? You could add `tic()` and `toc()` before and after lines 116 as well as 117 (basically, I want to know how the conversion compares with updating A and b). You'll also need to do `from openpnm.utils import tic, toc`.
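If `tic`/`toc` aren't convenient, the same measurement can be sketched with `time.perf_counter` (a generic SciPy stand-in for the conversion step, not the OpenPNM internals):

```python
import time
import numpy as np
from scipy.sparse import random as sparse_random

# A sparse matrix roughly the size of the 10,000-pore system, in COO format
A_coo = sparse_random(10_000, 10_000, density=1e-4, format='coo', random_state=0)

# Time only the COO -> CSC conversion
t0 = time.perf_counter()
A_csc = A_coo.tocsc()
elapsed = time.perf_counter() - t0
print(f"COO -> CSC conversion: {elapsed * 1e3:.2f} ms")
```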
Hi @ma-sadeghi. OK, so I ran my Fickian diffusion simulation again (no multiphysics). I compared the conversion time against the time to update A and b, for both cached and not cached.
When cached:
- Updating A and b: ~6 to 10 ms
- Conversion to CSC: ~1 ms

When NOT cached:
- Updating A and b: ~8 to 12 ms
- Conversion to CSC: ~0 to 2.0 ms
@mkaguer Could you also report the timings for a larger network?
Hi @ma-sadeghi. For a larger network of 1000 by 1000 pores, these are the results. The previous network was 10,000 pores; now the network is 1,000,000 pores. There is still something happening when updating A and b that takes a significant amount of time.
When cached:
- Updating A and b: ~800 ms
- Conversion to CSC: ~100 ms

When NOT cached:
- Updating A and b: ~1.15 s
- Conversion to CSC: ~100 ms
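One idea worth trying (a sketch of a possible optimization, not what OpenPNM necessarily does internally): since the sparsity pattern of A is fixed across time steps, the update could write new coefficient values directly into `A.data` in place, skipping the COO rebuild and conversion entirely.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Build the matrix once; its sparsity pattern will not change between steps
n = 5
rows = np.array([0, 1, 2, 3, 4, 0, 4])
cols = np.array([0, 1, 2, 3, 4, 4, 0])
vals = np.ones(7)
A = coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsc()

# Per-step update: overwrite only the stored nonzeros, no rebuild/reconvert
A.data[:] = 2.0 * A.data

# The pattern is untouched, only the values changed
assert A[0, 0] == 2.0
assert A.nnz == 7
```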
OK @ma-sadeghi, here are also the results of scaling the temperature. The times did not vary much for the multiphysics problem.
| Temperature scaling | scipy time (s) | custom time (s) |
|---|---|---|
| Not scaled | 81.08 | 90.18 |
| Scaled down by factor of 10 | 98.23 | 106.60 |
| Scaled down by factor of 100 | 98.07 | 105.55 |
| Scaled down by factor of 1000 | 96.27 | 103.60 |
| Scaled down by factor of 10000 | 97.26 | 90.88 |
@mkaguer Thanks for the comprehensive report! So, what's going on in the voltage problem that makes the simulation slower?
I investigated this a bit myself. I wrote a custom RK45 solver; its simulation times were similar to scipy's (but usually a bit longer). I tried varying the tolerances (see results below). The system I tested was a Fickian diffusion problem coupled with a Fourier conduction problem, with temperature-dependent diffusivity. The domain was 10,000 pores.
Also, when I used these RK45 solvers on the Nernst-Planck system, they were extremely slow! But I realized the issue was not the solver but how the problem was defined. I was using the electroneutrality model for ionic conductance, which admittedly has its issues. After switching to Poisson, the solvers worked much faster! So sometimes it is just how the problem is defined: with a poorly defined problem, convergence is hard to reach, forcing the solver to take very small time steps.
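The tiny-time-step behaviour described above is the classic stiffness symptom: an explicit method like RK45 is forced to small steps by stability, not accuracy. A minimal illustration with `scipy.integrate.solve_ivp` on a generic stiff ODE (not the Nernst-Planck system itself), comparing RK45 against the implicit BDF method:

```python
import numpy as np
from scipy.integrate import solve_ivp

# A stiff linear ODE: y' = -1000 * (y - cos(t)).
# Explicit RK45 is stability-limited to tiny steps, so it needs far more
# right-hand-side evaluations than the implicit BDF method.
def rhs(t, y):
    return -1000.0 * (y - np.cos(t))

sol_rk45 = solve_ivp(rhs, (0.0, 1.0), [0.0], method='RK45')
sol_bdf = solve_ivp(rhs, (0.0, 1.0), [0.0], method='BDF')

print("RK45 function evaluations:", sol_rk45.nfev)
print("BDF  function evaluations:", sol_bdf.nfev)
```

On this problem the explicit solver does an order of magnitude more work, even though both reach the same answer; reformulating the problem (as with switching from electroneutrality to Poisson) can remove exactly this kind of stiffness.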
In conclusion, there are two things the user can do to help with speed: 1) ensure the problem is well defined so that it converges properly, and 2) adjust the error tolerance, although this does not have as large an effect on simulation time as I would have expected.
| atol | rtol | scipy time (s) | custom time (s) |
|---|---|---|---|
| 1e-6 | 1e-6 | 105.24 | 102.51 |
| 1e-6 | 2.22e-14 | 90.67 | 110.19 |
| 1e-3 | 2.22e-14 | 97.24 | 100.42 |
| 1e-3 | 1e-6 | 90.97 | 100.35 |
| 1e-3 | 1e-3 | 79.51 | 87.24 |
| 1e-2 | 1e-2 | 81.82 | 94.40 |
| 1e-1 | 2.22e-14 | 78.02 | 87.64 |
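For reference, `atol` and `rtol` here are the standard `solve_ivp` tolerances. On a simple smooth toy ODE, loosening them sharply reduces the work the adaptive step controller does, which is why it was surprising to see only a modest effect on the full simulation:

```python
from scipy.integrate import solve_ivp

# Simple smooth ODE: exponential decay y' = -y on [0, 10]
def rhs(t, y):
    return -y

tight = solve_ivp(rhs, (0.0, 10.0), [1.0], method='RK45', atol=1e-10, rtol=1e-10)
loose = solve_ivp(rhs, (0.0, 10.0), [1.0], method='RK45', atol=1e-3, rtol=1e-3)

# Looser tolerances let the step controller take far fewer steps
print("tight nfev:", tight.nfev)
print("loose nfev:", loose.nfev)
```

When tolerance changes barely move the wall-clock time (as in the table above), it suggests the cost is dominated by per-step overhead (rebuilding A and b, conductance updates) rather than by the number of steps.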