ma-sadeghi opened 2 years ago
Hey @mkaguer! Thanks for the thorough analysis. I think we could do better in problems where the A matrix is not changing (those without a nonlinear source term for instance). Currently, we're rebuilding A at every time step, but we could use a cached version instead. Could you maybe look into it and see if you get significant speed-ups this way?
Hi @ma-sadeghi. Sure, I gave it a try. The problem I was testing previously was a multiphysics problem, which required A NOT to be cached since the conductance models needed to be updated. So, for the purpose of comparing cached versus not cached, I wrote a new script for a transient Fickian diffusion problem only. Geometry was kept the same, 10,000 pores. These are the results:
Cached (atol = rtol = 1e-3):
- scipy time: 1.082 s
- custom time: 1.334 s

NOT cached (atol = rtol = 1e-3):
- scipy time: 1.525 s
- custom time: 1.652 s
That's a speed-up of roughly 20 to 30% when A and b are cached.
This is much faster than the previous multiphysics problem. So, for interest's sake, I increased the domain size to 20,000 pores, which gives the same number of unknowns as the multiphysics problem. Result:
Cached (atol = rtol = 1e-3):
- scipy time: 2.567 s
- custom time: 2.523 s

NOT cached (atol = rtol = 1e-3):
- scipy time: 3.173 s
- custom time: 3.273 s
Wow, still a lot faster than before. So whatever is making the solver slow must be the multiphysics coupling!
Interesting, although I was expecting a much larger speed-up. How do you cache the A matrix? Could you post a little snippet of the code you modified?
I just changed the algorithm settings before running.
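To illustrate what the caching buys, here is a generic SciPy sketch of the idea, not the actual OpenPNM code path (the exact settings key names differ across OpenPNM versions): the "not cached" path rebuilds A from COO triplets and reconverts every step, while the cached path builds and converts once and reuses the matrix.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import spsolve

n = 1000
rng = np.random.default_rng(0)

def build_A():
    # Rebuild a Laplacian-like system matrix from COO triplets from scratch
    # (this mimics the "not cached" path, done at every time step)
    main = 2.0 * np.ones(n)
    off = -1.0 * np.ones(n - 1)
    rows = np.concatenate([np.arange(n), np.arange(n - 1), np.arange(1, n)])
    cols = np.concatenate([np.arange(n), np.arange(1, n), np.arange(n - 1)])
    vals = np.concatenate([main, off, off])
    return coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsc()

# Cached path: build and convert once, then reuse the same CSC matrix every step
A_cached = build_A()
b = rng.random(n)
x_cached = spsolve(A_cached, b)

# Not-cached path: rebuild + reconvert before each solve
x_rebuilt = spsolve(build_A(), b)

# Both paths give the same answer; caching only skips the rebuild cost
assert np.allclose(x_cached, x_rebuilt)
```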
I see. Could you time the COO-to-CSC conversion? You could add `tic()` and `toc()` before and after lines 116 as well as 117 (basically, I want to know how the conversion compares with updating A and b). You'll also need to do `from openpnm.utils import tic, toc`.
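If `tic`/`toc` aren't convenient, the same measurement can be sketched with `time.perf_counter` (a generic SciPy stand-in for the conversion step, not the OpenPNM internals):

```python
import time
import numpy as np
from scipy.sparse import random as sparse_random

# A sparse matrix roughly the size of the 10,000-pore system, in COO format
A_coo = sparse_random(10_000, 10_000, density=1e-4, format='coo', random_state=0)

# Time only the COO -> CSC conversion
t0 = time.perf_counter()
A_csc = A_coo.tocsc()
elapsed = time.perf_counter() - t0
print(f"COO -> CSC conversion: {elapsed * 1e3:.2f} ms")
```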
Hi @ma-sadeghi. OK, so I ran my Fickian diffusion simulation again (no multiphysics). I compared the conversion time against the time to update A and b, for both cached and not cached.
When cached:
- Updating A and b: ~6 to 10 ms
- Conversion to CSC: ~1 ms

When NOT cached:
- Updating A and b: ~8 to 12 ms
- Conversion to CSC: ~0 to 2.0 ms
@mkaguer Could you also report the timings for a larger network?
Hi @ma-sadeghi. For a larger network of 1000 by 1000 pores, these are the results. The previous network was 10,000 pores; now the network is 1,000,000 pores. There is still something happening when updating A and b that takes a significant amount of time.
When cached:
- Updating A and b: ~800 ms
- Conversion to CSC: ~100 ms

When NOT cached:
- Updating A and b: ~1.15 s
- Conversion to CSC: ~100 ms
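One idea worth trying (a sketch of a possible optimization, not what OpenPNM necessarily does internally): since the sparsity pattern of A is fixed across time steps, the update could write new coefficient values directly into `A.data` in place, skipping the COO rebuild and conversion entirely.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Build the matrix once; its sparsity pattern will not change between steps
n = 5
rows = np.array([0, 1, 2, 3, 4, 0, 4])
cols = np.array([0, 1, 2, 3, 4, 4, 0])
vals = np.ones(7)
A = coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsc()

# Per-step update: overwrite only the stored nonzeros, no rebuild/reconvert
A.data[:] = 2.0 * A.data

# The pattern is untouched, only the values changed
assert A[0, 0] == 2.0
assert A.nnz == 7
```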
OK @ma-sadeghi, here are also the results of scaling the temperature. The times did not vary much for the multiphysics problem.
| Temperature scaling | scipy time (s) | custom time (s) |
|---|---|---|
| Not scaled | 81.08 | 90.18 |
| Scaled down by factor of 10 | 98.23 | 106.60 |
| Scaled down by factor of 100 | 98.07 | 105.55 |
| Scaled down by factor of 1000 | 96.27 | 103.60 |
| Scaled down by factor of 10000 | 97.26 | 90.88 |
@mkaguer Thanks for the comprehensive report! So, what's going on in the voltage problem that makes the simulation slower?
I investigated this a bit myself. I wrote a custom RK45 solver; its simulation times were similar to scipy's (but usually a bit longer). I tried varying the tolerances (see results below). The system I tested was a Fickian diffusion problem coupled with a Fourier conduction problem, with temperature-dependent diffusivity. The domain was 10,000 pores.
Also, when I used these RK45 solvers on the Nernst-Planck system, they were extremely slow! But I realized the issue was not the solver but how the problem was defined. I was using the electroneutrality model for ionic conductance, which admittedly has its issues. After switching to Poisson, the solvers worked much faster! So sometimes it is just how the problem is defined: with a poorly defined problem, convergence is hard to reach, forcing the solver to take very small time steps.
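The tiny-time-step behaviour described above is the classic stiffness symptom: an explicit method like RK45 is forced to small steps by stability, not accuracy. A minimal illustration with `scipy.integrate.solve_ivp` on a generic stiff ODE (not the Nernst-Planck system itself), comparing RK45 against the implicit BDF method:

```python
import numpy as np
from scipy.integrate import solve_ivp

# A stiff linear ODE: y' = -1000 * (y - cos(t)).
# Explicit RK45 is stability-limited to tiny steps, so it needs far more
# right-hand-side evaluations than the implicit BDF method.
def rhs(t, y):
    return -1000.0 * (y - np.cos(t))

sol_rk45 = solve_ivp(rhs, (0.0, 1.0), [0.0], method='RK45')
sol_bdf = solve_ivp(rhs, (0.0, 1.0), [0.0], method='BDF')

print("RK45 function evaluations:", sol_rk45.nfev)
print("BDF  function evaluations:", sol_bdf.nfev)
```

On this problem the explicit solver does an order of magnitude more work, even though both reach the same answer; reformulating the problem (as with switching from electroneutrality to Poisson) can remove exactly this kind of stiffness.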
In conclusion, there are two things the user can do to help with speed: 1) ensure the problem is well defined so that it converges properly, and 2) adjust the error tolerance, although this does not have as large an effect on simulation time as I would have expected.
| atol | rtol | scipy time (s) | custom time (s) |
|---|---|---|---|
| 1e-6 | 1e-6 | 105.24 | 102.51 |
| 1e-6 | 2.22e-14 | 90.67 | 110.19 |
| 1e-3 | 2.22e-14 | 97.24 | 100.42 |
| 1e-3 | 1e-6 | 90.97 | 100.35 |
| 1e-3 | 1e-3 | 79.51 | 87.24 |
| 1e-2 | 1e-2 | 81.82 | 94.40 |
| 1e-1 | 2.22e-14 | 78.02 | 87.64 |
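For reference, `atol` and `rtol` here are the standard `solve_ivp` tolerances. On a simple smooth toy ODE, loosening them sharply reduces the work the adaptive step controller does, which is why it was surprising to see only a modest effect on the full simulation:

```python
from scipy.integrate import solve_ivp

# Simple smooth ODE: exponential decay y' = -y on [0, 10]
def rhs(t, y):
    return -y

tight = solve_ivp(rhs, (0.0, 10.0), [1.0], method='RK45', atol=1e-10, rtol=1e-10)
loose = solve_ivp(rhs, (0.0, 10.0), [1.0], method='RK45', atol=1e-3, rtol=1e-3)

# Looser tolerances let the step controller take far fewer steps
print("tight nfev:", tight.nfev)
print("loose nfev:", loose.nfev)
```

When tolerance changes barely move the wall-clock time (as in the table above), it suggests the cost is dominated by per-step overhead (rebuilding A and b, conductance updates) rather than by the number of steps.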