LupoLab / Luna.jl

Nonlinear optical pulse propagator
MIT License

Adding shared- and distributed-memory parallelization? #294

Open millerk123 opened 2 years ago

millerk123 commented 2 years ago

Hello,

We are interested in extending this software to include shared- and distributed-memory parallelization for execution of large problems on many CPUs. Before looking into this, I was wondering if you are aware of any serious roadblocks or challenges to doing this. Is it something that in principle should be doable?

Thanks, Kyle

chrisbrahms commented 2 years ago

Hi Kyle,

I don't think there are any serious roadblocks. For the specific case of multi-mode guided simulations, there is even the long-languishing pull request #159 which implements this. So far we haven't bothered much with parallelising individual simulations, because most of the time we want to run many smaller and completely independent propagations rather than a single huge one, and of course parallelising that through multiple processes is trivial. See here for example: https://github.com/LupoLab/Luna.jl/blob/6a45bbe98e66b37007b462139ef47315c83d2f39/src/Scans.jl#L62

When you say "large problem", what do you mean? Specifically, are we talking about guided or free-space geometry?
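
As a rough illustration of the "many independent propagations across multiple processes" approach mentioned above, here is a minimal sketch using Julia's `Distributed` standard library. It is not the code in `Scans.jl`: `run_propagation` is a hypothetical stand-in for a real Luna call, and the worker count and energy values are made up.

```julia
using Distributed

# Assumption: two local worker processes; use more on a larger machine.
nprocs() == 1 && addprocs(2)

# Hypothetical stand-in for one independent propagation. A real version
# would set up and run a Luna simulation for the given pulse energy.
@everywhere function run_propagation(energy)
    # Dummy workload so the sketch runs without Luna installed.
    return (energy = energy, output = sum(abs2, energy .* (1:1000)))
end

# Scan points are independent of each other (made-up energy values, in joules).
energies = range(50e-6, 200e-6; length = 8)

# pmap hands each scan point to a free worker and returns results in input order.
results = pmap(run_propagation, energies)
```

Because no data is shared between scan points, this pattern needs no synchronisation, which is why it is so much simpler than parallelising a single large propagation.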

millerk123 commented 2 years ago

Great to hear, and thanks for pointing me to that pull request.

For the large problems, we are interested in free-space propagation over multiple meters, in a situation where the power can be much larger than the critical power and harmonic generation will be important.

jtravs commented 2 years ago

That PR also parallelised free space propagation too. But it needs updating and testing.

chrisbrahms commented 2 years ago

> That PR also parallelised free space propagation too. But it needs updating and testing.

So it does. Sorry, been a while since I looked at that code 😁

jtravs commented 2 years ago

To be honest, this is a rather trivial amount of code to update. There were issues with the performance scaling of the free-space code, but since Julia introduced dynamic scheduling for threaded loops (see https://docs.julialang.org/en/v1/base/multi-threading/), the scaling may now be much better.
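
For readers unfamiliar with the scheduling option referred to here, the following is a minimal sketch of a threaded loop with the `:dynamic` schedule (available from Julia 1.8, where it is also the default). It is not taken from PR #159; `work` and `sizes` are hypothetical and only mimic a loop whose iterations have uneven cost.

```julia
using Base.Threads

# Hypothetical workload whose cost varies strongly from point to point.
work(n) = sum(sin, 1:n)

sizes = [10^3, 10^6, 10^4, 10^5, 10^6, 10^2, 10^5, 10^3]
out = zeros(length(sizes))

# :dynamic lets iterations be scheduled onto whichever threads are available,
# rather than fixing the split in advance as :static does.
@threads :dynamic for i in eachindex(sizes)
    out[i] = work(sizes[i])   # each iteration writes its own slot, so no locking
end

println("threads: ", nthreads(), ", results: ", length(out))
```

Julia has to be started with more than one thread (for example `julia -t 4`, or by setting `JULIA_NUM_THREADS`) for the loop to actually run in parallel. Whether this helps the free-space code in practice would need to be measured.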

millerk123 commented 2 years ago

Is the implementation in PR #159 for multi-threading or distributed computing?

jtravs commented 2 years ago

It is for multi-threading. It would need a bit of work for distributed computing, but that should be possible. However, the code is heavily FFT-dependent, and FFTs don't scale brilliantly across distributed memory.