ACEsuit / IPFitting.jl

Fitting of NBodyIPs

DistributedArrays.jl #17

Closed casv2 closed 2 years ago

casv2 commented 3 years ago

Collecting some findings here. DistributedArrays.jl seems to work quite nicely.

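using Distributed, DistributedArrays
addprocs(2)                          # start two worker processes
@everywhere using DistributedArrays  # load the package on every process
a = dzeros((3,4), workers())         # 3×4 zero matrix split across the workers
# One iteration per worker: each worker fills only the chunk it owns.
@sync @distributed for i = 1:nworkers()
    a_part = localpart(a)
    vec(a_part) .= (1:length(a_part)) .+ 1000*myid()
end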

I think we'll just have to create a new struct, LsqDB_dist or something similar, and then populate the 𝚿 matrix with a modified safe_append inside a distributed for loop like the one above. Perhaps also a different LsqDB() function for the call, without the option of saving.
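A minimal sketch of what such a container could look like, assuming the design matrix is split by rows across the workers. The struct layout, the constructor signature, and the `basis_value` helper are all hypothetical placeholders for illustration; none of them are taken from IPFitting, and the real code would call into the modified safe_append instead.

```julia
using Distributed
addprocs(2)
@everywhere using DistributedArrays

# Hypothetical container: the name follows the suggestion above,
# the single field is an assumption.
struct LsqDB_dist{T<:DArray}
    Ψ::T   # distributed design matrix, rows split across workers
end

# Hypothetical stand-in for evaluating basis function j on observation i.
@everywhere basis_value(i, j) = i + j / 100

function LsqDB_dist(nrows::Integer, nbasis::Integer)
    # The last argument requests an nworkers() × 1 grid of chunks,
    # so each worker owns a block of rows with all columns.
    Ψ = dzeros((nrows, nbasis), workers(), [nworkers(), 1])
    @sync for p in procs(Ψ)
        @spawnat p begin
            rows, cols = localindices(Ψ)   # global index ranges owned by this worker
            Ψloc = localpart(Ψ)
            for (kj, j) in enumerate(cols), (ki, i) in enumerate(rows)
                Ψloc[ki, kj] = basis_value(i, j)
            end
        end
    end
    return LsqDB_dist(Ψ)
end

db = LsqDB_dist(6, 4)
```

Each worker writes only into its own local block, so no data moves between processes while the matrix is filled. `localindices` is the name used by recent DistributedArrays.jl releases; older ones spelled it `localindexes`.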

cortner commented 3 years ago

I've been thinking about this, and I'm not entirely sure. If we start creating multiple backends, then I'm not sure we are using Julia dispatch correctly. What would be great is if everything could "just work" :).

But another perspective might be to drop the multi-threading entirely and move everything to distributed arrays. That way we have just one code path to maintain. I do wonder whether this might actually be the way to go. We could start by merging all your recent work into master, creating a new release, then creating a new branch and trying to rewrite everything with distributed arrays. What do you think?

casv2 commented 3 years ago

Sounds good, but I should clean up the solvers first. We tried quite a few, but some are not worth including in a new release.

casv2 commented 3 years ago

Also, I agree about perhaps moving everything to distributed arrays. The DistributedArrays.jl implementation above seems to work quite well, and I think it should not be too hard to rewrite IPFitting to use only distributed arrays.

However, it would require rewriting the read/write of the LsqDBs too. Should be possible, but I think we should keep this interface. Maybe we can keep this "old" interface for reading/writing, and have a separate distributed arrays implementation for massive fits.
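One way to picture keeping the "old" read/write interface next to a distributed fit is to convert at the boundary. The sketch below assumes the design matrix arrives as an ordinary in-memory `Matrix` from the existing read path (represented here by a random placeholder, not a real LsqDB), scatters it with `distribute`, and gathers it back with `Array` before writing.

```julia
using Distributed
addprocs(2)
@everywhere using DistributedArrays

# Placeholder for a design matrix loaded through the existing serial read path.
Ψ = rand(8, 3)

# Scatter blocks of rows across the workers for the fit.
Ψd = distribute(Ψ; procs = workers(), dist = [nworkers(), 1])

# Gather back into an ordinary Matrix before using the existing write path.
Ψback = Array(Ψd)
```

This only helps when the full matrix still fits in memory on one process; for the massive fits mentioned above, the matrix would have to be assembled directly on the workers.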

cortner commented 2 years ago

I think this is now in #23, so I'll close it. Please reopen if this is separate and both should be kept.