sailfish-team / sailfish

Lattice Boltzmann (LBM) simulation package for GPUs (CUDA, OpenCL)
http://sailfish.us.edu.pl

Add MPI Support for distributed simulation #22

Open · Mokosha opened this issue 11 years ago

Mokosha commented 11 years ago

I've just downloaded the project and run some of the examples. Looks good!

I'm interested in figuring out whether it's feasible to support message passing (MPI) in order to set up distributed simulations. I couldn't find any way to get in touch with you other than opening an issue, so if there's a better channel for this conversation, let me know. =)

From my (limited) investigation of the docs and code, it seems that distributed simulation works by having two nodes connect to each other over the network. At our institution we have a supercomputer whose nodes communicate via MPI. There is a Python project that exposes MPI from Python (http://pympi.sourceforge.net/), but I'm not sure of the best way to go about integrating it into sailfish. Roughly, I mean the kind of exchange sketched below.
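To make the idea concrete, here is a minimal sketch of the kind of point-to-point exchange I have in mind. I'm using mpi4py purely as a stand-in for pyMPI, and the one-cell "ghost layer" is a placeholder for whatever boundary data an LBM subdomain would actually swap:

```python
# Minimal two-rank boundary exchange with mpi4py (illustrative only;
# the arrays are dummies standing in for LBM distribution data).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank owns a 1D slab of the domain.
local = np.full(16, float(rank))
ghost = np.empty(1)  # receive buffer for the neighbor's boundary layer

if rank == 0:
    comm.Send(local[-1:], dest=1, tag=0)  # send rightmost layer to rank 1
    comm.Recv(ghost, source=1, tag=1)     # receive rank 1's leftmost layer
elif rank == 1:
    comm.Recv(ghost, source=0, tag=0)
    comm.Send(local[:1], dest=0, tag=1)

print("rank %d received ghost layer %s" % (rank, ghost))
```

Run with something like `mpirun -n 2 python exchange.py`.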

Any thoughts?

mjanusz commented 10 years ago

Somehow this escaped my attention when it was filed, sorry about that :(

While Sailfish does not integrate with MPI directly, it does support distributed simulations, as you noted; node-to-node communication is handled internally using zmq.

Does your supercomputer have some sort of batch queueing system, such as PBS or LSF? If so, Sailfish can take advantage of it to start jobs on multiple nodes. If not, and if you have ssh access to all the nodes, an alternative that should work with the current version of the code is to build a cluster specification file manually (see the sketch below). If none of the above works and this is still of interest, we could also look into adding some sort of MPI support.
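Off the top of my head, a manual cluster specification is just a small Python file defining a `cluster` list; a two-node example might look something like the following (double-check the `MachineSpec` signature in sailfish/config.py before relying on it):

```python
# Hypothetical cluster spec file, e.g. my_cluster.py.
# MachineSpec and its exact arguments should be verified against
# sailfish/config.py in your checkout.
from sailfish.config import MachineSpec

cluster = [
    # ssh spec for spawning, address the node is reachable at, GPU list.
    MachineSpec('ssh=user@node1', 'node1', gpus=[0]),
    MachineSpec('ssh=user@node2', 'node2', gpus=[0, 1]),
]
```

You would then point the simulation at this file when launching it (see the docs for the exact option name).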

The best way to contact people about the project is usually via the sailfish-cfd mailing list (https://groups.google.com/forum/#!forum/sailfish-cfd).