pgiri / dispy

Distributed and Parallel Computing Framework with / for Python
https://dispy.org

Distribute individual scripts to cluster #186

Open fab6 opened 5 years ago

fab6 commented 5 years ago

Hi, I have a question about the suitability of dispy for my task:

For optimization work I am using dakota, which has different options for handling parallel optimization. With my current approach it creates a given number of scripts, each with different parameters. Each created Python script then runs an external simulation program. The optimizer waits until all calculations are finished, finds a new set of parameters, and creates new scripts. This works fine on a local machine.

Now I would like to distribute the running of each external program on the cluster.

So basically the structure looks like this for the scripts with the external commands:

./designs/script1
./designs/script2
...
./designs/scriptN

e.g. with available cores given by a machine file:

node01 cpu=12
node02 cpu=12

Within each script1 to scriptN there should be a function which looks for free resources and runs the system command on the available nodes.

Is dispy suited for this at all? I could imagine that the client/server setup might work for this. Thanks! Fabian

pgiri commented 5 years ago

You can sub-class DispyNodeAllocate to allocate one CPU per node (or start dispynode with --cpus 1) so dispy will schedule at most one job even if a node has many CPUs (I assume that is what you want).

You can also send scripts with the above (with the cluster.send_file method). Alternatively, you can use cluster_callback to handle DispyNode.Initialized, where files can be sent to that node.
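
For concreteness, a minimal sketch of the second suggestion (assuming cluster_callback refers to the cluster_status argument of JobCluster, and with hypothetical script paths); the --cpus 1 route needs no client code, just starting dispynode.py with --cpus 1 on each node:

import dispy

def compute(path):
    # placeholder computation; a real one would run the script at 'path'
    return path

def status_cb(status, node, job):
    # called by dispy whenever a node's or job's status changes; when a node
    # has been initialized, push the script(s) it needs (path is hypothetical)
    if status == dispy.DispyNode.Initialized:
        cluster.send_file('/path/to/script1', node)

if __name__ == '__main__':
    cluster = dispy.JobCluster(compute, cluster_status=status_cb)
    job = cluster.submit('/path/to/script1')
    job()            # wait for the job and fetch its result
    cluster.close()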

fab6 commented 5 years ago

Hi,

thank you for the quick help! With your comments I looked a bit deeper into the documentation, which helped a lot.

Actually, I need to allocate several separate scripts to one node, but with multiple cores, e.g. when I have the following nodes (each with 12 cores):

node01 cpu=12
node02 cpu=12

I would like to run 24 separate scripts at the start; once these are finished I will start 24 new ones based on information from the first calculations, and the new ones will not need to wait for all 24 scripts to be finished. I expect the dispy server to handle the distribution of the new scripts.

My understanding now is that I need to launch dispynode.py or dispyscheduler.py with the names of the nodes on my workstation.

In my computing scripts above I would basically add a "JobCluster", similar to:

import dispy
cluster = dispy.JobCluster('/path/to/my scripts or function within python')
for i in range(50):
    cluster.submit(i)

Now, I am not sure if this is the right path, especially as it did not work out yet... e.g. one of the problems seems to be that a port is blocked.

The sending and allocating might not be needed in my case, as the directory is mounted on all nodes and my workstation; I forgot to mention this.

Thank you!
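
PS: for reference, this is roughly what I have in mind, assuming dispynode.py is started on each compute node and the client limits the cluster to those nodes with the nodes argument (the computation here is just a placeholder):

# on each compute node (node01, node02), start the node daemon, e.g.:
#   dispynode.py --cpus 12
# then, on the workstation:
import dispy

def compute(path):
    # placeholder; the real job would run the script at 'path'
    return path

if __name__ == '__main__':
    cluster = dispy.JobCluster(compute, nodes=['node01', 'node02'])
    jobs = [cluster.submit('/path/to/script%s' % i) for i in range(24)]
    for job in jobs:
        job()        # waits for that job and fetches its result
    cluster.close()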

pgiri commented 5 years ago

I don't think I understand your question on how to implement this, but my guess is you need to execute each job with a different script? If so:

def compute(path):
    # 'process_file' stands for whatever processes or runs the script at 'path' on the node
    return process_file(path)

if __name__ == '__main__':
    import dispy
    cluster = dispy.JobCluster(compute)
    for i in range(50):
        cluster.submit('/path/to/script%s' % i)
    cluster.wait()
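
Since the generated scripts ultimately launch an external simulation program and the directory is mounted on all nodes, process_file could simply shell out to the script; a minimal sketch along those lines (the interpreter name and paths are assumptions):

def compute(path):
    # import inside the function so it is available when the function runs on a node;
    # the shared mount means 'path' is valid on every node
    import subprocess
    return subprocess.call(['python', path])

if __name__ == '__main__':
    import dispy
    cluster = dispy.JobCluster(compute)
    jobs = [cluster.submit('/path/to/script%s' % i) for i in range(24)]
    for job in jobs:
        print(job())   # job() waits for that job and returns the script's exit code
    cluster.close()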