roryk / ipython-cluster-helper

Tool to easily start up an IPython cluster on different schedulers.

Feature idea: starcluster scheduler #21

Closed twiecki closed 8 years ago

twiecki commented 9 years ago

Starcluster is a really neat package for bringing up a cluster on EC2 with support for IPython parallel. ipython-cluster-helper can be launched inside an existing starcluster cluster, but it would be even cooler if ipython-cluster-helper could launch starcluster itself (as a scheduler).

alexbw commented 9 years ago

This would be really neat. Basically, on-demand clusters. Since picloud died, I don't think there's anything like this.

I'm not sure it would be that difficult either — you'd just have to spin up a cluster using the starcluster library with the IPython parallel plugin, and then grab a reference to the IPython parallel cluster using the auto-generated profile file. Then, when the computation finishes, kill the cluster. Since it can take a long time to spin up a cluster, possibly provide an option to persist the instances and cluster after the context finishes, but perhaps wipe any written data.
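The flow described above (start a cluster, get a handle to it, run the work, then tear it down unless asked to persist) could be sketched as a context manager. This is only an illustration under stated assumptions: `on_demand_cluster`, `start`, `connect`, `stop` and `persist` are hypothetical names, not part of starcluster or ipython-cluster-helper, and the stand-in callables below do no real cloud work. In practice `start` and `stop` would have to wrap whatever actually launches and terminates the instances.

```python
from contextlib import contextmanager


@contextmanager
def on_demand_cluster(start, connect, stop, persist=False):
    """Start a cluster, yield a client for it, and tear it down afterwards.

    start, connect and stop are hypothetical callables supplied by the
    caller; none of them is a starcluster or ipython-cluster-helper API.
    """
    handle = start()           # e.g. launch the instances (can take minutes)
    try:
        yield connect(handle)  # e.g. open a client from the generated profile
    finally:
        if not persist:
            stop(handle)       # terminate the cluster once the work is done


# Stand-in callables so the sketch runs without any cloud account.
events = []


def fake_start():
    events.append("start")
    return "demo-cluster"


def fake_connect(handle):
    events.append("connect " + handle)
    # Pretend "client": squares its inputs locally.
    return lambda xs: [x * x for x in xs]


def fake_stop(handle):
    events.append("stop " + handle)


with on_demand_cluster(fake_start, fake_connect, fake_stop) as run:
    squares = run([1, 2, 3])
```

Putting the teardown in `finally` means the cluster is stopped even if the computation raises, while `persist=True` skips the teardown so a slow-to-start cluster can be reused.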

chapmanb commented 9 years ago

Thomas and Alex; Thanks for all the suggestions. We've been doing some work with AWS and bcbio (which uses ipython-cluster-helper for all the parallelization) using elasticluster (https://github.com/gc3-uzh-ch/elasticluster), which presents a starcluster-like interface. The main advantage for us is that you don't need specific AMIs, so you can bootstrap everything from base Ubuntu/Linux AMIs. The approach we use is documented here if that's of interest: https://bcbio-nextgen.readthedocs.org/en/latest/contents/cloud.html

If either of you tackles this with starcluster it would be great to have docs/pointers on how to do it to make it easier for others. Thanks again.

twiecki commented 9 years ago

@chapmanb Thanks for your response. Actually, the main motivation for starcluster is to be able to spawn an EC2 cluster, so any solution to this problem would be a great feature. Are there plans to add an elasticluster interface to ipython-cluster-helper?

chapmanb commented 9 years ago

Thomas; Unfortunately that's a bit beyond our design goals with ipython-cluster-helper. This is a small bit of glue meant to provide a small abstraction on top of IPython parallel to support cluster specific options and lots of different cluster types through one interface.

It seems like bcbio-vm should do what you need if you want to spin up AWS clusters that have a Python pre-configured with IPython and ipython-cluster-helper. There is no need to use the bcbio-specific parts if you don't need them. You can start a cluster, and `~/install/bcbio-vm/anaconda/bin/python` will have IPython/ipython-cluster-helper pre-installed and ready to go. Hope this works for what you need.

roryk commented 8 years ago

Thanks for the comments @twiecki, hope you ended up getting something going. Cleaning out some old issues, so closing this. Feel free to reopen if Brad's suggestion didn't get you to where you needed to be.