pydoit / doit

CLI task management & automation tool
http://pydoit.org
MIT License

(RFE) support OpenPBS cluster schedulers #172

Open ankostis opened 7 years ago

ankostis commented 7 years ago

For big-data work it would really help if doit supported the submission of tasks through cluster schedulers like PBS, SLURM, Torque, Sun Grid Engine (SGE) and other qsub-like services.

For a similar feature, I would look at SnakeMake.

Since I've recently been granted access to a Torque cluster, I may provide some ideas on this issue in the future.

goerz commented 7 years ago

I feel this might push pydoit a little far outside its core focus. Moreover, actions in pydoit can be arbitrary python callables, which can't be directly pushed to a cluster engine.

Instead, this functionality could be covered by a separate package. I've written clusterjob, which provides a Python wrapper around a whole bunch of schedulers (including PBS and Slurm). I could easily see a pydoit job constructing and submitting a JobScript. I might actually play around with that in the near future. Sorry the clusterjob documentation isn't as complete with examples yet as it should be, but the latest version of the package from github should let you easily create scheduled jobs, if you want to give it a try.
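To make the idea of a pydoit task constructing and submitting a job script concrete, here is a minimal sketch of a `dodo.py` that does not use clusterjob at all: it writes a plain PBS script and hands it to `qsub`. The file names, the `./simulate` program, the resource values, and the helper `render_pbs_script` are all hypothetical, and it assumes a PBS/Torque `qsub` that accepts `-W block=true` to wait for the job to finish (SGE, for instance, uses a different flag).

```python
# dodo.py: hypothetical sketch, not clusterjob's or doit's own integration.
from pathlib import Path

JOB_SCRIPT = Path("simulate.pbs")  # hypothetical job script path
RESULT = Path("result.dat")        # hypothetical file produced by the job


def render_pbs_script(command, jobname="doit_job", walltime="00:10:00"):
    """Return the text of a minimal PBS job script that runs `command`."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {jobname}",
        f"#PBS -l walltime={walltime}",
        'cd "$PBS_O_WORKDIR"',  # PBS sets this to the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


def write_job_script(targets):
    """Python action; doit passes the task's targets as `targets`."""
    script = render_pbs_script(f"./simulate > {RESULT}")
    Path(targets[0]).write_text(script)


def task_write_script():
    """Generate the PBS job script."""
    return {
        "actions": [write_job_script],
        "targets": [str(JOB_SCRIPT)],
    }


def task_submit():
    """Submit the script; assumes a qsub that supports `-W block=true`."""
    return {
        # A cmd-action: only a shell command reaches the cluster,
        # never a Python callable.
        "actions": [f"qsub -W block=true {JOB_SCRIPT}"],
        "file_dep": [str(JOB_SCRIPT)],
        "targets": [str(RESULT)],
    }
```

Because the submit step is a blocking shell command, doit only marks the task as done once the job has finished, and it resubmits only when the content of the job script changes. Anything that must run on the cluster has to be expressed as a script, which is the limitation on Python callables mentioned above.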

schettino72 commented 7 years ago

I am not familiar with how clusters work. This is not the first time the topic has been raised...

doit has a pluggable "task runner", see for example how multiprocessing works. So I guess it should be doable.

@goerz thanks. Let me know if you need any help on the doit side to integrate with your tool.

goerz commented 7 years ago

So I played around with this during the last week, and it seems to work pretty well. I've added a writeup to the clusterjob documentation: http://clusterjob.readthedocs.io/en/latest/pydoit_pipeline.html

This is for the latest dev-version of clusterjob, so you'd have to install it via

pip install git+https://github.com/goerz/clusterjob.git@develop#egg=clusterjob

for any of this to work.

schettino72 commented 7 years ago

@goerz cool, thanks. I took a quick look, but I would need to play around with it, as I have no experience working with clusters... I guess some parts of the example could be extracted and provided as helper functions to make it easier for end users to run tasks on clusters.

It would be nice if you added a section to the doit docs pointing to your tool and its related docs.