ewels / clusterflow

A pipelining tool to automate and standardise bioinformatics analyses on cluster environments.
https://ewels.github.io/clusterflow/
GNU General Public License v3.0

Support for pymodules? #9

Closed (adrianreich closed this issue 9 years ago)

adrianreich commented 9 years ago

Hello,

I administer a SLURM cluster at a small research institute and I would love to use clusterflow to start standardizing our pipeline runs. That said, we use pymodules (https://web3.ccv.brown.edu/mhowison/pymodules/) instead of modules to manage our software. As far as I can gather, the syntax is nearly identical, but there are some additional benefits to pymodules that I would hate to give up. Do you think that support for pymodules could be added? Thank you very much.

Sincerely, Adrian Reich

ewels commented 9 years ago

Hi Adrian,

I've not heard of pymodules before - looks nice!

Cluster Flow uses environment modules by loading the Perl code required to update the environment with the command modulecmd perl load $mod (see code). This uses a seemingly little-known feature of environment modules: it can output Perl and Python commands as well as the usual bash commands (modulecmd [lang] load [mod]).
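For illustration, here is a minimal Python sketch of the same idea (Cluster Flow itself does this in Perl). It assumes that `modulecmd python load <mod>` prints Python source that assigns to `os.environ`; the function names `apply_python_output` and `load_module` are made up for this example and are not part of Cluster Flow or environment modules.

```python
import os
import subprocess


def apply_python_output(source):
    """Execute Python source emitted by modulecmd so it updates os.environ.

    Assumption: the emitted code only needs `os` to be available.
    Only do this with output from a tool you trust, since it is exec'd.
    """
    exec(source, {"os": os})


def load_module(name, modulecmd="modulecmd"):
    """Run `modulecmd python load <name>` and apply the result in-process."""
    result = subprocess.run(
        [modulecmd, "python", "load", name],
        capture_output=True,
        text=True,
        check=True,
    )
    # The changes land in this process's environment, so any jobs
    # launched from here afterwards inherit them.
    apply_python_output(result.stdout)
```

Calling `load_module("some_tool")` on a system with environment modules installed would then behave like `module load some_tool`, but for the running Python process rather than for a shell.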

Running the normal bash command directly from Perl doesn't work: the command runs in a child shell, so it changes that shell's environment and not the environment of the Perl script, which is already running. As it's the Perl script that fires the cluster jobs, these don't get the modified path either. You have to modify the environment within the Perl script itself. Took me a while to figure this out :)
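The same behaviour can be reproduced in Python as a small sketch. It assumes a POSIX `sh` is available; the variable name `CF_DEMO_TOOL_HOME` and both helper functions are hypothetical and exist only for this demonstration.

```python
import os
import shlex
import subprocess
import sys


def export_in_subshell(var, value):
    """Export a variable inside a child shell; the change dies with that shell."""
    subprocess.run(["sh", "-c", f"export {var}={shlex.quote(value)}"], check=True)


def value_in_child(var):
    """Return the value of `var` as seen by a freshly spawned child process."""
    code = f"import os; print(os.environ.get({var!r}, ''))"
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    var = "CF_DEMO_TOOL_HOME"  # hypothetical variable name
    os.environ.pop(var, None)

    # Exporting in a subshell leaves this process, and its later children, unchanged.
    export_in_subshell(var, "/opt/tool")
    print("after subshell export:", repr(value_in_child(var)))

    # Changing this process's own environment is inherited by later children.
    os.environ[var] = "/opt/tool"
    print("after direct update:", repr(value_in_child(var)))
```

The second case is the one that matters for a job launcher: whatever process submits the cluster jobs has to carry the modified environment itself.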

Anyway, in the short term, you can simply disable Cluster Flow's support of the environment module system by including @ignore_modules in your config file (see the docs). This will stop CF from complaining, but means that you'll have to make sure that all of the required programs are added to your environment path before firing off Cluster Flow - in other words, you need to use pymodules manually before you use CF. If you make sure that everything you need is being loaded within your ~/.bashrc script then you need not worry about this I guess.

In the longer term, we need a fairly in-depth look at how pymodules works. Looking at its source code, it looks like it collects the required paths and versions, adds these to an environment object and then finally prints this out as a bash command. This is exactly analogous to how module commands work. However, there is no support for Perl or Python. It could be possible to run pymodules from within Cluster Flow and pull out what is needed from this bash string - this is actually what I did before I found out about the Perl / Python output options of modulecmd (see code). This wasn't great though and didn't always work.
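A rough sketch of the "pull out what is needed from the bash string" approach, written in Python. The exact output format of pymodules is not shown in this thread, so the input format here (simple `export NAME=value;` or `NAME=value; export NAME;` statements) is an assumption, and `parse_shell_assignments` is a hypothetical helper. It also hints at why this approach is fragile: anything beyond plain assignments is silently ignored or mishandled.

```python
import shlex


def parse_shell_assignments(script):
    """Collect NAME=value pairs from simple shell assignment statements.

    Limitations of this sketch: no variable expansion, no command
    substitution, and a value that legitimately ends in ';' loses it.
    """
    updates = {}
    for token in shlex.split(script):
        token = token.rstrip(";")
        name, sep, value = token.partition("=")
        if sep and name.isidentifier():
            updates[name] = value
    return updates


def apply_to_environment(script, env):
    """Apply the parsed assignments to a mapping such as os.environ."""
    env.update(parse_shell_assignments(script))
    return env
```

For example, `apply_to_environment(output, os.environ)` would copy the assignments found in `output` into the current process, after which child jobs inherit them.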

It might be nicer to add Perl (and/or Python) as new supported shell types within the pymodules code. This probably won't be super easy, but should be entirely doable. It looks like compatibility is already built in for both bash and csh, so it'll just be a case of extending this.
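To make the idea concrete, here is a hedged sketch of what "a new supported shell type" could look like: one set of environment updates rendered in several output syntaxes. This is not pymodules' actual code or structure; `FORMATTERS`, `render_updates` and the quoting helper are hypothetical names, and the csh quoting in particular is simplified.

```python
import shlex


def _perl_quote(value):
    """Quote a string as a Perl single-quoted literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


# One formatter per output syntax; each turns (name, value) into a statement.
FORMATTERS = {
    "bash": lambda name, value: f"export {name}={shlex.quote(value)};",
    "csh": lambda name, value: f"setenv {name} {shlex.quote(value)};",
    "python": lambda name, value: f"os.environ[{name!r}] = {value!r}",
    "perl": lambda name, value: f"$ENV{{{_perl_quote(name)}}} = {_perl_quote(value)};",
}


def render_updates(updates, shell):
    """Render a dict of environment updates as code for the given shell type."""
    try:
        formatter = FORMATTERS[shell]
    except KeyError:
        raise ValueError(f"unsupported shell type: {shell}") from None
    return "\n".join(formatter(name, value) for name, value in updates.items())
```

With something like this in place, a caller such as Cluster Flow could ask for the `perl` rendering and evaluate it directly, in the same way it already does with the output of modulecmd.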

I hope all of this makes sense! Would you be interested in extending Pymodules? I'm not very excited about reintroducing my hacky code to Cluster Flow if I'm honest, though of course you're welcome to modify your own version.

Phil

adrianreich commented 9 years ago

Thank you very much for the very informative reply. I did not get it all but I did get the gist of it. I am going to contact the people who wrote pymodules to see if this is something that they would be interested in doing. If not, I would be fine with having users run a quick script to make sure that all of the dependencies are loaded prior to firing up CF. Thank you again.

Also, please keep releasing the videos on YouTube; they are very well done, and they are how I discovered the power of a number of your projects, including Labrador and FastQC.

EDIT: I opened an issue over at pymodules here: https://bitbucket.org/mhowison/pymodules/issue/5/ability-to-modify-environment-of-python

ewels commented 9 years ago

Great, let me know how you get on. Glad you like the videos too; they're a bit of a pain to make, but I agree, a (moving) picture is worth a thousand words!

Cheers,

Phil