Closed peterjc closed 9 years ago
+1
On Fri, 18 Sep 2015 at 13:39, Peter Cock notifications@github.com wrote:
Related to #19 https://github.com/galaxyproject/planemo/issues/19 (testing tool_dependencies.xml without a tool shed), I would like to be able to run an install recipe from a tool_dependencies.xml file locally and/or turn it into a simple shell script for the current platform.
(The platform specific actions could be turned into bash if statements if preferred)
This seems to overlap with https://github.com/jmchilton/shed2tap
e.g. https://github.com/peterjc/pico_galaxy/blob/master/tools/effectiveT3/tool_dependencies.xml
<?xml version="1.0"?>
$INSTALL_DIR
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_GUI-1.0.1.jar
$INSTALL_DIR/module
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_ANIMAL-1.0.1.jar
$INSTALL_DIR/module/
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_PLANT-1.0.1.jar
$INSTALL_DIR/module/
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_STD-1.0.1.jar
$INSTALL_DIR/module/
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_STD-2.0.1.jar
$INSTALL_DIR/module/
Downloads effectiveT3 v1.0.1 and the three models from http://effectors.org/ aka http://effectors.csb.univie.ac.at/
Would become something like this:
#!/bin/bash
# Housekeeping: strict bash mode, etc.
set -euo pipefail
# TODO - move to a temp dir, check $INSTALL_DIR is set and exists
# Start of conversion from XML recipe:
echo "Installing effectiveT3 version 1.0.1"
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_GUI-1.0.1.jar
mv TTSS_GUI-1.0.1.jar $INSTALL_DIR/
mkdir $INSTALL_DIR/module
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_ANIMAL-1.0.1.jar
mv TTSS_ANIMAL-1.0.1.jar $INSTALL_DIR/module/
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_PLANT-1.0.1.jar
mv TTSS_PLANT-1.0.1.jar $INSTALL_DIR/module/
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_STD-1.0.1.jar
mv TTSS_STD-1.0.1.jar $INSTALL_DIR/module/
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_STD-2.0.1.jar
mv TTSS_STD-2.0.1.jar $INSTALL_DIR/module/
I would then be able to run this within TravisCI with the advantages that the install recipe is not repeated (tool_dependencies.xml and .travis.yml) and moreover I would actually be able to test tool_dependencies.xml, e.g. peterjc/pico_galaxy@243311c https://github.com/peterjc/pico_galaxy/commit/243311cc50ab2675c5e6aa42524841e60e6602e8
Alternatively there is the shed2tap code, if installing from a brew recipe would be acceptable.
ping @davebx; as far as I know he was looking at this already. We had this idea some months ago to make migration to brew (or whatever we end up using) easier.
I'm out of time now, but having spent some time this afternoon hacking https://github.com/jmchilton/shed2tap I think I can turn @jmchilton's Action.to_ruby() method into something that produces a bash script.
That might be enough for a standalone tool, or a new planemo command, but I am waiting to hear from @davebx etc. about how best to proceed to avoid duplication of effort.
Just a heads up (maybe way too late): the newest shed2tap code is actually in planemo itself, at https://github.com/galaxyproject/planemo/blob/master/planemo/shed2tap/base.py
Thanks @jmchilton. https://github.com/jmchilton/shed2tap has an extensive to_ruby() method on the base Action class (essentially a large switch statement), but there is nothing similar in https://github.com/galaxyproject/planemo/blob/master/planemo/shed2tap/base.py, which instead has a far more complete hierarchy of Action subclasses. I would think adding small to_ruby() or to_bash() methods to each Action subclass would make sense here?
This might be the most updated thing I was working on... https://github.com/jmchilton/planemo/commit/52f78665d7c2eece73fdcce60a9294638856bf86. https://github.com/jmchilton/planemo/commits/shed2tap
Whatever you get working is fine. My code is sprawled all over it seems and that is my own fault so I will adapt it to whatever you get into planemo :).
I was thinking of implementing a visitor pattern for the ruby/bash conversion, but to_bash or to_ruby will be fine also.
My first attempt is using to_bash on the action classes...
My first example for effectiveT3 seems to work, but that is a simple tool_dependencies.xml file, perhaps unusually simple.
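To make the per-class to_bash idea concrete, here is a minimal sketch. The class names, constructors and the actions_to_script helper are all hypothetical stand-ins, not the actual classes in planemo/shed2tap/base.py; the only point is that each Action subclass returns its own bash lines and unsupported actions fail loudly.

```python
class Action:
    """Hypothetical base class (not planemo's real Action)."""

    def to_bash(self):
        # Subclasses without a conversion fail loudly rather than silently.
        raise NotImplementedError("No to_bash defined for %s" % type(self).__name__)


class ShellCommandAction(Action):
    def __init__(self, command):
        self.command = command

    def to_bash(self):
        # A shell command is already bash, so pass it through unchanged.
        return [self.command]


class MakeDirectoryAction(Action):
    def __init__(self, directory):
        self.directory = directory

    def to_bash(self):
        # Double quotes so that $INSTALL_DIR is still expanded by bash.
        return ['mkdir -p "%s"' % self.directory]


class MoveFileAction(Action):
    def __init__(self, source, destination):
        self.source = source
        self.destination = destination

    def to_bash(self):
        return ['mv "%s" "%s"' % (self.source, self.destination)]


def actions_to_script(actions):
    """Concatenate the bash lines of each action under a strict-mode header."""
    lines = ["#!/bin/bash", "set -euo pipefail"]
    for action in actions:
        lines.extend(action.to_bash())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(actions_to_script([
        ShellCommandAction("wget http://example.org/tool.jar"),
        MakeDirectoryAction("$INSTALL_DIR/module"),
        MoveFileAction("tool.jar", "$INSTALL_DIR/module/"),
    ]))
```

A visitor would keep the bash and ruby logic out of the action classes; the per-class method is simply less code for a first attempt.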
The action type download_by_url and friends are proving tricky. The problem is the Galaxy magic in lib/tool_shed/galaxy_install/tool_dependencies/recipe/step_handler.py, class CompressedFile, where the .extract method works out the common prefix of a tarball's contents in order to change into that directory, e.g.
<action type="download_by_url">ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.30/ncbi-blast-2.2.30+-x64-linux.tar.gz</action>
should become:
$ wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.30/ncbi-blast-2.2.30+-x64-linux.tar.gz
$ tar -zxvf ncbi-blast-2.2.30+-x64-linux.tar.gz
$ cd ncbi-blast-2.2.30+
I'm almost wondering if something like this would be simplest (which can call the same Galaxy code):
$ planemo download_by_url ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.30/ncbi-blast-2.2.30+-x64-linux.tar.gz
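For illustration, the "which folder would Galaxy change into" question can be approximated with the standard tarfile module. This is only a sketch of the idea and not Galaxy's CompressedFile.extract logic, which may differ in its details; the function name common_top_level_dir is made up here.

```python
import posixpath
import tarfile


def common_top_level_dir(archive_path):
    """Return the single top-level directory shared by all members, else None.

    Approximation only: Galaxy's own CompressedFile.extract may decide differently.
    """
    with tarfile.open(archive_path) as tar:
        members = tar.getmembers()
    top_levels = set()
    top_is_directory = False
    for member in members:
        # Tar member names always use "/" separators; drop "./" prefixes.
        parts = [p for p in posixpath.normpath(member.name).split("/")
                 if p not in ("", ".")]
        if not parts:
            continue
        top_levels.add(parts[0])
        # Something nested below the top level, or an explicit directory entry,
        # shows that the top-level name is a folder rather than a lone file.
        if len(parts) > 1 or member.isdir():
            top_is_directory = True
    if len(top_levels) == 1 and top_is_directory:
        return top_levels.pop()
    return None
```

For a tarball like the BLAST+ one above, where every member lives under one folder, the returned name is what the generated script would cd into; for a tarball with several top-level entries the function returns None and no cd is emitted.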
Another niggle, consider a recipe which boils down to something like this:
#!/bin/bash
#downloads stuff, then sets a new environment variable like NEW_TOOL,
#or edits an existing environment variable like PATH
export NEW_TOOL=/some/path
If executed directly like ./example.sh or bash example.sh then we lose access to the environment variable $NEW_TOOL. Alternatively source example.sh or . example.sh work in terms of exposing the new/changed environment variables, but they make exit (or failures if using strict bash mode with set -euo pipefail or similar) terminate the user's shell session.
I think this means we need to turn the tool_dependencies.xml file(s) into an install.sh file (or similarly named file) to be run once, plus a second shell script which only sets the environment variables, to be run via source prior to running the tool tests via the dependency mechanisms. Galaxy calls those env.sh, doesn't it?
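A rough sketch of that split, assuming the recipe has already been reduced to a list of (kind, payload) tuples. That tuple format and the split_recipe name are hypothetical and only serve the example:

```python
def split_recipe(steps):
    """Split recipe steps into (install_script, env_script) text.

    steps is a hypothetical list of (kind, payload) tuples where kind is
    "set_environment" (payload: (name, value)) or anything else
    (payload: a ready-made bash command line).
    """
    # install.sh is executed once, so strict mode and exits are safe there.
    install_lines = ["#!/bin/bash", "set -euo pipefail"]
    # env.sh is meant to be sourced: no "set -e" and no "exit", so a problem
    # cannot terminate the user's shell session.
    env_lines = []
    for kind, payload in steps:
        if kind == "set_environment":
            name, value = payload
            env_lines.append('export %s="%s"' % (name, value))
        else:
            install_lines.append(payload)
    return "\n".join(install_lines) + "\n", "\n".join(env_lines) + "\n"


if __name__ == "__main__":
    install_sh, env_sh = split_recipe([
        ("shell_command", "wget http://example.org/tool.jar"),
        ("set_environment", ("NEW_TOOL", "/some/path")),
    ])
    print(install_sh)
    print(env_sh)
```

Keeping env.sh free of anything but export lines is what makes it safe to source before the tests.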
@peterjc I like your second idea. And yes, Galaxy sources the env.sh files before executing a tool.
Thanks for working on this. I think this will make so many things easier.
I'm increasingly finding I am reimplementing things already in the Galaxy Tool Shed code (with the risk of potentially interpreting the XML recipe slightly differently, which on the plus side could highlight some ambiguities in the recipe format).
e.g. turning the environment variable actions into env.sh entries is done in https://github.com/galaxyproject/galaxy/blob/dev/lib/tool_shed/galaxy_install/tool_dependencies/recipe/env_file_builder.py
Planemo already bundles part of the Galaxy python library under planemo_ext/ so might adding planemo_ext/tool_shed/galaxy_install/tool_dependencies/recipe/env_file_builder.py etc. be a practical way forward?
@peterjc That directory aims to be a subset of Galaxy's code base, so feel free to bring stuff over. The stuff should be sufficiently isolated though. galaxy.util also isn't yet a true subset, so be careful about that as well.
I have a plan for download_by_url and similar action types, consistent with producing as simple a bash script as possible.
While generating the bash script, I will download the file (to a temp directory by default) where I can examine it to determine how to decompress it and what folder (if any) Galaxy would automatically change into. This might use a bundled copy of lib/tool_shed/galaxy_install/tool_dependencies/recipe/step_handler.py.
To avoid the overheads and waste of repeated downloads, the key information (decompression method and folder to change into) can be cached. I am planning to use the MD5 hash of the URL as the key. e.g. "~/.planemo/dependency_downloads/%s.json" % md5(url)
In the context of continuous integration with TravisCI, I plan to re-use the cached downloaded files. i.e. include an if statement to link to the cached file if present.
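As a sketch of the caching plan above: the directory and the "%s.json" naming come from the example path, while the function names and the "decompress"/"chdir" keys in the usage below are assumptions made up for illustration.

```python
import hashlib
import json
import os

# Default location taken from the example path above.
DEFAULT_CACHE_DIR = os.path.expanduser("~/.planemo/dependency_downloads")


def cache_path(url, cache_dir=DEFAULT_CACHE_DIR):
    """Path of the JSON file for this URL, keyed by the MD5 hash of the URL."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, "%s.json" % digest)


def load_cached_info(url, cache_dir=DEFAULT_CACHE_DIR):
    """Return the cached dict for this URL, or None if nothing is cached yet."""
    path = cache_path(url, cache_dir)
    if not os.path.isfile(path):
        return None
    with open(path) as handle:
        return json.load(handle)


def save_cached_info(url, info, cache_dir=DEFAULT_CACHE_DIR):
    """Record what was learned about the download for later runs."""
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path(url, cache_dir), "w") as handle:
        json.dump(info, handle)
```

Usage would be along the lines of save_cached_info(url, {"decompress": "tar -zxvf", "chdir": "ncbi-blast-2.2.30+"}) after the first download, then load_cached_info(url) on later runs to skip the download entirely. The MD5 here is only a filename-safe key for the URL, not a checksum of the downloaded file.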
I have a working prototype planemo depbash command here: https://github.com/peterjc/planemo/tree/depbash
This is hard coded to produce a single file dep_install.sh and matching env.sh combining all the tool_dependencies.xml files processed (you can recurse over a folder), which all will use $INSTALL_DIR as their destination (which is a problem if you have name clashes between tool binaries, e.g. multiple versions of BLAST+).
Right now it uses a single flat folder $DOWNLOAD_CACHE (defaulting to ./download_cache) to cache downloads (nothing clever with checksums), so that the decompression and folder structure can be determined while generating the shell script. The dep_install.sh will also use this cache so that in a continuous integration setup the file is only fetched once.
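The kind of bash that dep_install.sh could contain for each download might be generated like this. It is a sketch only, not the code in the depbash branch: the download_lines name is made up, and it copies from the cache rather than linking to keep things simple.

```python
import posixpath
from urllib.parse import urlparse


def download_lines(url):
    """Bash lines which fetch url into $DOWNLOAD_CACHE only if it is missing."""
    # Use the last path component of the URL as the cached filename.
    filename = posixpath.basename(urlparse(url).path)
    cached = '"$DOWNLOAD_CACHE/%s"' % filename
    return [
        "if [ ! -f %s ]; then" % cached,
        '    wget -O %s "%s"' % (cached, url),
        "fi",
        # Copy rather than link, so a relative $DOWNLOAD_CACHE still works.
        "cp %s ." % cached,
    ]


if __name__ == "__main__":
    print("\n".join(download_lines(
        "ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.30/"
        "ncbi-blast-2.2.30+-x64-linux.tar.gz")))
```

With a flat cache folder, two URLs ending in the same filename would clash, which is one reason to key the cache on something derived from the whole URL instead.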
Example usage assuming you don't have to worry about multiple tools clashing:
$ planemo depbash -r ~/my_tools/
$ bash dep_install.sh
$ source env.sh
$ planemo test -r ~/my_tools
Note this does nothing about resolving dependencies!
This is able to parse all my tool_dependencies.xml files in https://github.com/peterjc/galaxy_blast , https://github.com/peterjc/pico_galaxy and https://github.com/peterjc/galaxy_mira
Not all the actions are supported yet, e.g.
$ planemo depbash --fail_fast -r ../tools-iuc ../tools-devteam/ ; echo "Returned $?"
...
Processing requirements from /mnt/galaxy/repositories/tools-iuc/packages/package_abyss_1_9_0/tool_dependencies.xml
Downloading https://github.com/bcgsc/abyss/releases/download/1.9.0/abyss-1.9.0.tar.gz
Error processing /mnt/galaxy/repositories/tools-iuc/packages/package_abyss_1_9_0/tool_dependencies.xml - No to_bash defined for Action[type=set_environment_for_install]
...
Error processing one or more tool_dependencies.xml files.
Returned 1
@peterjc awesome!
Successful TravisCI usage with galaxy_mira to install MIRA 3.4.1.1, 4.0.2 and 4.9.5 via planemo depbash rather than a manual install recipe:
https://github.com/peterjc/galaxy_mira/commit/b71e8a49cf06c173916f408f2077a59ad7b003c5 https://travis-ci.org/peterjc/galaxy_mira/builds/82117820
In the above, TravisCI ran no tests as nothing had changed compared to the Test Tool Shed. Here is the following test run, where I requested that all the tests be run (magic keyword in the git commit):
https://github.com/peterjc/galaxy_mira/commit/70cad4e64a7f38bf6bfc3177bedafd504719b083 https://travis-ci.org/peterjc/galaxy_mira/builds/82123804
See also #7 where I described the planemo + TravisCI approach I'm trying on this galaxy_mira branch.
Should we leave this open for finishing some of the missing functionality as of https://github.com/galaxyproject/planemo/commit/f798c7e29b2276ce68b828e72fc6a6460c73792b or file separate issues?
Whichever you'd prefer, but my vote is for new issues, I like churn :).
OK. I've filed issues for what I consider to be the top priorities.
Quoting myself from earlier in this discussion: I'm increasingly finding I am reimplementing things already in the Galaxy Tool Shed code (with the risk of potentially interpreting the XML recipe slightly differently, which on the plus side could highlight some ambiguities in the recipe format).
Here's an example of the kind of ambiguity I was expecting: https://github.com/galaxyproject/planemo/pull/321 and https://github.com/galaxyproject/galaxy/issues/896
Related to #19 (testing tool_dependencies.xml without a tool shed), I would like to be able to run an install recipe from a tool_dependencies.xml file locally and/or turn it into a simple shell script for the current platform. (The platform specific actions could be turned into bash if statements if preferred)
This seems to overlap with https://github.com/jmchilton/shed2tap
e.g. https://github.com/peterjc/pico_galaxy/blob/master/tools/effectiveT3/tool_dependencies.xml
Would become something like this (assuming already in install directory as per XML convention):
I would then be able to run this within TravisCI with the advantages that the install recipe is not repeated (tool_dependencies.xml and .travis.yml) and moreover I would actually be able to test tool_dependencies.xml, e.g. https://github.com/peterjc/pico_galaxy/commit/243311cc50ab2675c5e6aa42524841e60e6602e8