ARTbio / GalaxyKickStart

Ansible playbooks for Galaxy Server deployment
GNU General Public License v3.0

tool installation hangs at Install Tool Shed tools #228

Closed luke-c-sargent closed 7 years ago

luke-c-sargent commented 7 years ago

I am attempting to kickstart an instance with a suite of tools and a workflow using a bootstrap user I do not delete, leaving it to be the default admin. All of my tools install happily but one:

I've isolated it using this yaml:

tools:
- name: regtools_junctions_extract
  owner: yating-l
  revision: 01ed8e112f2a
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu

It installs fine in the GUI admin panel. If I use KickStart to create the instance without the offending tool and then use the galaxy-tools role with the created API key (not creating a bootstrap user) separately, it installs successfully after about 2 minutes (though the Galaxy Instance hangs for a bit). However, it fails on Restart Galaxy in docker container:

fatal: [192.168.56.18]: FAILED! => {"changed": true, "cmd": ["docker", "exec", "galaxy", "supervisorctl", "restart", "galaxy:"], "delta": "0:00:00.049829", "end": "2017-05-03 12:31:47.500686", "failed": true, "rc": 1, "start": "2017-05-03 12:31:47.450857", "stderr": "Error response from daemon: No such container: galaxy", "stdout": "", "stdout_lines": [], "warnings": []}

Some other points of note:

drosofff commented 7 years ago

@luke-c-sargent Thank you for your note. I'll investigate as soon as possible and keep you posted.

luke-c-sargent commented 7 years ago

this may be unrelated to GalaxyKickStart, as I have been able to replicate the behavior with a bioblend script (I use GalaxyKickStart to create an instance without tools or a workflow). Installing the tool makes the script hang, leaving the tool at 'Installing dependencies.' If I Ctrl+C, then uninstall the tool through the GUI, then re-run the script, it installs successfully.

drosofff commented 7 years ago

@luke-c-sargent I think this is related to changes in the default settings in release_17.01 and/or changes in the default values in https://github.com/galaxyproject/ephemeris/blob/master/ephemeris/shed_install.py#L62 (as far as GKS is concerned, the install tool role uses this script).

In any case, try these settings in your tool_list.yml. For me it works, and the tool seems to install correctly:

- name: regtools_junctions_extract
  owner: yating-l
  revision: 01ed8e112f2a
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
  install_tool_dependencies: True
  install_resolver_dependencies: False

I did not have time to test, but it is possible that if you change galaxy_changeset_id: release_17.01 to galaxy_changeset_id: release_16.07 in group_vars/all, it also works without specifying the install_tool_dependencies and install_resolver_dependencies variables in the tool list.

I'll try for my own understanding

drosofff commented 7 years ago

@luke-c-sargent I have tried with release_16.07. Same problem, and same solution:

does not work with

- name: regtools_junctions_extract
  owner: yating-l
  revision: 01ed8e112f2a

and works with

- name: regtools_junctions_extract
  owner: yating-l
  revision: 01ed8e112f2a
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
  install_tool_dependencies: True
  install_resolver_dependencies: False

Thus I would conclude that, for GalaxyKickStart, it is related to the PR https://github.com/galaxyproject/ephemeris/pull/26. But I would not call it a bug. The idea is to migrate to conda dependencies. Ping @bgruening

bgruening commented 7 years ago

Oh, did I break something? This PR just introduced global variables. So you can write above the tools section:

  install_repository_dependencies: true
  install_resolver_dependencies: true
  install_tool_dependencies: false

And this is valid for all tools. In theory it should just avoid a lot of redundant lines.

drosofff commented 7 years ago

@bgruening As I said, so far it's ok with me. To be complete we can also remind @luke-c-sargent that if

  install_repository_dependencies: true
  install_resolver_dependencies: true
  install_tool_dependencies: false

are globally declared above the tools section, each tool can still be tuned by overriding these variables in its own entry as needed.
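The "global value, per-tool override" behaviour described above can be sketched as follows. This is a minimal illustration only, not ephemeris' actual code: the function name effective_flags and the plain dictionary standing in for a parsed tool list are assumptions made for the example, and the sketch assumes that a flag set on a tool entry simply wins over the value declared above the tools section.

```python
# Hypothetical illustration only: this is NOT ephemeris' actual implementation.
DEPENDENCY_FLAGS = (
    "install_repository_dependencies",
    "install_resolver_dependencies",
    "install_tool_dependencies",
)


def effective_flags(tool_list, tool):
    """Return the dependency flags that would apply to one tool entry.

    A flag set on the tool itself wins; otherwise the value declared above
    the tools section is used; otherwise the flag is reported as None,
    meaning "left to the installer's own default".
    """
    return {
        flag: tool.get(flag, tool_list.get(flag))
        for flag in DEPENDENCY_FLAGS
    }


# Stand-in for a parsed tool list (what a YAML loader would hand back).
tool_list = {
    "install_repository_dependencies": True,
    "install_resolver_dependencies": True,
    "install_tool_dependencies": False,
    "tools": [
        {
            "name": "regtools_junctions_extract",
            "owner": "yating-l",
            # Per-tool override of two of the global values above.
            "install_tool_dependencies": True,
            "install_resolver_dependencies": False,
        },
        {"name": "hisat2", "owner": "iuc"},
    ],
}

for tool in tool_list["tools"]:
    print(tool["name"], effective_flags(tool_list, tool))
```

In this sketch hisat2 inherits all three global values, while regtools_junctions_extract keeps the global install_repository_dependencies but uses its own values for the other two flags.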

As far as I am concerned, I am still investigating some cases where install_resolver_dependencies: true && install_tool_dependencies: true does install a conda env but no toolshed dependencies, with a final crash. But I would not say tonight that this is related to the ephemeris PR#26. I just don't know yet.

luke-c-sargent commented 7 years ago

@bgruening thanks for the tip -- I was totally going to write a bunch of redundant lines. @drosofff this works well for an established galaxy instance without tools (instance installed by skipping the install_tools tag, tools installed by only running install_tools), but still stalls when attempting to do the entire playbook.

permutations i've tried:

if i abort the playbook during hang, uninstall the tool and then attempt to reinstall it with bioblend or with the GKS playbook install_tools tag, it works (setting install_resolver_dependencies to either true or false)

should I try to circumvent this by adding any dependencies to (bio)conda? thanks for the responsiveness!

drosofff commented 7 years ago

Mmmmm. Can you post a link to your tool_list.yml? Right now I have no idea; your tool installed well, although it's true that I tested it alone.

luke-c-sargent commented 7 years ago

This is the original version, created with the GKS generate_tool_list_from_ga_workflow_files.py script (with the problematic tool first). In trying to use your fix i copied it verbatim and replaced that tool's entry.

So you were able to run the entire GKS playbook with that tool installation without any issues?

luke-c-sargent commented 7 years ago

it may or may not be relevant but I am targeting an Ubuntu 14.04 VM

drosofff commented 7 years ago

Yes, I was able to, exactly as I mentioned previously. When it crashed the first time (with resolver_dependencies: true), I noticed that cmake was frozen during compilation. I'll give it a shot with your list when I have time.

Ubuntu 14.04 is OK. I am working on 16.04 but it still needs some coding

luke-c-sargent commented 7 years ago

thanks for all your help -- if it worked for you with just that tool and not for me then it must be some other configuration issue I am overlooking; I don't expect that particular tool list to fail on you if you change the resolver_dependencies for the problem tool.

drosofff commented 7 years ago

mmmm. Not sure. I am struggling with interference between conda packages installed for some tools and toolshed dependencies for another one... Maybe not related, but I am suspicious.

mvdbeek commented 7 years ago

There is some confusion here, I think.

install_resolver_dependencies for now controls the installation of tool dependencies via conda (or possibly other resolvers in the future).

install_repository_dependencies controls the installation of repository dependencies. These are really mercurial repositories that appear as separate entities in the toolshed. You need to set this to True if you're going to install suites, for example (though I don't think ephemeris supports this).

install_tool_dependencies controls whether tool dependencies should be installed. These are taken from tool_dependencies.xml files in repositories (which can be included with a tool, or as a separate package). If a tool dependency is included in another repository (e.g. package_r_foo) but install_repository_dependencies is set to false, this has no effect.

In short, you'll probably want to set install_resolver_dependencies and install_repository_dependencies to True and install_tool_dependencies to False. Then you should really update to at least Galaxy 17.01, which contains fixes for bugs that may cause Galaxy to hang indefinitely while installing conda dependencies. I don't remember if we backported all of these fixes to 16.07.

I would suggest to not bother with tool dependencies, I don't think these are really compatible with automatically building images. It may work one day and fail the next. Building conda packages is really a piece of cake if you already have managed to build a tool shed package.
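The interplay of the three flags explained above can be written down as a small checker. This is only a sketch of the advice given in this comment: review_flags is a hypothetical helper, not part of ephemeris or GalaxyKickStart, and the wording of the notes is an assumption.

```python
# Hypothetical helper that restates the advice in this comment as checks.
def review_flags(flags):
    """Return a list of human-readable notes about a flag combination.

    `flags` is a dict with the three install_* keys; a missing key is
    treated as False here purely for the sake of the illustration.
    """
    notes = []
    if flags.get("install_tool_dependencies"):
        notes.append(
            "install_tool_dependencies is on: tool_dependencies.xml builds "
            "may work one day and fail the next in automated image builds"
        )
        if not flags.get("install_repository_dependencies"):
            notes.append(
                "install_repository_dependencies is off: tool dependencies "
                "packaged in a separate repository will not be installed"
            )
    if not flags.get("install_resolver_dependencies"):
        notes.append(
            "install_resolver_dependencies is off: conda dependencies "
            "will not be installed"
        )
    return notes


# The combination recommended in this comment yields no notes.
recommended = {
    "install_resolver_dependencies": True,
    "install_repository_dependencies": True,
    "install_tool_dependencies": False,
}
print(review_flags(recommended))

# The combination used earlier in the thread as a workaround.
workaround = {
    "install_resolver_dependencies": False,
    "install_repository_dependencies": True,
    "install_tool_dependencies": True,
}
for note in review_flags(workaround):
    print("-", note)
```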

mvdbeek commented 7 years ago

> mmmm. Not sure. I am struggling with interferences between conda packages installed for some tools and toolshed dependencies for another one... Maybe not related but I am suspicious

This can indeed happen when the toolshed resolver takes a higher priority than the conda resolver (which it does by default). I think for this project it would be appropriate to promote conda to the highest priority resolver.

mvdbeek commented 7 years ago

And finally it would be a good idea to upgrade miniconda to 4.2.13 in the python3 variant. This is required for recently built packages to install and work properly.

drosofff commented 7 years ago

> install_resolver_dependencies for now controls the installation of tool dependencies via conda (or possibly other resolvers in the future).
>
> install_repository_dependencies controls the installation of repository dependencies. These are really mercurial repositories that appear as separate entities in the toolshed. You need to set this to True if you're going to install suites, for example (though I don't think ephemeris supports this).
>
> install_tool_dependencies controls whether tool dependencies should be installed. These are taken from tool_dependencies.xml files in repositories (which can be included with a tool, or as a separate package). If a tool dependency is included in another repository (e.g. package_r_foo) but install_repository_dependencies is set to false, this has no effect.
>
> In short, you'll probably want to set install_resolver_dependencies and install_repository_dependencies to True and install_tool_dependencies to False. Then you should really update to at least Galaxy 17.01, which contains fixes for bugs that may cause Galaxy to hang indefinitely while installing conda dependencies. I don't remember if we backported all of these fixes to 16.07.

@mvdbeek thanks for the note 👍 I think the problem is more likely the confusion between 'revision' and 'revisions'

For the rest, I totally agree. GKS is now on 17.01 and the local servers will be soon too (although this is not as problematic, since they are still managed by hand).

As far as miniconda is concerned, where can I find some documentation/notes to speed up the upgrade?

mvdbeek commented 7 years ago

> As far as miniconda is concerned, where can I find some documentation/notes to speed up the upgrade?

For GalaxyKickStart you just need to change the variables in https://github.com/ARTbio/GalaxyKickStart/blob/master/group_vars/all#L34 and https://github.com/ARTbio/GalaxyKickStart/blob/master/group_vars/all#L35

For the existing servers you remove (or move) the old conda directory and install a new clean miniconda3. You then need to install all conda dependencies again. 17.01 has a new UI to manage conda dependencies under 'Manage tool dependencies'. That should make this a one-click operation.

mvdbeek commented 7 years ago

Ahh, the new UI is only available in 17.05, but 17.05 is already in rc1, it should be released in about 2 weeks. You can also just stay with miniconda2 but update it to 4.2.13. You put galaxy's conda/bin directory on your PATH and do conda install conda=4.2.13. Then you don't need to reinstall the conda dependencies for now.

drosofff commented 7 years ago

@luke-c-sargent I have updated the master branch of the repo with a revised version of https://github.com/ARTbio/GalaxyKickStart/blob/master/scripts/galaxykickstart_from_workflow.py.

Your issue was most probably the same as mine: the key 'revision' should have been 'revisions'. If you rerun the script or edit your tool list, it should work now (taking into account the clarifications given by @mvdbeek).

Note that for the test of your tool, the revisions tag was not taken into account, so the latest version of the tool was used, which was OK since there is only one tool version in the repo!
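For an existing tool list, the 'revision' to 'revisions' change can be applied mechanically. The following is a minimal sketch, not the code of the updated GKS script: normalize_revisions is a hypothetical helper, and it works on an already parsed tool entry (a plain dictionary) so that it needs no YAML library. It also turns every changeset ID into a string, on the assumption that an ID made only of digits would otherwise be read as an integer by a YAML parser unless it is quoted.

```python
# Hypothetical helper: rewrite a tool entry to use the plural 'revisions' key.
def normalize_revisions(tool):
    """Return a copy of `tool` with 'revision' folded into 'revisions'.

    - 'revision: abc'      becomes 'revisions: [abc]'
    - an existing 'revisions' list is kept, without duplicates
    - every changeset ID is stored as a string
    """
    fixed = dict(tool)
    existing = fixed.pop("revisions", None)
    if existing is None:
        existing = []
    elif isinstance(existing, (str, int)):
        # Tolerate a scalar where a list was expected.
        existing = [existing]
    revisions = [str(rev) for rev in existing]

    single = fixed.pop("revision", None)
    if single is not None and str(single) not in revisions:
        revisions.append(str(single))

    if revisions:
        fixed["revisions"] = revisions
    return fixed


old_entry = {
    "name": "regtools_junctions_extract",
    "owner": "yating-l",
    "revision": "01ed8e112f2a",
    "tool_panel_section_label": "Galaxy OnRamp Tools",
    "tool_shed_url": "https://toolshed.g2.bx.psu.edu",
}
print(normalize_revisions(old_entry))
```

The input dictionary is left untouched; the function returns a new entry, so it can be mapped over the whole tools list before writing the file back out.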

luke-c-sargent commented 7 years ago

thanks for the further consideration and insight.

unfortunately, I am still encountering the same issue, using revisions instead of revision and trying various combinations of flags. when install_tool_dependencies is set to False the process completes, but the tool's dependency regtools is never installed, which is less than desirable. if i attempt to install the dependency from the GUI, it never completes, while cmake sleeps in the background, as it does in every failure case.

Since conda is the future, i think i will address this issue by making a bioconda recipe for the tools.

thanks for the time and effort; i am glad that the issue has brought other things to light and has helped push GKS forward a bit.

drosofff commented 7 years ago

Sure @luke-c-sargent! Thanks, your issue really helped me to fix the code! However, in the meantime, with your tool list reformatted so that each entry has revisions: followed by a line break and - <changeset revision>, as follows, it seems to install smoothly (and, it's true, slowly) in a GKS instance.

install_tool_dependencies: True
install_repository_dependencies: True
install_resolver_dependencies: False

tools:
- name: regtools_junctions_extract
  owner: yating-l
  revisions:
  - 01ed8e112f2a
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: ucsc_pslpostarget
  owner: yating-l
  revisions:
  - fe51c5a974b5
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: ncbi_blast_plus
  owner: devteam
  revisions:
  - 3034ce97dd33
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: snap
  owner: yating-l
  revisions:
  - 57299471d6c1
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: augustus
  owner: bgruening
  revisions:
  - f5075dee9d6b
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: ucsc_pslcdnafilter
  owner: yating-l
  revisions:
  - ceec5a5fe894
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: psltobigpsl
  owner: yating-l
  revisions:
  - 7cd07dd27927
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: gbtofasta
  owner: yating-l
  revisions:
  - 9573618e2afe
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: hisat2
  owner: iuc
  revisions:
  - 2ec097c8e843
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: multi_fasta_glimmer_hmm
  owner: rmarenco
  revisions:
  - 0ddb5ee32ff6
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: hubarchivecreator
  owner: rmarenco
  revisions:
  - 7ddf651457df
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: ucsc_blat
  owner: yating-l
  revisions:
  - '951076264957'
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: blastxmltopsl
  owner: rmarenco
  revisions:
  - 5e8fd7791c7f
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: trfbig
  owner: rmarenco
  revisions:
  - e45bd0ffc1a4
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: bamtobigwig
  owner: yating-l
  revisions:
  - 61f39c77b13d
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: ucsc_pslcheck
  owner: yating-l
  revisions:
  - 68f0da46b7e2
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu
- name: stringtie
  owner: iuc
  revisions:
  - c84d44519b2e
  tool_panel_section_label: Galaxy OnRamp Tools
  tool_shed_url: https://toolshed.g2.bx.psu.edu

drosofff commented 7 years ago

I have launched a VM here, all tools installed ! http://192.54.201.35/