galaxyproject / usegalaxy-playbook

Ansible Playbook for usegalaxy.org

usegalaxy.* deduplication #118


natefoo commented 6 years ago

@nekrut has tasked me with getting tools and data on usegalaxy.org unified with those on usegalaxy.eu and updated regularly, as is done on usegalaxy.eu. In a meeting w/ @erasche and @bgruening in Freiburg this week we've come up with the following. This could also be of use to usegalaxy.org.au (cc: @Slugger70).

Tool synchronization

It's not desirable for us to support all of usegalaxy.eu's legacy tools, nor for them to support all of ours. As a result, we propose the creation of a new CVMFS repository (for now, we'll call it usegalaxy-tools.galaxyproject.org), shared by both instances and containing a common set of tools available on both.

Dependencies will be provided via Singularity, and the necessary images are already being built and uploaded to depot. In addition to nice containerized dependency management, this also gives us better security through execution sandboxing. These will be mirrored to a second new CVMFS repository, say, singularity.galaxyproject.org (and ultimately, that repository will become the canonical source, rather than depot), that can be mounted directly onto a cluster running Galaxy jobs, so that all the images are automatically available to instances without consuming local storage space.
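As a sketch, making that repository available on a cluster node could look like the following with a standard CVMFS client (assuming the repository's public key and server URLs are already distributed to the nodes; the repo name is just the proposal above):

    # Sketch: mount the proposed singularity repository on a cluster node.
    # Assumes /etc/cvmfs is otherwise configured (keys, server URLs, quota).
    echo 'CVMFS_REPOSITORIES=singularity.galaxyproject.org' | sudo tee -a /etc/cvmfs/default.local
    sudo cvmfs_config probe singularity.galaxyproject.org   # verify the mount
    ls /cvmfs/singularity.galaxyproject.org                 # images now visible to jobs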

Thus, we should be able to have a CVMFS stratum 0 where, much like we do for Test/Main now, we:

  1. Open a CVMFS transaction
  2. Spin up a Galaxy instance in Docker
  3. Install tools
  4. Destroy the instance
  5. Publish the transaction
  6. Force snapshots on stratum 1s
  7. When client caches invalidate, usegalaxy.* servers reload their toolboxes (this will require fixing the multi-host toolbox reloading bug described in galaxyproject/galaxy#5601)
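A rough shell sketch of that cycle on the stratum 0 (the repository name is the proposal above, and the docker/ephemeris invocations here are placeholders rather than final commands; $API_KEY is a stand-in):

    # Sketch of one update cycle; steps match the list above.
    cvmfs_server transaction usegalaxy-tools.galaxyproject.org        # 1. open
    CID=$(docker run -d -p 8080:80 \
        -v /cvmfs/usegalaxy-tools.galaxyproject.org:/cvmfs/usegalaxy-tools.galaxyproject.org \
        bgruening/galaxy-stable)                                      # 2. spin up Galaxy
    shed-tools install -g http://localhost:8080/ -a "$API_KEY" -t tools.yaml  # 3. install
    docker rm -f "$CID"                                               # 4. destroy
    cvmfs_server publish usegalaxy-tools.galaxyproject.org            # 5. publish
    # 6. then, on each stratum 1:
    cvmfs_server snapshot usegalaxy-tools.galaxyproject.org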

Unlike with Test/Main, this instance will be entirely unconnected to Test/Main's database. Other than the install database (sqlite) and shed_*.xml, this instance should be entirely ephemeral.

Issues

There can only be one install database per instance, but I suspect the install database only matters for running tools when tool shed dependencies are used, so that is not a concern here. It will have implications for the admin UI (e.g. tools probably won't show up in Manage Tools) and maybe tool help text, but I think we can live with the former and fix the latter at some point. As usual, though, when I think something will be trivially easy, it's probably not going to work like I think.

Tool sections/ordering is an issue. Galaxy needs to be able to write to integrated_tool_panel.xml, so it cannot live in CVMFS, but without a unified integrated_tool_panel.xml, sections and tools will be ordered very differently across instances. Additionally, we will have to unify our section IDs since the section id into which a tool is installed is stored in shed_tool_conf.xml (which will live in CVMFS).

The section unification should be a one-time process that we can undertake by hand together. The integrated_tool_panel.xml synchronization, I don't have a good solution for at the moment.
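As a starting point for that by-hand pass, the section IDs in play can at least be diffed mechanically (a sketch; the two filenames are hypothetical local copies of each instance's shed_tool_conf.xml):

    # Extract the <section id= name=> opening tags from each instance's
    # config and diff the sorted lists to see what needs reconciling.
    grep -o '<section [^>]*>' shed_tool_conf.main.xml | sort > sections.main
    grep -o '<section [^>]*>' shed_tool_conf.eu.xml   | sort > sections.eu
    diff sections.main sections.eu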

Likewise, new tool installs that need changes from the default destination - things like the number of slots, amount of memory, etc. - will need to be coordinated when these tools are installed to the common repo.

For the time being, we will need to map the common tools to the singularity destination(s) one-by-one. This is easier for usegalaxy.eu due to their handy job-config-as-a-service (JCaaS). Although I'm not necessarily sold on the idea of a web service, I am not at all happy with the way job config works for usegalaxy.org and I do think we can utilize the JCaaS idea to get a similar dynamic config. However, with new tool versions living in the common repo and old versions living in the old repo, some tools will have to be mapped by version, not just by versionless ID, so this will get ugly. A better solution is needed here.
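To make the version-mapping problem concrete, this is roughly the shape of the per-tool stanza involved (it belongs inside the <tools> section of a job conf; the tool ID version and destination name here are made up for illustration):

    # Sketch: pin a common-repo tool to the singularity destination by its
    # full, versioned ID, while old versions keep their existing mapping.
    echo '<tool id="toolshed.g2.bx.psu.edu/repos/iuc/jq/jq/1.5" destination="singularity" />'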

@bgruening discovered that the Singularity image path is configurable through the container_image_cache_path option, but this is a bit of a hack. Setting it causes Galaxy to check <container_image_cache_path>/singularity/mulled for images matching the requirements, but if the needed images are not available, it will try to build them, which of course is not possible at runtime with CVMFS (nor is it desirable; we want them to always be preinstalled). It would be good if we could control whether building should be attempted, and also have a destination param to control where to find Singularity images. The image directory should probably also be split into subdirectories, since the list is likely to grow larger than CVMFS's preferred catalog size. Something like /cvmfs/singularity.galaxyproject.org/c/o/coreutils:8.25--0 would probably be sufficient.
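For illustration, a sketch of how an image name could be filed into that layout (the two-level prefix is just the proposal above, not an implemented convention):

    # Sketch: file an image under /<first char>/<second char>/<name> so no
    # single CVMFS directory (and thus catalog) grows too large.
    image_path() {
        local img=$1  # e.g. coreutils:8.25--0
        echo "/cvmfs/singularity.galaxyproject.org/${img:0:1}/${img:1:1}/${img}"
    }
    image_path coreutils:8.25--0
    # -> /cvmfs/singularity.galaxyproject.org/c/o/coreutils:8.25--0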

Action items

Testing

Development

Data synchronization

To be written...

Galaxy synchronization

We'd also like to have usegalaxy.org and usegalaxy.eu run off a single copy of Galaxy living in CVMFS. This presents a few challenges, such as non-upstreamed datatypes on usegalaxy.eu and synchronization of updates, especially where database migrations are concerned.

More to be written about this as well...

Singularity all the things

extracted to https://github.com/galaxyproject/usegalaxy-playbook/issues/262

Slugger70 commented 6 years ago

Thanks for the cc @natefoo. I really like this idea and it's what we've been pushing for. We already have a CVMFS tier 1 server running with the ref data, so it wouldn't be too difficult to get this running too. As for synchronisation of the integrated_tool_panel.xml file, could the one-time manual work on this happen at GCC this year (when we're all in the same room)?

natefoo commented 6 years ago

We already have a CVMFS tier 1 server running with the ref data

This is great! Is this just for internal consumption or should we publicize it on https://galaxyproject.org/admin/reference-data-repo/ and http://datacache.galaxyproject.org/ ?

As for synchronisation of the integrated_tool_panel.xml file, could the one-time manual work on this happen at GCC this year (when we're all in the same room)?

You mean the section ID and name changes? I think this is a great idea. Maybe we can also figure out how to keep it in sync when we're there. Job conf changes (most importantly - running multi-slot tools with multiple slots, or increasing memory for tools that use a lot) for new tools will need to be coordinated in some way as well.

natefoo commented 6 years ago

I've made some good progress so far today. With the following modifications I am able to load a tool from a shed_tool_conf.xml in CVMFS without the corresponding install database:

diff --git a/lib/galaxy/tools/toolbox/base.py b/lib/galaxy/tools/toolbox/base.py
index 2ce4439..2f968d1 100644
--- a/lib/galaxy/tools/toolbox/base.py
+++ b/lib/galaxy/tools/toolbox/base.py
@@ -569,7 +569,7 @@ class AbstractToolBox(Dictifiable, ManagesIntegratedToolPanelMixin):
                 tool.hidden = True
             key = 'tool_%s' % str(tool.id)
             if can_load_into_panel_dict:
-                if guid and not from_cache:
+                if guid and tool_shed_repository and not from_cache:
                     tool.tool_shed = tool_shed_repository.tool_shed
                     tool.repository_name = tool_shed_repository.name
                     tool.repository_owner = tool_shed_repository.owner
@@ -623,7 +623,7 @@ class AbstractToolBox(Dictifiable, ManagesIntegratedToolPanelMixin):
                                                     installed_changeset_revision=installed_changeset_revision)
         if not repository:
             msg = "Attempted to load tool shed tool, but the repository with name '%s' from owner '%s' was not found in database" % (repository_name, repository_owner)
-            raise Exception(msg)
+            log.warning(msg)
         return repository

     def _get_tool_shed_repository(self, tool_shed, name, owner, installed_changeset_revision):

What I did:

  1. Precreated install.sqlite using create_db.sh

  2. Opened a transaction on sandbox.galaxyproject.org

  3. Created /cvmfs/sandbox.galaxyproject.org/tools and /cvmfs/sandbox.galaxyproject.org/config, copied in install.sqlite and shed_tool_conf.xml:

    <?xml version="1.0"?>
    <toolbox tool_path="/cvmfs/sandbox.galaxyproject.org/tools">
        <section id="usegalaxy_common_tools_test" name="usegalaxy.* common tools test">
        </section>
    </toolbox>
  4. Chowned everything above to 1450:1450 (the UID and GID of the galaxy user in docker-galaxy-stable)

  5. Started docker-galaxy-stable:

    sandbox@cvmfs0-psu0$ docker run -d -p 8080:80 -e GALAXY_CONFIG_INSTALL_DATABASE_CONNECTION=sqlite:////cvmfs/sandbox.galaxyproject.org/config/install.sqlite -e GALAXY_CONFIG_TOOL_CONFIG_FILE=/cvmfs/sandbox.galaxyproject.org/config/shed_tool_conf.xml -e GALAXY_CONFIG_MASTER_API_KEY=a60913da2ea2177d89e33884f0326f7d3bcdd901 -v /cvmfs/sandbox.galaxyproject.org:/cvmfs/sandbox.galaxyproject.org bgruening/galaxy-stable
    ccb51acdcd43992c8d7c735108ade9e714e7b31de7f9a7383e55232f6b74b1ea
  6. Installed the jq tool from IUC into /cvmfs/sandbox.galaxyproject.org/tools using ephemeris:

    ---
    
    api_key: a60913da2ea2177d89e33884f0326f7d3bcdd901
    galaxy_instance: http://cvmfs0-psu0.galaxyproject.org:8080
    install_tool_dependencies: false
    install_resolver_dependencies: false
    tools:
    - name: jq
      owner: iuc
      tool_panel_section_id: usegalaxy_common_tools_test
    (ephemeris)nate@weyerbacher% shed-tools install -v -g http://cvmfs0-psu0.galaxyproject.org:8080/ -t tools.yaml
    (1/1) Installing repository jq from iuc to section "usegalaxy_common_tools_test" at revision 5ff75eb1a893 (TRT: 0:00:00.130621)
        repository jq installed successfully (in 0:00:09.244931) at revision 5ff75eb1a893
    Installed repositories (1): [('jq', None)]
    Skipped repositories (0): []
    Errored repositories (0): []
    All repositories have been processed.
    Total run time: 0:00:09.376930
  7. Fetched jq:1.5--0 to /cvmfs/sandbox.galaxyproject.org/singularity/mulled/ (see the sketch after this list)

  8. Published the CVMFS transaction

  9. Added /cvmfs/sandbox.galaxyproject.org/shed_tool_conf.xml to Test's tool_config_file

  10. Applied the patch above to Test (in the test.galaxyproject.org CVMFS repo) and restarted
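For step 7, a hedged sketch of how the image could be fetched (the depot URL layout is an assumption based on the image mirroring described at the top of this issue):

    # Sketch: drop the prebuilt Singularity image into the path that
    # container_image_cache_path points at. URL layout is an assumption.
    wget -O /cvmfs/sandbox.galaxyproject.org/singularity/mulled/jq:1.5--0 \
        https://depot.galaxyproject.org/singularity/jq:1.5--0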

natefoo commented 6 years ago

This patch is obviously just a quick hack, and there are going to be some issues: the tool is not recognized as a TS tool, so the tool ID is simply its short ID; there is no link to the TS on the tool form; it probably breaks versioning/lineage; etc. Hopefully we can fix much of this just using the data already available in the XML.

mvdbeek commented 6 years ago

it probably breaks versioning/lineage

that should work anyway. Also xref https://github.com/galaxyproject/galaxy/issues/5284

natefoo commented 6 years ago

@mvdbeek thanks! That'll make things much easier.

Working on this in natefoo/galaxy@installdbless-shed-tools for anyone interested in following along.

jmchilton commented 6 years ago

if the needed images are not available, it will try to build them

This behavior can be customized by setting up a container resolvers file with the container resolver configurations you wish to use.

natefoo commented 6 years ago

@jmchilton I sorta noticed that might be possible in the code but hadn't figured out the syntax of the file.

jmchilton commented 6 years ago

https://github.com/galaxyproject/ansible-galaxy-extras/blob/master/templates/container_resolvers_conf.xml.j2
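For the record, a hedged sketch of what such a file might contain, based on that template (the resolver tag names and cache_directory attribute should be verified against the template and your Galaxy version; treat the exact syntax as an assumption):

    # Sketch: a resolvers file that only consults preinstalled images and
    # never attempts to build at runtime. Verify tag names before using.
    cat > config/container_resolvers_conf.xml <<'EOF'
    <containers_resolvers>
        <explicit />
        <cached_mulled_singularity cache_directory="/cvmfs/singularity.galaxyproject.org/mulled" />
    </containers_resolvers>
    EOF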

natefoo commented 6 years ago

Thanks!

natefoo commented 5 years ago

Some progress today.

Install DB-less tool loading is in galaxyproject/galaxy#7316.

@erasche suggested using OverlayFS in Travis to perform the installations, which I think should work, assuming Travis VMs have overlayfs in the kernel. It'll be relatively simple since we don't need to worry about deletions. Roughly:

  1. Install CVMFS
  2. Configure CVMFS to mount somewhere other than /cvmfs, say, /lower
  3. mkdir /upper /work /cvmfs
  4. mount -t overlay overlay -o lowerdir=/lower,upperdir=/upper,workdir=/work /cvmfs
  5. docker run -d -p 8080:80 -e GALAXY_CONFIG_INSTALL_DATABASE_CONNECTION=sqlite:////cvmfs/usegalaxy.galaxyproject.org/config/install.sqlite -e GALAXY_CONFIG_TOOL_CONFIG_FILE=/cvmfs/usegalaxy.galaxyproject.org/config/shed_tool_conf.xml -e GALAXY_CONFIG_MASTER_API_KEY=deadbeef -e GALAXY_CONFIG_CONDA_PREFIX=/cvmfs/usegalaxy.galaxyproject.org/dependencies/conda -v /cvmfs/usegalaxy.galaxyproject.org:/cvmfs/usegalaxy.galaxyproject.org bgruening/galaxy-stable
  6. galaxy-wait ...
  7. shed-tools -g http://localhost:8080/ -a deadbeef ...
  8. planemo test ...
  9. ssh usegalaxy@cvmfs0-psu0.galaxyproject.org cvmfs_server transaction usegalaxy.galaxyproject.org
  10. rsync -av /upper/ usegalaxy@cvmfs0-psu0.galaxyproject.org:/cvmfs/usegalaxy.galaxyproject.org || { ssh usegalaxy@cvmfs0-psu0.galaxyproject.org cvmfs_server abort -f usegalaxy.galaxyproject.org; travis_terminate 1; }
  11. ssh usegalaxy@cvmfs0-psu0.galaxyproject.org cvmfs_server publish ... usegalaxy.galaxyproject.org
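A hedged sketch of those steps as a single CI script (the trap-based cleanup is an assumption, the planemo testing step is omitted, and repo/host names are as in the list above):

    #!/bin/bash
    # Sketch: install tools into an OverlayFS-writable view of the CVMFS
    # repo, then publish only the delta captured in /upper. Assumes the
    # CVMFS client is already installed and configured to mount at /lower.
    set -euo pipefail

    REPO=usegalaxy.galaxyproject.org
    S0=usegalaxy@cvmfs0-psu0.galaxyproject.org

    sudo mkdir -p /upper /work /cvmfs
    sudo mount -t overlay overlay -o lowerdir=/lower,upperdir=/upper,workdir=/work /cvmfs

    docker run -d -p 8080:80 \
        -e GALAXY_CONFIG_INSTALL_DATABASE_CONNECTION=sqlite:////cvmfs/$REPO/config/install.sqlite \
        -e GALAXY_CONFIG_TOOL_CONFIG_FILE=/cvmfs/$REPO/config/shed_tool_conf.xml \
        -e GALAXY_CONFIG_MASTER_API_KEY=deadbeef \
        -v /cvmfs/$REPO:/cvmfs/$REPO bgruening/galaxy-stable
    galaxy-wait -g http://localhost:8080/
    shed-tools install -g http://localhost:8080/ -a deadbeef -t tools.yaml

    # Open a transaction on the stratum 0; abort it if the rsync fails.
    ssh "$S0" cvmfs_server transaction "$REPO"
    trap 'ssh "$S0" cvmfs_server abort -f "$REPO"' ERR
    rsync -av /upper/ "$S0:/cvmfs/$REPO"
    ssh "$S0" cvmfs_server publish "$REPO"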
natefoo commented 5 years ago

Proof o' concept: https://travis-ci.org/natefoo/usegalaxy-tools/builds/489823965