CloudVE / galaxy-helm

Minimal setup required to run Galaxy under Kubernetes
MIT License

Galaxy Helm Chart (v3)

Galaxy is a data analysis platform focusing on accessibility, reproducibility, and transparency of primarily bioinformatics data. This repo contains Helm charts for easily deploying Galaxy on top of Kubernetes.

TL;DR

git clone https://github.com/CloudVE/galaxy-kubernetes.git
cd galaxy-kubernetes/galaxy
helm dependency update
helm install .

Introduction

This Helm chart bootstraps a Galaxy deployment on a Kubernetes cluster. The chart supports application configuration changes, updates, upgrades, and rollbacks.

Prerequisites

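You will need a Kubernetes cluster and a Helm installation. For testing and development purposes, the easiest option is to install Docker Desktop, which comes with integrated Kubernetes; Helm must be installed separately. The commands in this document use Helm 2 syntax (for example, the --name flag and helm del --purge).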

Dependency Charts

This chart relies on other charts for common functionality. Most notably, it uses the Postgres chart for the database. In addition, it uses the CVMFS chart to make reference data available to Galaxy and its jobs. Although CVMFS is technically an optional dependency, production deployments will likely want it enabled.

Installing the Chart

  1. Clone this repository and install the required dependency charts.
git clone https://github.com/CloudVE/galaxy-kubernetes.git
cd galaxy-kubernetes/galaxy
helm dependency update
  2. To install the chart with the release name galaxy (note the trailing dot):
helm install --name galaxy .

In about a minute, Galaxy will be available at the root URL of your Kubernetes cluster.

Uninstalling the Chart

To uninstall/delete the galaxy deployment, run:

helm del --purge galaxy

Configuration

The following table lists the configurable parameters of the Galaxy chart. The current default values can be found in the values.yaml file.

| Parameter | Description |
|---|---|
| image.repository | The repository and name of the Docker image for Galaxy, pointing to Docker Hub |
| image.tag | Galaxy image tag / version |
| image.pullPolicy | Galaxy image pull policy |
| service.type | Kubernetes Service type |
| service.port | Galaxy service port |
| webHandlers.replicaCount | The number of replicas for the Galaxy web handlers |
| jobHandlers.replicaCount | The number of replicas for the Galaxy job handlers |
| rbac.enabled | Enable Galaxy job RBAC |
| persistence.enabled | Enable persistence using PVC |
| persistence.name | Name of the PVC |
| persistence.storageClass | PVC storage class for the Galaxy volume (use either this or existingClaim) |
| persistence.existingClaim | An existing PVC to be used for the Galaxy volume (use either this or storageClass) |
| persistence.accessMode | PVC access mode for the Galaxy volume |
| persistence.size | PVC storage request for the Galaxy volume, in GB |
| persistence.mountPath | Path where the Galaxy volume is mounted |
| extraEnv | Any extra environment variables to pass on to the pod |
| ingress.enabled | Enable Kubernetes ingress |
| ingress.path | Path where the Galaxy application will be hosted |
| ingress.hosts | Cluster hosts where Galaxy will be available |
| useSecretConfigs | Enable Kubernetes Secrets for all config maps |
| configs.* | Galaxy configuration files and values for each of the files. The provided value represents the entire content of the given configuration file. |
| jobs.rules | Galaxy dynamic job rules |

Specify each parameter using the --set key=value[,key=value] argument to helm install. For example,

helm install --name galaxy --set persistence.size=50 .

The above command sets the size of the Galaxy persistent volume to 50 GB.

Setting Galaxy configuration file values requires the key name to be escaped:

helm install --name galaxy --set-file "configs.galaxy\.yml"=/path/to/local/galaxy.yml .

To unset an existing file and revert to the container's default version:

helm install --name galaxy --set "configs.job_conf\.xml"=null .

Alternatively, a YAML file that specifies the values of the parameters can be provided when installing the chart. For example,

helm install --name galaxy -f values-cvmfs.yaml .
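
As a sketch of what such a file might contain, the fragment below sets a few of the parameters from the table above. The file name my-values.yaml and all of the values shown are illustrative assumptions, not the chart's defaults; consult values.yaml for the actual defaults and accepted formats.

```yaml
# my-values.yaml (hypothetical file name; all values are examples only)
image:
  pullPolicy: IfNotPresent   # a standard Kubernetes image pull policy
service:
  type: ClusterIP            # a standard Kubernetes Service type
persistence:
  enabled: true
  size: 50                   # same setting as --set persistence.size=50
```

A file like this would be passed with -f in the same way as values-cvmfs.yaml, for example helm install --name galaxy -f my-values.yaml . (note the trailing dot).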

To unset a config file, use the yaml null type:

configs:
  job_conf.xml: ~
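
Because each key under configs holds the entire content of a configuration file, it should also be possible to supply a file inline in a values file as a YAML block scalar. The following is a minimal sketch under that assumption; the galaxy.yml content shown is illustrative only and is far from a complete Galaxy configuration.

```yaml
configs:
  # The block scalar below becomes the whole galaxy.yml file,
  # so a real file would need every setting Galaxy requires.
  galaxy.yml: |
    galaxy:
      brand: "My Galaxy"
  # A null value (~) reverts job_conf.xml to the container's default version.
  job_conf.xml: ~
```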

Data Persistence

The Galaxy Docker image stores all user data under the /galaxy/server/database path of the container. Persistent Volume Claims (PVCs) are used to keep the data across deployments. It is possible to specify an existing PVC via persistence.existingClaim. Alternatively, a value for persistence.storageClass can be supplied to designate a desired storage class for dynamic provisioning of the necessary PVCs. If neither value is supplied, the default storage class for the K8s cluster will be used.

We recommend a storage class that supports ReadWriteMany, such as the nfs-provisioner, because the data must be available to all nodes in the cluster.

In addition, we recommend that you also set postgresql.persistence.storageClass to a high-speed, durable storage type that is ReadWriteOnce, such as an EBS volume.
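
The recommendations above could be expressed in a values file roughly as follows. This is a sketch only: the storage class names nfs and ebs and the claim name galaxy-data are hypothetical and must be replaced with names that exist in your cluster.

```yaml
# Option A: let the cluster provision the Galaxy volume dynamically.
persistence:
  enabled: true
  storageClass: nfs            # hypothetical ReadWriteMany-capable class
  accessMode: ReadWriteMany
  size: 50

# Option B (instead of storageClass): reuse an existing claim.
# persistence:
#   enabled: true
#   existingClaim: galaxy-data # hypothetical name of an existing PVC

# Separate, fast ReadWriteOnce storage for the database.
postgresql:
  persistence:
    storageClass: ebs          # hypothetical ReadWriteOnce-capable class
```

Only one of storageClass or existingClaim should be set for the Galaxy volume, which is why Option B is shown commented out.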

Production Settings

This repo contains an additional values file with the production settings, called values-cvmfs.yaml. This mode of deployment configures Galaxy with the data from CVMFS and replicates the functional capabilities of the Galaxy Main server. Note that this deployment mode does not work on a Mac laptop because of an unresolved issue in the CVMFS-CSI Docker container.

To install this version of the chart, we first need to install the Galaxy CVMFS-CSI chart, followed by the Galaxy chart. Depending on the setup of the cluster you have available, you may also need to supply values for the cluster storage classes or PVCs.

helm repo add cloudve https://raw.githubusercontent.com/CloudVE/helm-charts/master/
helm repo update
kubectl create namespace cvmfs
helm install --name cvmfs --namespace cvmfs cloudve/galaxy-cvmfs-csi
# Download values-cvmfs.yaml from this repo and update persistence as needed
helm install --name galaxy -f values-cvmfs.yaml cloudve/galaxy

Note that this setup takes several minutes to start due to Galaxy loading all the tool definitions. Once started, Galaxy will be available under /galaxy/ (note the trailing / as it is required).

Horizontal Scaling

The Galaxy application can be horizontally scaled for the web and job handlers by setting the desired values of the webHandlers.replicaCount and jobHandlers.replicaCount configuration options.
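
For illustration, a values fragment along these lines would request multiple replicas of each handler type; the counts are arbitrary example values.

```yaml
webHandlers:
  replicaCount: 2   # example value
jobHandlers:
  replicaCount: 3   # example value
```

The same settings can be given on the command line as --set webHandlers.replicaCount=2,jobHandlers.replicaCount=3.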

Funding