MaastrichtU-IDS / d2s-argo-workflows

⚠️ DEPRECATED: Argo workflows to transform structured data to a target RDF using Data2Services Docker modules
http://d2s.semanticscience.org/
MIT License

Run workflows with Argo

See d2s.semanticscience.org for detailed documentation.

Requirements

Install oc client

wget https://github.com/openshift/origin/releases/download/v3.11.0/openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit.tar.gz

tar xvf openshift-origin-client-tools*.tar.gz
cd openshift-origin-client*/
sudo mv oc kubectl /usr/local/bin/

Install Argo
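
A minimal sketch for deploying the Argo workflow controller on the cluster, assuming Argo v2.4.3 and a dedicated argo project (the version and manifest path are assumptions, check the argoproj releases page for the current ones):

# Create a project for Argo and deploy the workflow controller and UI
oc new-project argo
oc apply -n argo -f https://raw.githubusercontent.com/argoproj/argo/v2.4.3/manifests/install.yaml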

Install Argo client
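
A minimal sketch for installing the argo CLI on Linux, assuming the v2.4.3 release and its argo-linux-amd64 binary asset (check the argoproj releases page for the current version and asset name):

# Download the argo CLI binary and put it on the PATH
curl -sLo argo https://github.com/argoproj/argo/releases/download/v2.4.3/argo-linux-amd64
chmod +x argo
sudo mv argo /usr/local/bin/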


Run workflows

oc login

Log in to the cluster using the OpenShift client:

oc login https://openshift_cluster:8443 --token=MY_TOKEN

Run examples

As an example, we will use the config files from d2s-transform-template. Clone it with the argo submodule:

git clone --recursive https://github.com/MaastrichtU-IDS/d2s-transform-template.git
cd d2s-transform-template

Run oc login to connect to the OpenShift cluster.

# steps based workflow
argo submit d2s-argo-workflows/workflows/d2s-workflow-transform-xml.yml \
  -f support/config/config-transform-xml-drugbank.yml

# DAG workflow
argo submit d2s-argo-workflows/workflows/d2s-workflow-transform-xml-dag.yml \
  -f support/config/config-transform-xml-drugbank.yml

# Test
argo submit --watch d2s-argo-workflows/workflows/d2s-workflow-sparql.yml

Check running workflows

argo list

oc commands

List pods

oc get pod

Create pod from JSON

oc create -f examples/hello-openshift/hello-pod.json

Workflow administration

Create persistent volume

https://app.dsri.unimaas.nl:8443/console/project/argo/create-pvc
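
The claim can also be created with the OpenShift client; a minimal sketch, where the claim name, size and storage class are assumptions to adapt to what the cluster offers:

# pvc.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: d2s-storage        # claim name is an assumption
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 20Gi        # size is an assumption

Then create it in your project:

oc create -f pvc.yml -n argo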

Mount filesystem

Deploy a filebrowser on MapR to access volumes

Go to https://app.dsri.unimaas.nl:8443/console/catalog > click Deploy image
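
A rough command-line equivalent, assuming the filebrowser/filebrowser image from Docker Hub and an existing claim named d2s-storage (image, application name and claim name are assumptions):

# Deploy the filebrowser image and mount an existing claim into it
oc new-app filebrowser/filebrowser --name=filebrowser
oc set volume dc/filebrowser --add --type=pvc --claim-name=d2s-storage --mount-path=/srv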

Create temporary volume in the workflow

volumeClaimTemplates:                 # define volume, same syntax as k8s Pod spec
  - metadata:
      name: workdir                     # name of volume claim
      annotations:
        volume.beta.kubernetes.io/storage-class: maprfs-ephemeral
        volume.beta.kubernetes.io/storage-provisioner: mapr.com/maprfs
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 100Gi 
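
A workflow step can then mount the claimed volume; a minimal sketch of a template using it (the step name, image and mount path are assumptions):

- name: list-workdir                  # hypothetical step using the claimed volume
  container:
    image: alpine:3.10
    command: [sh, -c]
    args: ["ls /data"]
    volumeMounts:
    - name: workdir                   # matches the volume claim name above
      mountPath: /data                # mount path is an assumption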

Create secret
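
The d2s-sparql-password secret used below can be created with the OpenShift client (replace MY_PASSWORD with the actual value):

oc create secret generic d2s-sparql-password --from-literal=password=MY_PASSWORD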

Now, in the workflow definition, you can use the secret as an environment variable:

- name: d2s-sparql-operations
  inputs:
    parameters:
    - name: sparql-queries-path
    - name: sparql-input-graph
    - name: sparql-output-graph
    - name: sparql-service-url
    - name: sparql-triplestore-url
    - name: sparql-triplestore-repository
    - name: sparql-triplestore-username
  container:
    image: umids/d2s-sparql-operations:latest
    args: ["-ep", "{{inputs.parameters.sparql-triplestore-url}}", 
      "-rep", "{{inputs.parameters.sparql-triplestore-repository}}", 
      "-op", "update", "-f", "{{inputs.parameters.sparql-queries-path}}",
      "-un", "{{inputs.parameters.sparql-triplestore-username}}", 
      "-pw", "{{inputs.parameters.sparql-triplestore-password}}",
      "-pw", "$SPARQLPASSWORD",  # secret from env
      "--var-input", "{{inputs.parameters.sparql-input-graph}}",
      "--var-output", "{{inputs.parameters.sparql-output-graph}}", 
      "--var-service", "{{inputs.parameters.sparql-service-url}}", ]
    env:
    - name: SPARQLPASSWORD
      valueFrom:
        secretKeyRef:
          name: d2s-sparql-password
          key: password