Service that runs CWL workflows on VMs in an Openstack cloud.
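The project is made up of 3 scripts:
- __lando__ - server that listens for job messages, creates the VMs, and tells the workers what to do
- __lando_worker__ - program that runs on a VM to stage data, run the workflow, and store the output
- __lando_client__ - program that posts messages to lando, meant for testing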
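The major external components are:
- Rabbitmq - queues the messages passed between lando and its workers
- Openstack - cloud where the worker VMs are created
- bespin-api - web application that holds the job table and posts start_job messages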
Running job message flow (omitting Rabbitmq):
bespin-api posts a start_job message for lando
lando tells Openstack to create a VM that runs __lando_worker__
lando posts a stage_job message for __lando_worker__
__lando_worker__ downloads files for the job
__lando_worker__ sends stage_job_complete to lando
lando posts a run_job message for __lando_worker__
__lando_worker__ runs the CWL workflow for the job
__lando_worker__ sends run_job_complete to lando
lando posts a save_output message for __lando_worker__
__lando_worker__ stores the output of the job
__lando_worker__ sends save_output_complete to lando
lando tells Openstack to terminate the __lando_worker__ VM
Additionally, lando reads and updates the bespin-api job table as the job progresses.
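The message flow above can be modeled as an ordered list, which may help when tracing where a job is stuck. This is only an illustrative sketch: `JOB_MESSAGE_FLOW` and `next_message` are hypothetical names, not part of lando, and the VM create/terminate steps are left out because they are Openstack calls rather than queue messages.

```python
# Hypothetical model of the job message flow; this is not lando's own code.
# Each entry is (sender, receiver, message name) taken from the steps above.
JOB_MESSAGE_FLOW = [
    ("bespin-api", "lando", "start_job"),
    ("lando", "lando_worker", "stage_job"),
    ("lando_worker", "lando", "stage_job_complete"),
    ("lando", "lando_worker", "run_job"),
    ("lando_worker", "lando", "run_job_complete"),
    ("lando", "lando_worker", "save_output"),
    ("lando_worker", "lando", "save_output_complete"),
]


def next_message(current):
    """Return the message that follows `current` in the flow.

    Returns None after the last message, which is the point where lando
    would terminate the worker VM. Raises ValueError for unknown names.
    """
    names = [message for _, _, message in JOB_MESSAGE_FLOW]
    index = names.index(current)
    if index + 1 < len(names):
        return names[index + 1]
    return None
```

For example, `next_message("stage_job_complete")` gives `"run_job"`, matching the step where lando posts a run_job message once staging is done.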
Assumes you have installed Python 2.7, Openstack, Rabbitmq.
pip install git+https://github.com/Duke-GCB/lando.git
Run the docker image or use the development instructions at https://github.com/Duke-GCB/bespin-api/blob/master/README.md
This registers details about how to run the job in openstack (CPU/RAM, VM image, volume sizes)
See https://github.com/Duke-GCB/bespin-cwl/blob/master/scripts/post_questionnaire.sh for an example
Using the bespin superuser you created in the previous step, go into the admin interface and set up a job.
There are two config files that are used by lando.
- /etc/lando_config.yml - this is the main configuration file used by the server program (lando).
- /etc/lando_worker_config.yml - this is the configuration file used by the worker.
When using Openstack the server program creates and puts the worker's config file on the VM in the correct location.
/etc/lando_config.yml file:
# Rabbitmq settings
work_queue:
  host: 10.109.253.74 # ip address of the rabbitmq
  username: lando # username for lando server
  password: secret1 # password for lando server
  listen_queue: lando # queue that lando server should listen on
  worker_username: worker # username for lando worker
  worker_password: secret2 # password for lando worker
# General Openstack settings
cloud_settings:
  auth_url: http://10.109.252.9:5000/v3
  username: jpb67
  password: secret3
  user_domain_name: Default
  project_name: jpb67 # name of the project we will add VMs to
  project_domain_name: Default
# Bespin job API settings
bespin_api:
  url: http://localhost:8000/api
  username: jpb67
  password: secret4
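A quick sanity check of the server config before starting lando can save a debugging round trip. The sketch below assumes the file has already been parsed into a Python dict (for instance with a YAML library) and that every key in the sample above is required; the source does not say which keys are optional. `REQUIRED_SETTINGS` and `missing_settings` are hypothetical names, not part of lando.

```python
# Hypothetical check; the key names are read off the sample config above.
# Assumption: every key shown in the sample is required.
REQUIRED_SETTINGS = {
    "work_queue": ["host", "username", "password", "listen_queue",
                   "worker_username", "worker_password"],
    "cloud_settings": ["auth_url", "username", "password",
                       "user_domain_name", "project_name",
                       "project_domain_name"],
    "bespin_api": ["url", "username", "password"],
}


def missing_settings(config):
    """Return dotted names of settings absent from a parsed config dict."""
    missing = []
    for section, keys in REQUIRED_SETTINGS.items():
        # A missing or empty section counts as missing all of its keys.
        values = config.get(section) or {}
        for key in keys:
            if key not in values:
                missing.append("{}.{}".format(section, key))
    return missing
```

An empty list means all sections and keys from the sample are present; it says nothing about whether the credentials themselves are valid.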
If you are running with valid Openstack credentials you will not need to create a /etc/lando_worker_config.yml file.
The lando service does this for you.
rabbitmqctl add_user lando secret1
rabbitmqctl set_permissions -p / lando ".*" ".*" ".*"
rabbitmqctl add_user worker secret2
rabbitmqctl set_permissions -p / worker ".*" ".*" ".*"
You can start lando by simply running lando where it can see the /etc/lando_config.yml config file.
/etc/lando_config.yml
At the end of /etc/lando_config.yml add the following:
fake_cloud_service: True
This will cause lando to print a message telling you to run lando_worker.
/etc/lando_worker_config.yml file for fake cloud service:
host: 10.109.253.74
username: worker
password: secret2
queue_name: local_worker
The queue name local_worker is always used for workers when fake_cloud_service is True in /etc/lando_config.yml.
If you are running on OS X you may need to specify custom --tmpdir-prefix and --tmp-outdir-prefix flags for cwl.
You can replace the default cwl-runner command by adding lines similar to these:
cwl_base_command:
- "cwl-runner"
- "--debug"
- "--tmpdir-prefix=/Users/jpb67/Documents/work/tmp"
- "--tmp-outdir-prefix=/Users/jpb67/Documents/work/tmp"
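To show how a cwl_base_command list like the one above could turn into a full command line, here is a minimal sketch. It assumes the config is already parsed into a dict and that the workflow file and job order file are appended after the base command, which is the usual `cwl-runner [options] workflow job-order` form. `build_cwl_command` and `DEFAULT_CWL_BASE_COMMAND` are hypothetical names; how lando_worker actually assembles the command is not described here.

```python
# Hypothetical helper; not lando_worker's actual implementation.
DEFAULT_CWL_BASE_COMMAND = ["cwl-runner"]


def build_cwl_command(config, workflow_path, job_order_path):
    """Build the argument list used to run a CWL workflow.

    Uses config["cwl_base_command"] when present and non-empty,
    otherwise falls back to plain cwl-runner.
    """
    base = config.get("cwl_base_command") or DEFAULT_CWL_BASE_COMMAND
    # Copy the base list so the config dict is never modified.
    return list(base) + [workflow_path, job_order_path]
```

The result is a list suitable for `subprocess.call`, so paths with spaces in the prefix flags do not need shell quoting.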
This command will put a job in the rabbitmq queue for the lando server to receive.
This reads the config from /etc/lando_config.yml.
lando_client start_job 1
This command is just meant for testing purposes. In a typical use case this message would be queued by bespin-api.
This will listen for messages from the 'lando' rabbitmq queue.
This reads the config from /etc/lando_config.yml.
lando
It should display that it has received a message to run a job.
Since we have set fake_cloud_service: True in /etc/lando_config.yml, instead of trying to launch a VM it should print this message: Pretend we create vm: local_worker.
Finally it should put a staging data message in the worker's queue.
This reads the config from /etc/lando_worker_config.yml.
It should talk back and forth with the lando server while staging data, running the job, and storing output.
lando_worker
lando_worker should terminate once it completes the job.