isi-vista / vista-pegasus-wrapper

A higher-level API for ISI Pegasus, adapted to the quirks of the ISI Vista group
MIT License

Improved handling of docker images as a service during a workflow #97

Closed lichtefeld closed 3 years ago

lichtefeld commented 3 years ago

In some workflows, ISI and collaborators use Docker images to provide always-on services (such as a DB). By default, when Pegasus launches a Docker image as part of a job, the container only survives for the length of that job. If the image needs to persist longer, potentially until all documents have been processed, the default process will not work. Alternatively, a bash script can be launched as a Pegasus job that starts the Docker image, with a second script shutting the image down at the end.
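The start/stop pattern described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the wrapper's implementation: the function names, the container name `vista-db`, and the image `postgres:13` are hypothetical. Only the `docker run` and `docker stop` subcommands and their flags are standard Docker CLI.

```python
import shlex
from typing import List, Optional


def start_service_command(image: str, name: str, mount: Optional[str] = None) -> List[str]:
    """Build a `docker run` command that leaves the container running in the background.

    `mount` is an optional bind mount in `host_dir:container_dir` form.
    """
    # --detach returns immediately; --rm removes the container once it is stopped.
    command = ["docker", "run", "--detach", "--rm", "--name", name]
    if mount is not None:
        command += ["--volume", mount]
    command.append(image)
    return command


def stop_service_command(name: str) -> List[str]:
    """Build the matching `docker stop` command for the named container."""
    return ["docker", "stop", name]


def as_bash_script(command: List[str]) -> str:
    """Render a command as the body of a small bash script (hypothetical helper)."""
    quoted = " ".join(shlex.quote(part) for part in command)
    return "#!/usr/bin/env bash\nset -euo pipefail\n" + quoted + "\n"


if __name__ == "__main__":
    # Hypothetical service: a database container named "vista-db".
    print(as_bash_script(start_service_command("postgres:13", "vista-db")))
    print(as_bash_script(stop_service_command("vista-db")))
```

The two rendered scripts would then be submitted as the first and last jobs of the workflow, with every job that needs the service depending on the start job.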

To enable this approach, a few modifications are needed:

[ ] Scripts to start and stop the images (based on examples from @elizlee)
[ ] Support to stage files into and out of the Docker mount
[ ] Friction-free submission of jobs running in the container:
  [ ] Change the default configuration type from "sharedfs" if a container is being used
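As a rough sketch of the staging and submission items above, the helpers below build `docker cp` and `docker exec` commands for a container that is already running, and pick a data configuration. All function names and paths are hypothetical. The choice of "nonsharedfs" as the replacement for "sharedfs" is an assumption made for illustration, since the issue does not say which configuration should be used instead.

```python
from typing import List


def stage_in_command(container: str, host_path: str, container_path: str) -> List[str]:
    """Copy a file from the host into the running container."""
    return ["docker", "cp", host_path, f"{container}:{container_path}"]


def stage_out_command(container: str, container_path: str, host_path: str) -> List[str]:
    """Copy a file from the running container back to the host."""
    return ["docker", "cp", f"{container}:{container_path}", host_path]


def exec_command(container: str, job_command: List[str]) -> List[str]:
    """Run a job's command inside the already-running container."""
    return ["docker", "exec", container] + list(job_command)


def data_configuration(uses_container: bool, container_default: str = "nonsharedfs") -> str:
    """Pick a data configuration value for the workflow.

    Keeps "sharedfs" for ordinary jobs. The container default is an
    assumption for this sketch, not something decided in the issue.
    """
    return container_default if uses_container else "sharedfs"


if __name__ == "__main__":
    # Hypothetical job: stage a file in, run a script, stage the result out.
    print(stage_in_command("vista-db", "/nas/input.json", "/data/input.json"))
    print(exec_command("vista-db", ["python", "/data/process.py"]))
    print(stage_out_command("vista-db", "/data/output.json", "/nas/output.json"))
    print(data_configuration(uses_container=True))
```

Keeping the one-time-job path on "sharedfs" unless a container is explicitly requested leaves existing workflows unchanged.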

We also don't want to entirely remove the ability to use a Docker image for a one-time job, so care should be taken to configure the settings appropriately.

These changes are easier if file input/output management can be ignored on the wrapper backend for the moment, since we currently don't have a good way to handle files that are tracked in parameter files for Python jobs. @elizlee, would your desired use case for this addition need to input or output files from the active container? Alternatively, would it be OK if I only extend support for manual handling of files for now? I could then defer automatic file handling until the wrapper is revamped to support files as inputs and outputs via Pegasus, rather than relying on our own built-in assumption of running on a shared NAS.