clamsproject / wfe-pipeline-runner


Use different configuration file #7

Closed marcverhagen closed 3 years ago

marcverhagen commented 3 years ago

The pipeline configuration file is a docker-compose configuration file. This differs from the approach in https://github.com/clamsproject/appliance, which uses a much simpler configuration file:

archive_path: ~/demo-archive-less

apps:

  audio-segmenter:
    enabled: True
    repository: https://github.com/clamsproject/app-audio-segmenter.git
    description: "Audio segmentation based on acoustic classifier"
    type: Audio

Adopt that approach or something like it because it is much simpler for the user. One thing I may want to change is to add an image property. The name of the application container for the app in the above example would be audio-segmenter.

marcverhagen commented 3 years ago

Addendum. Any config file that can be used by the appliance code should be usable here as well.

marcverhagen commented 3 years ago

One problem with the configuration file used by the appliance repository is that it does not need to define the order of the apps, since that is done in Galaxy. Reusing it here would introduce a somewhat obscure difference between the two: order would be meaningful for pipelines but not for appliances. There are also elements in the appliance configuration file that mean nothing to the pipeline script as currently constructed, like enabled, description and type.

Yet a lot of the docker-compose configuration is boilerplate that can be generated automatically. I am now thinking of using something like the following:


mount: /local/data
services:
  tokenizer:
    name: pipeline_tokenizer
    image: clams-tokenizer
    repository: ../app-nlp-example
  spacy:
    name: pipeline_spacy
    image: clams-spacy-nlp
    repository: ../app-spacy-nlp

The docker compose file can be created from this file. Some extra machinery is needed if we want to add a container for the pipeline script, but maybe we want to do that by default.
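
To illustrate how the boilerplate could be generated, here is a minimal Python sketch. It assumes the simplified file has already been parsed into a dictionary (for example with a YAML library). The function name `build_compose`, the `/data` directory inside the container, and the mapping of `name`, `image`, `repository` and `mount` onto compose keys are assumptions made for illustration, not what the pipeline script actually does.

```python
# Parsed form of the simplified configuration shown above.
config = {
    "mount": "/local/data",
    "services": {
        "tokenizer": {
            "name": "pipeline_tokenizer",
            "image": "clams-tokenizer",
            "repository": "../app-nlp-example",
        },
        "spacy": {
            "name": "pipeline_spacy",
            "image": "clams-spacy-nlp",
            "repository": "../app-spacy-nlp",
        },
    },
}


def build_compose(config, container_data_dir="/data"):
    """Expand the simplified config into a docker-compose style dictionary.

    Assumption: each service is built from its repository directory and
    gets the host mount bound to the same directory inside the container.
    """
    services = {}
    for key, spec in config["services"].items():
        services[key] = {
            "container_name": spec["name"],
            "image": spec["image"],
            "build": {"context": spec["repository"]},
            "volumes": [f"{config['mount']}:{container_data_dir}"],
        }
    return {"services": services}


compose = build_compose(config)
```

The resulting dictionary could then be written out with a YAML library to produce the actual docker-compose file.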

marcverhagen commented 3 years ago

Hmmm, forgot for a second that the order of the keys in the above YAML mapping does not matter. Better to use a list, which does preserve order:

mount: /local/data

services:

  - tokenizer:
      name: pipeline_tokenizer
      image: clams-tokenizer
      repository: ../app-nlp-example

  - spacy:
      name: pipeline_spacy
      image: clams-spacy-nlp
      repository: ../app-spacy-nlp
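
A minimal Python sketch of how the list form could be read back in pipeline order, assuming the file has already been parsed into a dictionary. The helper name `ordered_services` and the check that each list item has exactly one key are assumptions made for illustration.

```python
# Parsed form of the list-based configuration shown above:
# "services" is a list of single-key mappings, so order is explicit.
pipeline_config = {
    "mount": "/local/data",
    "services": [
        {"tokenizer": {"name": "pipeline_tokenizer",
                       "image": "clams-tokenizer",
                       "repository": "../app-nlp-example"}},
        {"spacy": {"name": "pipeline_spacy",
                   "image": "clams-spacy-nlp",
                   "repository": "../app-spacy-nlp"}},
    ],
}


def ordered_services(config):
    """Return (service_key, spec) pairs in the order given in the file."""
    ordered = []
    for entry in config["services"]:
        if len(entry) != 1:
            raise ValueError(f"expected exactly one service per list item, got {entry!r}")
        # Unpack the single key/value pair of this list item.
        (key, spec), = entry.items()
        ordered.append((key, spec))
    return ordered


pipeline_order = [key for key, _ in ordered_services(pipeline_config)]
```
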

marcverhagen commented 3 years ago

Done as part of 7edb0566.