viash-io / viash

script + metadata = standalone component
https://viash.io
GNU General Public License v3.0

Refactor __includes__ using custom yaml tags #288

Open rcannood opened 1 year ago

rcannood commented 1 year ago

circe uses snakeyaml to parse YAML into Json. snakeyaml supports YAML 1.1, so by extension Viash can use anchors and aliases, including the << merge key.

For example:

functionality:
  name: &name foo
  namespace: *name
  arguments:
    - &input
      type: file
      name: --input
      description: A h5ad file
      example: file.h5ad
      required: true
    - <<: *input
      name: --output
      direction: output

This is the same as:

functionality:
  name: foo
  namespace: foo
  arguments:
    - type: file
      name: --input
      description: A h5ad file
      example: file.h5ad
      required: true
    - type: file
      name: --output
      description: A h5ad file
      example: file.h5ad
      required: true
      direction: output

snakeyaml also supports tags, which could be used to replace the __include__: base.yaml functionality we currently support. Concretely, we could support the following functionality:

functionality:
  name: &name foo
  namespace: *name
  arguments:
    - name: --input
      !include file_format.yaml
    - name: --output
      direction: output
      !include file_format.yaml

However, at the moment circe processes custom tags by turning them into attributes (see here) and does not allow for custom tag handlers. We could try to fork circe-yaml to allow this functionality.

rcannood commented 1 year ago

Edit: the last code block is not valid YAML; it seems I misunderstood the YAML tag spec ;)

Tags can only be used to annotate values, e.g.

functionality:
  name: &name foo
  namespace: *name
  resources:
    - type: r_script
      path: !file script.R

If tags were used in a Viash config, they would represent what we currently use the type field for, like so:

functionality:
  name: foo
  arguments:
    input: !string
      default: foo
    output: !file
      default: file.txt
  resources:
    - !r_script script.R
    - resources.txt
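
For comparison, a rough sketch of how the same component might be written with the existing type field instead of tags. The --input and --output argument names, and the idea that the bare resources.txt entry corresponds to a path-only resource, are assumptions made for illustration and are not stated in this thread.

```yaml
functionality:
  name: foo
  arguments:
    # 'type' carries the information that the !string / !file tags would carry
    - name: --input      # assumed name
      type: string
      default: foo
    - name: --output     # assumed name
      type: file
      default: file.txt
  resources:
    - type: r_script
      path: script.R
    - path: resources.txt   # assumed: no explicit type given
```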