reanahub / reana-client

REANA command-line client
http://reana-client.readthedocs.io/
MIT License
10 stars 44 forks source link

validate: detecting misplaced environment clause for Serial workflows #679

Closed tiborsimko closed 7 months ago

tiborsimko commented 8 months ago

The following Serial workflow is wrong; note the misplaced "environment" clause for the first step:

version: 0.6.0
inputs:
  files:
    - code/gendata.C
    - code/fitdata.C
  parameters:
    events: 20000
    data: results/data.root
    plot: results/plot.png
workflow:
  type: serial
  specification:
    environment:
      name: 'docker.io/reanahub/reana-env-root6:6.18.04'
    steps:
      - name: gendata
        kubernetes_memory_limit: '256Mi'
        commands:
        - mkdir -p results && root -b -q 'code/gendata.C(${events},"${data}")'
      - name: fitdata
        environment: 'docker.io/reanahub/reana-env-root6:6.18.04'
        kubernetes_memory_limit: '256Mi'
        commands:
        - root -b -q 'code/fitdata.C("${data}","${plot}")'
outputs:
  files:
    - results/plot.png

However it appears to pass the validation:

$ reana-client validate
==> Verifying REANA specification file... reana.yaml
  -> SUCCESS: Valid REANA specification file.
==> Verifying REANA specification parameters...
  -> SUCCESS: REANA specification parameters appear valid.
==> Verifying workflow parameters and commands...
  -> SUCCESS: Workflow parameters and commands appear valid.
==> Verifying dangerous workflow operations...
  -> SUCCESS: Workflow operations appear valid.

And the execution obviously fails:

$ reana-client logs -w test

==> Workflow engine logs
Workflow exited unexpectedly.
'environment'
tiborsimko commented 7 months ago

1) If we go for fuller validation of workflow.specification.steps for Serial workflows, then here is I think the fairly complete list of clauses that we allow in the Serial steps specification:

If you encounter something else, then a WARNING should be emitted. (But no ERROR so that we can add new optional keywords to the list and so that old clients would work with newer servers.)

2) We could also configure a list of keywords that should not be there in workflow.specification.steps for the Serial workflow engine in order to catch some common troubles:

3) We could also validate top-level workflow.resources clause that may be empty or may contain one of the following:

giuseppe-steduto commented 7 months ago

Great list, thanks!

  • kubernetes_job_timeout: string

Since it's the number of seconds, this should be an integer, right? (https://docs.reana.io/advanced-usage/compute-backends/kubernetes/#custom-job-timeouts)