common-workflow-language / cwltool

Common Workflow Language reference implementation
https://cwltool.readthedocs.io/
Apache License 2.0
332 stars 230 forks source link

NetworkAccess requirement not applicable in cwltool:overrides #1754

Open alexiswl opened 1 year ago

alexiswl commented 1 year ago

Expected Behavior

Allow user to specify network access in cwltool:overrides OR explictly error out saying this is not an acceptable override.

Actual Behavior

Tell us what happens instead

We get a warning at the top of the script

overrides.json:3:5: Warning: checking item
overrides.json:4:7: Warning:   checking field `requirements`
overrides.json:5:9: Warning:     checking item
                    Warning:       Field `class` contains undefined reference to
                    `file:///home/alexiswl/170612_A00181_0011_AH2JK7DMXX_testing/NetworkAccess`

Input Json

{
  "bclconvert_run_configurations": [
    {
      "strict_mode": false,
      "output_directory": "170612_A00181_0011_AH2JK7DMXX_converted",
      "samplesheet": {
        "bclconvert_data": [
          {
            "index": "",
            "lane": 1,
            "sample_id": "PhiX"
          }
        ],
        "bclconvert_settings": {
          "override_cycles": "Y4N147;Y4N147"
        },
        "header": {
          "application": "NovaSeq FASTQ Only",
          "assay": "TruSeq HT",
          "chemistry": "Default",
          "date": "6/12/2017",
          "experiment_name": "170612",
          "file_format_version": 2,
          "iem_file_version": 4,
          "index_adapters": "Truseq",
          "instrument_type": "NovaSeq 6000",
          "workflow": "GenerateFASTQ"
        },
        "reads": {
          "read_1_cycles": 151,
          "read_2_cycles": 151
        }
      }
    }
  ],
  "bclconvert_run_input_directory": {
    "class": "Directory",
    "location": "170612_A00181_0011_AH2JK7DMXX/"
  },
  "runfolder_name": "170612_A00181_0011_AH2JK7DMXX",
  "cwltool:overrides": {
    "workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl#bcl_convert_run_step": {
      "requirements": {
        "NetworkAccess": {
          "networkAccess": true
        }
      }
    }
  }
}

Full Traceback

INFO /home/alexiswl/miniconda3/envs/cwltool-latest/bin/cwltool 3.1.20221018083734
INFO Resolved 'bclconvert-with-qc-pipeline__4.0.3/workflow.cwl' to 'file:///home/alexiswl/170612_A00181_0011_AH2JK7DMXX_testing/bclconvert-with-qc-pipeline__4.0.3/workflow.cwl'
input.json:42:5: Warning: checking item
input.json:43:7: Warning:   checking field `requirements`
input.json:44:9: Warning:     checking item
                 Warning:       Field `class` contains undefined reference to
                 `file:///home/alexiswl/170612_A00181_0011_AH2JK7DMXX_testing/NetworkAccess`
...

Your Environment

cwltool 3.1.20221018083734
mr-c commented 1 year ago

Hello @alexiswl

What is the cwlVersion of bclconvert-with-qc-pipeline__4.0.3/workflow.cwl ? (Bonus if you can share a link to that as well)

alexiswl commented 1 year ago

Version of the workflow is v1.1

Pipeline is here: https://github.com/umccr/cwl-ica/tree/bclconvert-4.0.3/workflows/bclconvert-with-qc-pipeline/4.0.3 (it's on a branch though so will zip up a copy for you).

I've added Network Access to the tool directly now but should only be necessary when the container is a dragen fpga container

bclconvert-with-qc-pipeline__4.0.3.zip

alexiswl commented 1 year ago

I can replicate this with a simple tabix workflow if that is going to be easier

mr-c commented 1 year ago
$ cwltool workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl 1754-inputs.json 
INFO /home/michael/cwltool/env3.10/bin/cwltool 3.1.20221018083734
INFO Resolved 'workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl' to 'file:///home/michael/cwltool/1754/bclconvert-with-qc-pipeline__4.0.3/workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl'
ERROR Tool definition failed validation:
1754-inputs.json:1:1:   while parsing a flow mapping
1754-inputs.json:41:23:   expected ',' or '}', but got '{'
alexiswl commented 1 year ago

ERROR Tool definition failed validation: 1754-inputs.json:1:1: while parsing a flow mapping 1754-inputs.json:41:23: expected ',' or '}', but got '{'

Apologies, have updated the input json above to be a valid json

mr-c commented 1 year ago

No worries @alexiswl , I eventually figured it out. Does #1755 fix it for you?

alexiswl commented 1 year ago

Hi Michael,

Yes the error goes away but --net isn't updated. It seems I'm having a little issue setting the overrides on this one.

We've been using cwltool version 3.0.20201203173111 in production but I got the same warning results above using the most recent stable version (which is why I put 3.1.20221018083734 above).

When setting the overrides I specify the path to the subworkflow (relative to the main workflow) that invoked the tool I'd like to override, followed by a #, followed by the step id of that tool. This is consistent with the name used for the step inside cwltool --debug logs, the step is referenced as file:///home/alexiswl/170612_A00181_0011_AH2JK7DMXX_testing/bclconvert-with-qc-pipeline__4.0.3/workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl#bcl_convert_run_step, the workflow.cwl is in /home/alexiswl/170612_A00181_0011_AH2JK7DMXX_testing/bclconvert-with-qc-pipeline__4.0.3 so I set the overrides key as workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl#bcl_convert_run_step.

This works in 3.0.20201203173111 (I've tried with a standard DockerRequirement override) but not in the most stable version (downloaded from pip cwltool 3.1.20221018083734) nor when installed from source with the overrides-v1_1+ branch.

Update

I needed to preface the step with the id of the workflow THEN place in the name of the step

So the overrides key workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl#bcl_convert_run_step should be workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl#bclconvert--4.0.3/bcl_convert_run_step

alexiswl commented 1 year ago

@mr-c have retested this with the overrides v1_1+ branch with the updated overrides and can confirm this does work. Happy to close

alexiswl commented 1 year ago

I would like to point out however, there appears to be a discrepancy between cwltool version 3.0.20201203173111 and cwltool version 3.1.20221018083734 as the ID required for a cwltool overrides.

In this case workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl#bcl_convert_run_step worked for 2020 but not for 2022 and workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl#bclconvert--4.0.3/bcl_convert_run_step didn't work for 2020 but did work for 2022.

mr-c commented 1 year ago

I would like to point out however, there appears to be a discrepancy between cwltool version 3.0.20201203173111 and cwltool version 3.1.20221018083734 as the ID required for a cwltool overrides.

In this case workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl#bcl_convert_run_step worked for 2020 but not for 2022 and workflows/bclconvert/4.0.3/bclconvert__4.0.3.cwl#bclconvert--4.0.3/bcl_convert_run_step didn't work for 2020 but did work for 2022.

Thanks for the update. Yes, there was a fix to how various items in the workflow DAG are identified. Are there examples in the readme or other docs that need updating?

alexiswl commented 1 year ago

Thanks @mr-c

Are there examples in the readme or other docs that need updating?

I think the documentation here does make sense for users who already understand the pattern, however it isn't easy to take the example used and apply it to a workflow with greater complexity, such as one with subworkflows. The documentation reflects the DAG identification for the latest CWLtool version, for the 2020 version currently used in our production systems I've noted this discrepancy in our docs.

Personally, I used the CWL debug logs to help navigate how step IDs are set for use in overrides.

mr-c commented 1 year ago

Thanks @mr-c

Are there examples in the readme or other docs that need updating?

I think the documentation here does make sense for users who already understand the pattern, however it isn't easy to take the example used and apply it to a workflow with greater complexity, such as one with subworkflows. The documentation reflects the DAG identification for the latest CWLtool version, for the 2020 version currently used in our production systems I've noted this discrepancy in our docs.

Personally, I used the CWL debug logs to help navigate how step IDs are set for use in overrides.

I think a --print-overrides-targets option is needed; users shouldn't have to dig through the logs