common-workflow-language / cwljava

Java SDK for the Common Workflow Language standards

WorkflowStepInput source IDs incorrect when parsing workflow packed with cwlpack #61

Closed jdidion closed 2 years ago

jdidion commented 2 years ago

I used cwlpack to pack the conformance test scatter-wf4. The resulting packed workflow (shown below) validates with cwltool --validate. However, when I parse the packed workflow using cwljava, the WorkflowStepInput sources have the form "echo_in1/inp1" when they should be either "main/inp1" or just "inp1".

{
    "cwlVersion": "v1.2",
    "$graph": [
        {
            "id": "echo",
            "class": "CommandLineTool",
            "inputs": {
                "echo_in1": {
                    "type": "string",
                    "inputBinding": {}
                },
                "echo_in2": {
                    "type": "string",
                    "inputBinding": {}
                }
            },
            "outputs": {
                "echo_out": {
                    "type": "string",
                    "outputBinding": {
                        "glob": "step1_out",
                        "loadContents": true,
                        "outputEval": "$(self[0].contents)"
                    }
                }
            },
            "baseCommand": "echo",
            "arguments": [
                "-n",
                "foo"
            ],
            "stdout": "step1_out"
        },
        {
            "id": "main",
            "class": "Workflow",
            "inputs": {
                "inp1": "string[]",
                "inp2": "string[]"
            },
            "requirements": [
                {
                    "class": "ScatterFeatureRequirement"
                }
            ],
            "steps": {
                "step1": {
                    "scatter": [
                        "echo_in1",
                        "echo_in2"
                    ],
                    "scatterMethod": "dotproduct",
                    "in": {
                        "echo_in1": "inp1",
                        "echo_in2": "inp2"
                    },
                    "out": [
                        "echo_out"
                    ],
                    "run": "#echo"
                }
            },
            "outputs": [
                {
                    "id": "out",
                    "outputSource": "step1/echo_out",
                    "type": {
                        "type": "array",
                        "items": "string"
                    }
                }
            ]
        }
    ],
    "inputs": [],
    "outputs": [],
    "requirements": [
        {
            "class": "InlineJavascriptRequirement"
        }
    ]
}
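
To make the expected values concrete, here is a minimal sketch in Python (not cwljava) that walks a packed document and qualifies each step input source with the id of the enclosing workflow. This is the "main/inp1" form described above. The function name `step_input_sources` is hypothetical, the embedded document is a trimmed copy of the one above, and the sketch assumes map-style `steps` and `in` sections with sources written relative to the workflow.

```python
def step_input_sources(packed):
    """Return {"workflow/step/input": [qualified sources]} for a packed document.

    Hypothetical helper for illustration only. Assumes map-style "steps" and
    "in" sections; list-style sections are not handled in this sketch.
    """
    result = {}
    for process in packed.get("$graph", []):
        if process.get("class") != "Workflow":
            continue
        wf_id = process["id"].lstrip("#")
        for step_id, step in process.get("steps", {}).items():
            for input_id, value in step.get("in", {}).items():
                # A step input is either a bare source string or a dict with "source".
                source = value.get("source") if isinstance(value, dict) else value
                if source is None:
                    continue
                sources = source if isinstance(source, list) else [source]
                # Qualify with the workflow id, not with the step input id.
                result[f"{wf_id}/{step_id}/{input_id}"] = [
                    f"{wf_id}/{s.lstrip('#')}" for s in sources
                ]
    return result


# Trimmed copy of the packed workflow above (only the fields the sketch reads).
packed = {
    "cwlVersion": "v1.2",
    "$graph": [
        {"id": "echo", "class": "CommandLineTool"},
        {
            "id": "main",
            "class": "Workflow",
            "steps": {
                "step1": {
                    "in": {"echo_in1": "inp1", "echo_in2": "inp2"},
                    "run": "#echo",
                }
            },
        },
    ],
}

print(step_input_sources(packed))
```

The point of the sketch is only that the prefix comes from the workflow ("main"), whereas the cwljava result uses the step input name ("echo_in1") as the prefix.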
mr-c commented 2 years ago

On second look, the cwlpack output is a bit wonky; can you try again using https://github.com/rabix/sbpack/pull/25 ?

jdidion commented 2 years ago

There is no difference in the packed workflow between main and #25

mr-c commented 2 years ago

> There is no difference in the packed workflow between main and #25

Without the PR, cwlpack adds the following to the top-level dictionary, which should only contain $graph and cwlVersion:

    "inputs": [],
    "outputs": [],
    "requirements": [
        {
            "class": "InlineJavascriptRequirement"
        }
    ]
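
A quick way to check which cwlpack version produced a file is to list the unexpected top-level keys. The following is a minimal sketch in Python. The helper name `unexpected_top_level_keys` is hypothetical, and the allowed set simply encodes the statement above (only $graph and cwlVersion); it is not taken from sbpack or cwltool.

```python
# Allowed keys per the comment above; treat this set as an assumption.
ALLOWED_TOP_LEVEL = {"cwlVersion", "$graph"}


def unexpected_top_level_keys(packed):
    """Return the sorted top-level keys of a $graph document that are not allowed.

    Hypothetical helper for illustration. Documents without "$graph" are
    single-process documents and are not checked.
    """
    if "$graph" not in packed:
        return []
    return sorted(key for key in packed if key not in ALLOWED_TOP_LEVEL)


# Shape of the output produced without the PR (process bodies elided to empty lists).
without_pr = {
    "cwlVersion": "v1.2",
    "$graph": [],
    "inputs": [],
    "outputs": [],
    "requirements": [{"class": "InlineJavascriptRequirement"}],
}

print(unexpected_top_level_keys(without_pr))
```

An empty list means the document has the expected top-level shape. The sketch only reports the extra keys and does not remove them, since dropping `requirements` could change how the workflow runs.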
jdidion commented 2 years ago

You're right. It was picking up a different version of cwlpack on my path. It does not fix the cwljava issue though - the step input ID is still being parsed as echo_in1/inp1.

mr-c commented 2 years ago

Now that cwlpack is fixed, I can confirm that the same error occurs even without cwlpack, as the original document is already packed. The Python codegen does not exhibit this behaviour.