common-workflow-language / cwltool

Common Workflow Language reference implementation
https://cwltool.readthedocs.io/
Apache License 2.0

Regression in handling enum type option #704

Closed partha-edico closed 5 years ago

partha-edico commented 6 years ago

Expected Behavior

cwltool should handle an input whose type is an optional enum (a union of 'null' and an enum), such as the ls_format input in the workflow code below.

Actual Behavior

As of release 1.0.20180322194411 (the previous release, 1.0.20180306163216, worked fine), cwltool reports parsing errors such as: => cwl_test.yaml:13:7: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'

Workflow Code

cwl_tool.yaml

baseCommand: /bin/ls
class: CommandLineTool
cwlVersion: v1.0
hints: []
requirements:
- {class: InlineJavascriptRequirement}
inputs:
  ls_format:
    type:
    - 'null'
    - type: enum
      symbols: ['across','commas','long']
    label: Format type
    inputBinding: {position: 3, prefix: --format, itemSeparator: '='}
outputs:
  bam_output:
    type: File?
    outputBinding: {glob: '*.bam'}
stderr: stderr.txt
stdout: stdout.txt

test_input.yaml

ls_format: 'long'

Full Traceback

No exception

Your Environment

tetron commented 6 years ago

Is this a fatal error or a warning?

partha-edico commented 6 years ago

Hi Peter, this is not a fatal error. cwltool keeps running and doesn't exit. However, there seem to be binding errors with a ripple effect that prevents my full CWL from being parsed properly. The example I gave is a very simple one that anyone can run. When I try to run our tool (called "dragen"), I get the error below:

Resolved '/staging/eTES/working/cwl_tool.yaml' to 'file:///staging/eTES/working/cwl_tool.yaml'
../working/cwl_tool.yaml:108:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:108:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:121:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:121:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:133:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:133:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:198:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:198:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:229:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:229:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:243:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:243:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:256:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:256:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:269:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:269:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:282:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:282:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:303:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
../working/cwl_tool.yaml:303:13: invalid field `name`, expected one of: 'symbols', 'type', 'label', 'inputBinding'
Got workflow error
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/cwltool/executors.py", line 99, in run_jobs
    for r in jobiter:
  File "/usr/lib/python2.7/site-packages/cwltool/command_line_tool.py", line 341, in job
    builder = self._init_job(job_order, **kwargs)
  File "/usr/lib/python2.7/site-packages/cwltool/process.py", line 595, in _init_job
    builder.bindings.extend(builder.bind_input(self.inputs_record_schema, builder.job, discover_secondaryFiles=kwargs.get("toplevel")))
  File "/usr/lib/python2.7/site-packages/cwltool/builder.py", line 183, in bind_input
    bindings.extend(self.bind_input(f, datum[f["name"]], lead_pos=lead_pos, tail_pos=f["name"], discover_secondaryFiles=discover_secondaryFiles))
  File "/usr/lib/python2.7/site-packages/cwltool/builder.py", line 159, in bind_input
    return self.bind_input(schema, datum, lead_pos=lead_pos, tail_pos=tail_pos, discover_secondaryFiles=discover_secondaryFiles)
  File "/usr/lib/python2.7/site-packages/cwltool/builder.py", line 175, in bind_input
    bindings.extend(self.bind_input(st, datum, lead_pos=lead_pos, tail_pos=tail_pos, discover_secondaryFiles=discover_secondaryFiles))
  File "/usr/lib/python2.7/site-packages/cwltool/builder.py", line 201, in bind_input
    self.bind_input(itemschema, item, lead_pos=n, tail_pos=tail_pos, discover_secondaryFiles=discover_secondaryFiles))
  File "/usr/lib/python2.7/site-packages/cwltool/builder.py", line 240, in bind_input
    checkFormat(datum, self.do_eval(schema["format"]), self.formatgraph)
  File "/usr/lib/python2.7/site-packages/cwltool/builder.py", line 79, in checkFormat
    raise validate.ValidationException(u"Missing required 'format' for File %s" % af)
ValidationException: Missing required 'format' for File ordereddict([('class', 'File'), ('location', 'file:///staging/eTES/inputs/54321_ATGTCA_L001_R1_001.fastq.gz'), ('size', 376225573), ('basename', '54321_ATGTCA_L001_R1_001.fastq.gz'), ('nameroot', '54321_ATGTCA_L001_R1_001.fastq'), ('nameext', '.gz')])
Workflow error, try again with --debug for more information:
Missing required 'format' for File ordereddict([('class', 'File'), ('location', 'file:///staging/eTES/inputs/54321_ATGTCA_L001_R1_001.fastq.gz'), ('size', 376225573), ('basename', '54321_ATGTCA_L001_R1_001.fastq.gz'), ('nameroot', '54321_ATGTCA_L001_R1_001.fastq'), ('nameext', '.gz')])

Again, this worked fine in the preceding release, and I noticed that changes have been made to the bind_input() function in commits such as 5e3947c822f6ad3a10880ea042a75a86db9da50c

tetron commented 6 years ago

This looks like two separate problems. The "name" error is a bug in the cwl v1.0 schema that seems to have been uncovered by some other changes.

The "format" error message is much too cryptic, but I think it is telling you that there is a field in your input definition which specifies the input should have a certain file format, but the input file is missing a "format" field. This wasn't an error before because it wasn't being checked as intended before, due to a bug.

Can you supply your files or a reproducible test case?

partha-edico commented 6 years ago

Hi Peter, I greatly appreciate your prompt help! It makes sense now that these are two separate, unrelated issues.

1) I don't understand the "name" error you refer to. The invalid-field error mentions "name" rather than "type", "symbols", etc., but the CWL I provided has 'type' and 'symbols' and no field called 'name'. How would I fix the CWL to avoid this error?

2) You are correct that my job input file was missing the "format" field. When I insert this into my input description, I do not get this Validation exception and the job runs as expected. I guess this missing field was never checked before, but now it appears it is mandatory to supply it? The version 1.0 standard seems a bit unclear that this was needed (perhaps I misunderstood).

Thanks Again!

tetron commented 6 years ago

You're right, it is being added internally and while it is necessary to satisfy the component, it definitely shouldn't be producing warnings like that. That's a bug.
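For readers puzzled by how a warning can name a field that is not in the file, here is a minimal Python sketch of the mechanism described above. It is not cwltool's or schema-salad's code: `check_fields`, `ALLOWED_ENUM_FIELDS`, and the `name` value are hypothetical, and the only things taken from this thread are the list of expected fields and the wording of the warning.

```python
# Hypothetical illustration only; this is not cwltool's or schema-salad's code.
ALLOWED_ENUM_FIELDS = ("symbols", "type", "label", "inputBinding")


def check_fields(schema, allowed=ALLOWED_ENUM_FIELDS):
    """Return one warning string per key in `schema` that is not in `allowed`."""
    expected = ", ".join("'%s'" % field for field in allowed)
    return [
        "invalid field `%s`, expected one of: %s" % (key, expected)
        for key in schema
        if key not in allowed
    ]


# The enum exactly as written in the tool definition: no `name` key.
as_written = {"type": "enum", "symbols": ["across", "commas", "long"]}

# The same enum after a loading step attaches an internal identifier.
# The value of `name` here is made up for the example.
after_loading = dict(as_written, name="ls_format_enum")

print(check_fields(as_written))     # no warnings for the document as written
print(check_fields(after_loading))  # one warning, about the injected `name`
```

Under these assumptions, the document as written passes the strict field check, and the warning only appears once the extra key has been attached, which would explain why nothing in the CWL file itself needs to change.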

The format is required when you have specified a format for an input field. It is doing correctness checking that an input file is a file format that makes sense for the tool. You either need to annotate the file correctly in the input document or remove the format constraint from the tool definition.
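The rule in the previous paragraph can be sketched in a few lines of Python. This is an assumption-laden illustration, not cwltool's implementation: `check_format`, `FormatError`, and the format IRI are hypothetical, and the comparison is plain string equality, whereas a real checker may also accept related formats.

```python
# Hypothetical sketch; names and structure are illustrative, not cwltool's API.
class FormatError(Exception):
    """Raised when a File does not satisfy the tool's format constraint."""


def check_format(file_obj, required_format):
    """Check a File object against the format declared on the tool input."""
    if required_format is None:
        return  # the tool declares no format constraint, so nothing to check
    actual = file_obj.get("format")
    if actual is None:
        raise FormatError(
            "Missing required 'format' for File %s" % file_obj.get("basename")
        )
    if actual != required_format:
        raise FormatError(
            "File %s has format %s, expected %s"
            % (file_obj.get("basename"), actual, required_format)
        )


# Made-up IRI standing in for whatever format the tool definition declares.
FASTQ = "http://example.org/formats/fastq"

unannotated = {"class": "File", "basename": "54321_ATGTCA_L001_R1_001.fastq.gz"}
annotated = dict(unannotated, format=FASTQ)

check_format(unannotated, None)  # passes: constraint removed from the tool
check_format(annotated, FASTQ)   # passes: file annotated in the input document
try:
    check_format(unannotated, FASTQ)
except FormatError as err:
    print(err)
```

The two passing calls correspond to the two fixes mentioned above: either drop the format constraint from the tool definition, or annotate the file in the input document.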

partha-edico commented 6 years ago

OK, thank you again for the clarification Peter. :-)

mr-c commented 5 years ago

I am no longer able to reproduce this issue with cwltool so it looks like this is fixed; thank you @partha-edico for the report!