Open SooLee opened 6 years ago
Hello @SooLee . We should generate a better error in this instance. Can you share your file?
You may also find a solution via cwl-upgrader
@ https://github.com/common-workflow-language/cwl-upgrader which upgrades draft-3
to v1.0
.
Yes, I know. That tool doesn't work. It does not produce a proper json or yaml format file. The output of cwl-upgrader is not a cwl.
This resolved the error.
id
: #prefix.name
-> #prefix/name
for all STEP inputs and outputs and the source of all WORKFLOW outputs and all STEP inputsIt would be very useful to document all the differences between draft-3 and version 1.0, since cwltool doesn't support draft-3 any more. We're forced to convert and there is really no easy way. :(
@SooLee I am very interested in cwl-upgrader
having correct functionality. If you could provide me with the draft-3
workflow that was causing problems then I will fix cwl-upgrader
. If needed, you can email me directly: mrc@commonwl.org
We want upgrading from draft-3
to be easy :-)
I think the main problem is that it is not JSON. That is probably an easy fix. Below is an example output. It should first have 1) double quotes around field names and values and 2) outermost braces removed. It should also include change #5 that I mentioned above (i.e. '.' -> '/' in the id and source strings).
{class: Workflow, inputs: [{id: '#input_file', type: ['null', File]}], steps: [{id: '#md5',
out: [{id: '#md5.report'}], run: md5.cwl, in: [{source: '#input_file', id: '#md5.input_file'}]},
{id: '#validatefiles', out: [{id: '#validatefiles.report'}], run: validate.cwl,
in: [{source: '#input_file', id: '#validatefiles.input_file'}, {id: '#validatefiles.type'}]}],
outputs: [{type: ['null', File], id: '#validatefiles_report', outputSource: '#validatefiles.report'},
{type: ['null', File], id: '#md5_report', outputSource: '#md5.report'}], cwlVersion: v1.0}
The output of cwl-upgrader
is in YAML, the syntax of CWL since before draft-3
🙂
There may be a bug with the processing, but it will be much easier for me to fix it if I can examine the initial CWL draft-3
document
I see. I didn't know there was a one-liner version for YAML. The original cwl looks as below.
{
"class": "Workflow",
"inputs": [
{
"id": "#input_file",
"type": [
"null",
"File"
]
}
],
"steps": [
{
"id": "#md5",
"outputs": [
{
"id": "#md5.report"
}
],
"run": "md5.cwl",
"inputs": [
{
"source": "#input_file",
"id": "#md5.input_file"
}
]
},
{
"id": "#validatefiles",
"outputs": [
{
"id": "#validatefiles.report"
}
],
"run": "validate.cwl",
"inputs": [
{
"source": "#input_file",
"id": "#validatefiles.input_file"
},
{
"id": "#validatefiles.type"
}
]
}
],
"outputs": [
{
"type": [
"null",
"File"
],
"id": "#validatefiles_report",
"source": "#validatefiles.report"
},
{
"type": [
"null",
"File"
],
"id": "#md5_report",
"source": "#md5.report"
}
],
"cwlVersion": "draft-3",
"requirements": [
{
"class": "InlineJavascriptRequirement"
}
]
}
Ah, the strange output was because it wasn't recognized as a valid draft-3
document for some reason.
Skipping non draft-3 CWL document
It may be wrong. Was this written by hand, or is it from Seven Bridges or another source?
I see two deviations from the draft-3
syntax that is tripping up cwl-upgrader
:
"cwlVersion": "draft-3"
is missing the cwl
prefix from the version number; should be "cwlVersion": "cwl:draft-3"
validatefiles
step: validatefiles.type
"run": "validate.cwl",
"inputs": [
{
"source": "#input_file",
"id": "#validatefiles.input_file"
},
{
"id": "#validatefiles.type"
}
Using a new branch of cwl-upgrader
I now get the following output
class: Workflow
cwlVersion: v1.0
inputs:
- {id: input_file, type: 'File?'}
outputs:
- {id: validatefiles_report, outputSource: validatefiles.report, type: 'File?'}
- {id: md5_report, outputSource: md5.report, type: 'File?'}
requirements:
- {class: InlineJavascriptRequirement}
steps:
md5:
in:
input_file: {source: input_file}
out: [report]
validatefiles:
in:
input_file: {source: input_file}
out: [report]
Thanks for providing your workflow source. The cwl-upgrader
was mainly focused on draft-3
CommandLineTool
conversion, so I'm glad to have the opportunity to improve its Workflow
conversion.
It is a valid draft-3 cwl and we have been running it with an older version of cwltool with no problem. I noticed cwl-upgrader sometimes converts it to v1.0 as I posted above and sometimes it doesn't convert and gives you that 'skipping non draft-3 cwl document' message, on the same file. Not sure what I did differently.
I wrote a converter myself in the mean time that implements changes in above #1~#5 + 'description'->'doc') but probably won't work comprehensively on all draft-3 cwl files. Works for our cwl files.
I just saw your last message. Let me test it.
Yes, I'm loosening the requirement about cwl:draft-3
in https://github.com/common-workflow-language/cwl-upgrader/pull/16 (when I ran it earlier I manually added the cwl
prefix in)
The conversion was successful (looks much nicer too :)) but when I ran it using the latest version of cwltool I get the following error.
Traceback (most recent call last):
File "/home/ec2-user/venv/cwl/local/lib/python2.7/site-packages/cwltool/main.py", line 469, in main
do_validate=args.do_validate)
File "/home/ec2-user/venv/cwl/local/lib/python2.7/site-packages/cwltool/load_tool.py", line 259, in validate_document
"\n{}".format("\n".join(versions)))
ValidationException: The CWL reference runner no longer supports pre CWL v1.0 documents. Supported versions are:
v1.0
v1.1.0-dev1 (with --enable-dev flag only)
Actually, it didn't really convert. The output is still draft-3
class: Workflow
cwlVersion: draft-3
inputs:
- id: '#input_file'
type: ['null', File]
outputs:
- id: '#validatefiles_report'
source: '#validatefiles.report'
type: ['null', File]
- id: '#md5_report'
source: '#md5.report'
type: ['null', File]
requirements:
- {class: InlineJavascriptRequirement}
steps:
- id: '#md5'
inputs:
- {id: '#md5.input_file', source: '#input_file'}
outputs:
- {id: '#md5.report'}
run: md5.cwl
- id: '#validatefiles'
inputs:
- {id: '#validatefiles.input_file', source: '#input_file'}
- {id: '#validatefiles.type'}
outputs:
- {id: '#validatefiles.report'}
run: validate.cwl
Okay, I see the error
Traceback (most recent call last):
File "/home/ec2-user/venv/cwl/bin/cwl-upgrader", line 11, in <module>
sys.exit(main())
File "/home/ec2-user/venv/cwl/local/lib/python2.7/site-packages/cwlupgrader/main.py", line 17, in main
draft3_to_v1_0(document)
File "/home/ec2-user/venv/cwl/local/lib/python2.7/site-packages/cwlupgrader/main.py", line 24, in draft3_to_v1_0
_draft3_to_v1_0(document)
File "/home/ec2-user/venv/cwl/local/lib/python2.7/site-packages/cwlupgrader/main.py", line 45, in _draft3_to_v1_0
inp["source"] = inp["source"].lstrip('#')
KeyError: 'source'
I will test more later.
That error is due to the extra input with no source in the validatefiles
step: validatefiles.type
"run": "validate.cwl",
"inputs": [
{
"source": "#input_file",
"id": "#validatefiles.input_file"
},
{
"id": "#validatefiles.type"
}
You'll need to remove
,
{
"id": "#validatefiles.type"
}
by hand to get it to work.
I thought you loosened it, just before this change (what happened?). It works for most other cwl files we have, except the ones with items without source.
This is with a different workflow, containing scatter. The converted cwl had a null scatter. old:
"scatter": "#pairsam-parse-sort.bam"
new:
scatter: ''
As a result, when I ran the cwl, I get the following error.
Traceback (most recent call last):
File "/home/ec2-user/venv/cwl/local/lib/python2.7/site-packages/cwltool/main.py", line 469, in main
do_validate=args.do_validate)
File "/home/ec2-user/venv/cwl/local/lib/python2.7/site-packages/cwltool/load_tool.py", line 276, in validate_document
workflowobj, fileuri, checklinks=do_validate)
File "/home/ec2-user/venv/cwl/src/schema-salad/schema_salad/ref_resolver.py", line 900, in resolve_all
self.validate_links(document, u"", all_doc_ids)
File "/home/ec2-user/venv/cwl/src/schema-salad/schema_salad/ref_resolver.py", line 1060, in validate_links
raise errors[0]
Actually, about the above case, it may be due to having '.' instead of '/' (though I think scatter also needs to be fixed)
in:
input_pairsams: {fdn_format: pairsam, source: pairsam-parse-sort.sorted_pairsam}
ValidationException: hi-c-processing-bam.cwl:41:1: checking field `steps`
hi-c-processing-bam.cwl:42:3: checking object `hi-c-processing-bam.cwl#pairsam-filter`
hi-c-processing-bam.cwl:47:5: checking field `in`
hi-c-processing-bam.cwl:49:7: checking object `hi-c-processing-bam.cwl#pairsam-filter/input_pairsam`
hi-c-processing-bam.cwl:49:44: Field `source` references unknown identifier `pairsam-markasdup.dupmarked_pairsam`, tried file:///home/ec2-user/pipelines-cwl/cwl_awsem_v1/hi-c-processing-bam.cwl#pairsam-markasdup.dupmarked_pairsam
hi-c-processing-bam.cwl:52:3: checking object `hi-c-processing-bam.cwl#pairsam-markasdup`
hi-c-processing-bam.cwl:57:5: checking field `in`
hi-c-processing-bam.cwl:58:7: checking object `hi-c-processing-bam.cwl#pairsam-markasdup/input_pairsam`
hi-c-processing-bam.cwl:58:44: Field `source` references unknown identifier `pairsam-merge.merged_pairsam`, tried file:///home/ec2-user/pipelines-cwl/cwl_awsem_v1/hi-c-processing-bam.cwl#pairsam-merge.merged_pairsam
hi-c-processing-bam.cwl:61:3: checking object `hi-c-processing-bam.cwl#pairsam-merge`
hi-c-processing-bam.cwl:66:5: checking field `in`
hi-c-processing-bam.cwl:67:7: checking object `hi-c-processing-bam.cwl#pairsam-merge/input_pairsams`
hi-c-processing-bam.cwl:67:45: Field `source` references unknown identifier `pairsam-parse-sort.sorted_pairsam`, tried file:///home/ec2-user/pipelines-cwl/cwl_awsem_v1/hi-c-processing-bam.cwl#pairsam-parse-sort.sorted_pairsam
Sorry for losing track of this thread @SooLee
This is with a different workflow, containing scatter. The converted cwl had a null scatter.
Do you happen to still have the original file around?
I converted a working draft-3 CWL file to a version 1.0 CWL file by changing the following.
"cwlVersion": "draft-3",
->"cwlVersion": "v1.0",
"inputs"
->"in"
for all STEP inputs (but not workflow inputs)"outputs"
->"out"
for all STEP outputs (but not workflow outputs)"source"
->"outputSource"
for all workflow outputs (but not step inputs)(My file doesn't have a field 'description')
I'm getting the following error when running the latest master of cwltool.
I'm not sure if I can get what exactly is wrong with this CWL file. Some help would be appreciated. Thanks.