Open alexiswl opened 3 years ago
Hey @alexiswl ; can you put an example packed workflow that exhibits this issue on https://gist.github.com/ or similar and drop the link here?
@alexiswl FYI, that file has id fields in its custom types, which is not formally part of the CWL standard: https://www.commonwl.org/v1.2/CommandLineTool.html#CommandInputRecordSchema
Hi @mr-c, do you know why this might be? The raw yaml is now publicly accessible at https://github.com/umccr/cwl-ica/blob/main/workflows/bcl-conversion/3.7.5/bcl-conversion__3.7.5.cwl
None of the schemas present have the id attribute in them either:
fastq-list-row: https://github.com/umccr/cwl-ica/blob/main/schemas/fastq-list-row/1.0.0/fastq-list-row__1.0.0.yaml
settings-by-samples: https://github.com/umccr/cwl-ica/blob/main/schemas/settings-by-samples/1.0.0/settings-by-samples__1.0.0.yaml
At the moment, in order to import these workflows that contain schemas through the CWL parser, I have to first import the schema object and then manually append the schema object to the namespace.
See:
https://github.com/umccr/cwl-ica/blob/main/src/classes/cwl.py#L135-L154
For packed CWL files this would be a little more difficult, since I first need to find the SchemaDefRequirement inside the graph and add its schemas to the $namespaces attribute of the graph.
I guess something like so would be a possible way to grab the schemas required for the workflow.
$ cwltool --pack bcl-conversion__3.7.5.cwl | \
jq --raw-output '.["$graph"][-1].requirements[] | select(.class=="SchemaDefRequirement") | .types[] | .["$import"]'
#settings-by-samples__1.0.0.yaml
#fastq-list-row__1.0.0.yaml
Where the jq component of this would be done in Python.
Still, it nonetheless seems quite hacky that this is a requirement.
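A rough Python equivalent of the jq filter above might look like the following. This is only a sketch: the helper name schema_imports is made up for illustration, and it assumes the packed document has already been loaded into a dict (for example with json.load) and that requirements is a list, as in the cwltool --pack output shown above.

```python
from typing import Any, Dict, List


def schema_imports(packed: Dict[str, Any]) -> List[str]:
    """Return the "$import" targets of SchemaDefRequirement entries.

    Mirrors the jq filter: only the last entry of "$graph" is inspected.
    """
    last_entry = packed["$graph"][-1]
    imports: List[str] = []
    for requirement in last_entry.get("requirements", []):
        # Skip every requirement that is not a SchemaDefRequirement
        if requirement.get("class") != "SchemaDefRequirement":
            continue
        for schema_type in requirement.get("types", []):
            # Only entries of the form {"$import": "..."} are of interest
            if isinstance(schema_type, dict) and "$import" in schema_type:
                imports.append(schema_type["$import"])
    return imports
```

Like the jq version, this only looks at the last $graph entry; looping over every entry instead would also pick up schemas required by embedded tools.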
@alexiswl As you can see, your helpful example has launched many fixes to cwltool --pack, the code in schema_salad that produces the parsers, and the schema of the CWL standards themselves (!).
Ultimately (when all is done, merged, and released) the answer to your question will be "Load the packed document like any other." :-)
FYI, here is my variation on your testing script
"""
Import a cwl file as a parser object
"""
import sys
from pathlib import Path
from schema_salad.utils import yaml_no_ts
# ^^ requires schema_salad >= 8.2
# does preserve_quotes=True and more
# Set path
cwl_file_path = Path(sys.argv[1])
# Load file as yaml dict
# Read in the cwl file from a json/yaml
with open(cwl_file_path, "r") as cwl_h:
    cwl_file_yaml = yaml_no_ts().load(cwl_h)
# Conditional import based on cwl version
if 'cwlVersion' not in cwl_file_yaml:
    print("Error - could not get the cwlVersion")
    sys.exit(1)
# Import parser based on CWL Version
if cwl_file_yaml['cwlVersion'] == 'v1.0':
    from cwl_utils import parser_v1_0 as parser
elif cwl_file_yaml['cwlVersion'] == 'v1.1':
    from cwl_utils import parser_v1_1 as parser
elif cwl_file_yaml['cwlVersion'] == 'v1.2':
    from cwl_utils import parser_v1_2 as parser
else:
    print("Version error. Did not recognise {} as a CWL version".format(cwl_file_yaml['cwlVersion']))
    sys.exit(1)
# Parse the document, using the file's absolute URI as the base URI
doc = parser.load_document_by_yaml(cwl_file_yaml, cwl_file_path.absolute().as_uri())
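As an aside, the if/elif chain in the script could also be written as a lookup table. The sketch below is one possible way to do that, not something from cwl_utils itself: the names PARSER_MODULES, parser_module_name and load_parser are hypothetical, and the module paths are simply the ones imported in the script above, so other cwl_utils releases may lay them out differently.

```python
import importlib
from types import ModuleType

# Assumed mapping from cwlVersion to the parser modules used in the script above
PARSER_MODULES = {
    "v1.0": "cwl_utils.parser_v1_0",
    "v1.1": "cwl_utils.parser_v1_1",
    "v1.2": "cwl_utils.parser_v1_2",
}


def parser_module_name(cwl_version: str) -> str:
    """Return the parser module path for a cwlVersion string."""
    try:
        return PARSER_MODULES[cwl_version]
    except KeyError:
        raise ValueError(
            f"Did not recognise {cwl_version!r} as a CWL version"
        ) from None


def load_parser(cwl_version: str) -> ModuleType:
    """Import the matching parser lazily, so only the needed one is loaded."""
    return importlib.import_module(parser_module_name(cwl_version))
```

With this, the script would call parser = load_parser(cwl_file_yaml['cwlVersion']) and handle the ValueError instead of calling sys.exit(1) in an else branch.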
Thanks for this @mr-c! I appreciate the feedback and very happy to know that this has fixed multiple parts!
Do you recommend the yaml_no_ts from https://github.com/common-workflow-language/schema_salad/blob/main/schema_salad/utils.py#L133 over ruamel's 'round-trip-load' from https://sourceforge.net/p/ruamel-yaml/code/ci/default/tree/main.py#l1132 ?
Is the only difference the loading of timestamps?
> Is the only difference the loading of timestamps?
Correct. Probably not needed in your case.
@alexiswl Can you try packing with https://github.com/rabix/sbpack ?
Thanks for the suggestion @mr-c, looks like this would handle most of the workarounds we're currently doing. Is there a 'local-only' functionality of this tool / a way to import a local packed file? We don't use the Seven Bridges endpoint.
Oh, I should have been more specific! It includes a local-only tool named cwlpack.
Hello,
Been playing around with how to import a packed cwl json file as a CWL parser object.
Here are my steps so far
Setup
First attempt:
SchemaSaladException: Cannot load $import without fileuri
Third attempt
ValidationException: - tried _RecordLoader but Expected a dict
Fourth attempt
ValidationException: - tried _RecordLoader but Expected a dict
Is this due to my workflow being a little bit too complicated for the parser and using record schemas?