galaxyproject / gxformat2

Galaxy Workflow Format 2
MIT License
9 stars 5 forks source link

CWL export validation not as strict as cwltool's #51

Open simleo opened 3 years ago

simleo commented 3 years ago

It was recently found that CWL abstract workflows generated with gxformat2 fail to validate with cwltool --validate (see https://github.com/ResearchObject/ro-crate-py/issues/33 for some background). However, test_export_abstract.py succeeds, despite using cwltool.main.resolve_and_validate_document. The reason is that cwltool --validate does not stop after calling resolve_and_validate_document. In particular, it calls make_tool, which tries to build a Workflow object and triggers this validation error.

To make the validation in the test as strict as cwltool --validate, the following change could be applied:

diff --git a/tests/test_export_abstract.py b/tests/test_export_abstract.py
index ae6c382..fe43171 100644
--- a/tests/test_export_abstract.py
+++ b/tests/test_export_abstract.py
@@ -11,6 +11,7 @@ from cwltool.main import (
     resolve_and_validate_document,
     tool_resolver,
 )
+from cwltool.load_tool import make_tool

 from gxformat2._yaml import ordered_dump, ordered_load
 from gxformat2.abstract import CWL_VERSION, from_dict
@@ -130,6 +131,7 @@ def _run_example(as_dict, out="test.cwl"):
         workflowobj,
         uri,
     )
+    make_tool(uri, loadingContext)
     return abstract_as_dict
mr-c commented 3 years ago

I would accept a PR to cwltool that provides a single call that does what you need :-)

simleo commented 3 years ago

I would accept a PR to cwltool that provides a single call that does what you need :-)

If you want to modify cwltool in light of this, I think it's a matter of what you would like the new semantics to be. I.e., do you think the current resolve_and_validate_document is somehow "incomplete", in the sense that it should also include the call to make_tool, which provides additional validation? Then the change could be something like:

diff --git a/cwltool/load_tool.py b/cwltool/load_tool.py
index 34dcf122..46bce6c4 100644
--- a/cwltool/load_tool.py
+++ b/cwltool/load_tool.py
@@ -425,8 +425,8 @@ def resolve_and_validate_document(
         visit_class(
             processobj, ("CommandLineTool", "Workflow", "ExpressionTool"), update_index
         )
-
-    return loadingContext, uri
+    tool = make_tool(uri, loadingContext)
+    return loadingContext, uri, tool

With consequent changes in all the code that uses resolve_and_validate_document. This would be backwards incompatible though.

Or would you rather have a new function that calls resolve_and_validate_document and then make_tool in sequence? If so, how would you name it?

mr-c commented 3 years ago

Lets call it recursive_resolve_and_validate_document.

Another option is to use the autogenerated Python CWL parser in cwl-utils. Though it doesn't produce identical messages as cwltool.

Specifically, compare to using cwl_utils.parser_v1_2.load_document(path) or cwl_utils.parser_v1_2.load_document_by_string(example_string), that might be simpler.

mr-c commented 3 years ago

Thank you @simleo ; a new cwltool version has been released with your new function: https://github.com/common-workflow-language/cwltool/releases/tag/3.0.20201117141248