ResearchObject / workflow-run-crate

Workflow Run RO-Crate profile
https://www.researchobject.org/workflow-run-crate/
Apache License 2.0

cwlprov_to_crate: test converting cwlprovs created for each CWL conformance test #24

Open mr-c opened 1 year ago

mr-c commented 1 year ago

https://github.com/common-workflow-language/cwl-v1.2/blob/1.2.1_proposed/CONFORMANCE_TESTS.md

@mr-c will test creation of the CWLProv RO Bundles (fixing any issues found), and then he will help @simleo create his own for testing with cwlprov_to_crate

mr-c commented 1 year ago

To make a CWLProv RO Bundle for each conformance test, do the following:

python3 -m venv cwlprov_env
. cwlprov_env/bin/activate
pip install cwltool cwltest
git clone https://github.com/common-workflow-language/cwl-v1.2/
cd cwl-v1.2
mkdir cwlprovs
cd cwlprovs
cwltest --tool cwltool --test ../conformance_tests.yaml --junit-verbose --junit-xml cwlprov_cwl_v1.2_full_results.xml --test-arg short_name==--provenance --verbose

The CWLProv RO Bundles will be in that cwlprovs directory, each directory named using the id field from https://github.com/common-workflow-language/cwl-v1.2/blob/1.2.1_proposed/conformance_tests.yaml

cwlprov_cwl_v1.2_full_results.xml will contain a complete log of all the runs
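To check that every test actually produced a bundle, something like the following sketch may help. It assumes each bundle is a directory named after the test id, as described above; the helper name `missing_bundles` and the example ids are only for illustration, and the list of expected ids has to be extracted from conformance_tests.yaml separately.

```python
import os
import tempfile
from pathlib import Path


def missing_bundles(expected_ids, bundles_dir):
    """Return the expected test ids that have no bundle directory.

    Assumption: each bundle is a directory directly under bundles_dir,
    named exactly after the test id.
    """
    present = {p.name for p in Path(bundles_dir).iterdir() if p.is_dir()}
    return sorted(set(expected_ids) - present)


# Small demo on a throwaway directory (not real cwltest output).
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "js-input-record"))
print(missing_bundles(["js-input-record", "legal_symlink"], demo_dir))
```

On a real run, `bundles_dir` would be the cwlprovs directory created above.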

Initial results:

354 tests passed, 10 failures, 0 unsupported features

mr-c commented 1 year ago

Summary of failures:

  1. legal_symlink: "symlink to file inside of working directory should be retrieved"
  2. modify_directory_content: missing the side effect from using InplaceUpdateRequirement
  3. listing_default_none: Load listing issues. I think we force "deep" Directory listings to capture more provenance
  4. listing_requirement_none: Load listing issues. I think we force "deep" Directory listings to capture more provenance
  5. listing_loadListing_none: Load listing issues. I think we force "deep" Directory listings to capture more provenance
  6. listing_requirement_shallow: Load listing issues. I think we force "deep" Directory listings to capture more provenance
  7. listing_loadListing_shallow: Load listing issues. I think we force "deep" Directory listings to capture more provenance
  8. mixed_version_v10_wf: "test v1.0 workflow document that runs other version"
  9. mixed_version_v11_wf: another mixed version failure
  10. invalid_syntax_mixed_v12_workflow: should have failed, but didn't

simleo commented 1 year ago

> The CWLProv RO Bundles will be in that cwlprovs directory, each directory named using the short_name field from https://github.com/common-workflow-language/cwl-v1.2/blob/1.2.1_proposed/conformance_tests.yaml

I don't see a short_name field in that file, what am I missing? BTW, the direct link is not working, I had to open the repository page and click on the file (GitHub problem?). After running the above command I got 16 RO bundles, is that expected? Here's the list:

dotproduct-dotproduct-scatter
dotproduct-simple-scatter
flat-crossproduct-flat-crossproduct-scatter
flat-crossproduct-simple-scatter
js-input-record
multiple-input-feature-requirement
nested-crossproduct-nested-crossproduct-scatter
nested-crossproduct-simple-scatter
output_reference_workflow_input
record-order-with-input-bindings
schemadef_types_with_import
simple-dotproduct-scatter
simple-flat-crossproduct-scatter
simple-nested-crossproduct-scatter
simple-simple-scatter
stdout_chained_commands

Regarding the tests, I got only 3 failures, but I don't know which ones they were. cwlprov_cwl_v1.2_full_results.xml entries are like:

<testcase name="Test ExpressionTool with Javascript engine" time="4.781120" class="inline_javascript, expression_tool" url="cwltest:conformance_tests#22">
  <system-out>{
    &quot;output&quot;: 42
  }
  </system-out>
</testcase>

I can't see any error dumps or test outcome fields. The console dump does not seem to help much: there are more than 3 error messages there, so I guess some of them are expected or not relevant to test outcomes.

mr-c commented 1 year ago

> I don't see a short_name field in that file, what am I missing?

Sorry, that field is called id in that file

> BTW, the direct link is not working, I had to open the repository page and click on the file (GitHub problem?).

That is odd, and I can also confirm!

> After running the above command I got 16 RO bundles, is that expected? Here's the list:

Huh, I got 365 RO bundles: ro_test.zip

> I can't see any error dumps or test outcome fields.

Yeah, JUnit XML format is weird. Only <testcase> elements with a <failure type="failure"> inside are failures; the other <testcase> elements were successful and can be ignored.
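As a rough sketch of that rule, the failing tests can be listed with Python's standard XML parser. The sample XML below is made up for the demo and is not real cwltest output; the helper name `failed_testcases` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET


def failed_testcases(junit_xml_text):
    """Return the names of <testcase> elements that contain a <failure> child."""
    root = ET.fromstring(junit_xml_text)
    return [
        case.get("name")
        for case in root.iter("testcase")
        if case.find("failure") is not None
    ]


# Hypothetical sample: one passing and one failing test case.
SAMPLE = """<testsuites><testsuite name="demo">
<testcase name="passing test" time="1.0"><system-out>ok</system-out></testcase>
<testcase name="failing test" time="2.0"><failure type="failure">boom</failure></testcase>
</testsuite></testsuites>"""

print(failed_testcases(SAMPLE))
```

For a real run, pass in the contents of cwlprov_cwl_v1.2_full_results.xml instead of `SAMPLE`.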

simleo commented 1 year ago

I tried again with id==--provenance instead of short_name==--provenance and I got 358 RO bundles. The tool tried to save them using full URIs as if they were local paths, so I got everything under a file:<ABSOLUTE_CWD_PATH> directory. Most ROs have a dir name like conformance_tests.yaml#writable_stagedfiles, others are under tests/<SUBDIR>/, e.g. tests/conditionals/test-index.yaml#conditionals_non_boolean_fail.
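One possible way to get short directory names back from such ids is to keep only the part after the `#`. This is a sketch of post-processing the ids, not something cwltest or cwltool does; the helper name `short_test_id` and the absolute path in the demo are hypothetical.

```python
from urllib.parse import unquote, urlsplit


def short_test_id(test_id):
    """Return the fragment after '#' if there is one, else the id unchanged."""
    fragment = urlsplit(test_id).fragment
    return unquote(fragment) if fragment else test_id


# The absolute path below is a placeholder for <ABSOLUTE_CWD_PATH>.
print(short_test_id("file:///home/user/cwl-v1.2/conformance_tests.yaml#writable_stagedfiles"))
print(short_test_id("tests/conditionals/test-index.yaml#conditionals_non_boolean_fail"))
```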

GlassOfWhiskey commented 1 year ago

At the moment the RO-Crate archive reports a CWL fake folder (e.g. a Directory dictionary generated with an ExpressionTool) as an ordinary Dataset, together with a File or Dataset entry for each element in its deep_listing hierarchy. CWLProv used a Proxy entity to represent this kind of thing, but I am not aware of anything like that in Run Crate.

As discussed on Slack, reproducing this behaviour 1-to-1 is probably out of scope for Run Crate provenance, but I am leaving a reminder here for potential future discussions.
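For reference, the flattening described above can be illustrated roughly as follows. This is a simplified sketch and not the cwlprov_to_crate implementation: the function name `directory_to_entities` and the path-based `@id` scheme are assumptions made for the example.

```python
def directory_to_entities(directory, parent_path=""):
    """Flatten a CWL Directory object into RO-Crate-style entities.

    Returns a list whose first item is the Dataset for `directory`,
    followed by a File or Dataset entity for every nested element.
    Assumption: ids are relative paths built from `basename` values.
    """
    path = f"{parent_path}{directory['basename']}/"
    dataset = {"@id": path, "@type": "Dataset", "hasPart": []}
    entities = [dataset]
    for item in directory.get("listing", []):
        if item["class"] == "Directory":
            children = directory_to_entities(item, path)
            dataset["hasPart"].append({"@id": children[0]["@id"]})
            entities.extend(children)
        else:
            file_id = f"{path}{item['basename']}"
            dataset["hasPart"].append({"@id": file_id})
            entities.append({"@id": file_id, "@type": "File"})
    return entities


# Hypothetical Directory object, e.g. as built by an ExpressionTool.
fake_dir = {
    "class": "Directory",
    "basename": "out",
    "listing": [
        {"class": "File", "basename": "a.txt"},
        {
            "class": "Directory",
            "basename": "sub",
            "listing": [{"class": "File", "basename": "b.txt"}],
        },
    ],
}

for entity in directory_to_entities(fake_dir):
    print(entity["@id"], entity["@type"])
```

Nothing in this output marks the folder as synthetic, which is the information the CWLProv Proxy entity carried.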