@GlassOfWhiskey I've compared the metadata with those from the StreamFlow crate using consume_crate.py. Here are the outputs:
StreamFlow crate:
cwltool + runcrate crate
Here are the issues that I've spotted:
StreamFlow crate:
The f62aa607a75508ac5fc6a22e9c0e39ef58a2c852 file is listed in the workflow exec action's object, but it should not be (note it has no matching formal parameter, hence the ??? in the consume_crate dump). It's just a part of the input collection.
The tool does not find any match for predictions.cwl#tissue and predictions.cwl#tumor. This is due to the fact that the corresponding file entities only point to the tool-level parameter in exampleOfWork. For instance, e75ee6d0751bb2dbc7a894af0fe307e4fe020a99 should have an exampleOfWork of [{"@id": "predictions.cwl#tumor"}, {"@id": "classify_tumor.cwl#tumor"}], rather than just {"@id": "classify_tumor.cwl#tumor"}.
The parameters reported for the extract-tissue-low step are a copy of those passed to extract-tissue-high. My guess is that the correct parameters were set at some point and then they were overwritten, probably because those steps run the same tool.
There are three Collection entities in the crate which represent the same dataset. There should be only one, and it should be linked to from the root dataset's mentions.
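To make the exampleOfWork point concrete, here is a minimal sketch of the change I have in mind, working on a plain dict shaped like a JSON-LD entity. The helper names (as_list, add_example_of_work) are made up for illustration and are not part of StreamFlow, runcrate or consume_crate.py; the entity below is a hand-written stand-in, not a copy of the actual crate metadata.

```python
import json


def as_list(value):
    """Normalize a JSON-LD property value (missing, single or list) to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def add_example_of_work(entity, param_id):
    """Add a reference to param_id to the entity's exampleOfWork.

    Existing references are kept, and nothing is added twice.
    """
    refs = as_list(entity.get("exampleOfWork"))
    if not any(ref.get("@id") == param_id for ref in refs):
        # Put the new reference first, matching the order used in the comment above
        refs.insert(0, {"@id": param_id})
    entity["exampleOfWork"] = refs
    return entity


# Hypothetical file entity that only points to the tool-level parameter
entity = {
    "@id": "e75ee6d0751bb2dbc7a894af0fe307e4fe020a99",
    "@type": "File",
    "exampleOfWork": {"@id": "classify_tumor.cwl#tumor"},
}
add_example_of_work(entity, "predictions.cwl#tumor")
print(json.dumps(entity["exampleOfWork"], indent=2))
```

The point is only that the file entity ends up referencing both the workflow-level and the tool-level parameter. How the workflow-level id is looked up on the StreamFlow side is not covered here.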
cwltool + runcrate crate:
packed.cwl#main/slide has the collection's main entity (the f62aa607a75508ac5fc6a22e9c0e39ef58a2c852 file) instead of the whole collection as the corresponding object item. This is due to a cwltool problem: it only lists cwlprov:secondaryFile entries for the tools, but not for the workflow.
Data entities don't have alternateName yet. For directories, a PR to cwltool is required (Folder entries don't have basename).
Note that I've recently moved the actual input data to docs/examples/data/Mirax2-Fluorescence-2 and now all crates / RO bundles have symlinks to those files (they're pretty big, so we don't want to duplicate them). docs/examples/data/link_mirax.py can help replace the actual files with the symlinks.
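For reference, the general idea of swapping duplicated files for symlinks can be sketched as below. This is not the content of link_mirax.py; it is an illustrative version that assumes copies can be recognized by their SHA-1 content hash, and the names replace_with_symlinks and sha1_of are hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, read in chunks to cope with big files."""
    digest = hashlib.sha1()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replace_with_symlinks(crate_dir, data_dir):
    """Replace regular files under crate_dir with symlinks into data_dir.

    A file is replaced only if a file with identical content exists under
    data_dir (assumption: content equality is what identifies a copy).
    Returns the list of replaced paths.
    """
    crate_dir = Path(crate_dir)
    data_dir = Path(data_dir).resolve()
    originals = {sha1_of(p): p for p in data_dir.rglob("*") if p.is_file()}
    replaced = []
    # Materialize the listing first, since the tree is modified while looping
    for path in sorted(crate_dir.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        target = originals.get(sha1_of(path))
        if target is None:
            continue
        path.unlink()
        path.symlink_to(target)  # absolute link to the shared copy
        replaced.append(path)
    return replaced
```

A call would look like replace_with_symlinks("path/to/crate", "docs/examples/data/Mirax2-Fluorescence-2"), where the crate path is a placeholder. Absolute links are used here for simplicity; relative links would make the result easier to move around.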