Closed pauldg closed 8 months ago
Thank you for these changes, looks good! As for the issues you identified, I think most of those can indeed be resolved by parsing the Galaxy workflow definition.
The workflow used comes from a Galaxy tutorial; I will credit it in the README.
As discussed last week, we'll split out runcrate as a separate GitHub repo and then we can re-raise this pull request there. Leaving this issue open until then.
Now that https://github.com/galaxyproject/galaxy/pull/15101 was merged, I'd like to integrate developments made there back into this work before re-raising this pull request.
Closing, as we think this is now all in Galaxy's code.
Note: We use a history export for now, since Galaxy does not yet support the export of workflow runs, but it is safe to assume that the metadata structure will be largely the same.
Proposed (working) changes to cwlprov_to_crate.py to allow a Galaxy history export (and a workflow definition derived from the Galaxy history) as input. This includes the extraction of a CWLProv document from the Galaxy history export metadata.
Changes:
To differentiate between a Galaxy workflow run and a CWL workflow run, I added a workflow-type argument. As a result, I had to move the definition of the global WORKFLOW_BASENAME into __init__.
I differentiate between a Galaxy workflow run and a CWL workflow run in the following functions: __init__, build and add_action.
To include the Galaxy workflow definition, I created the add_ga_workflow function.
TODO: read in the Galaxy workflow definition in a similar fashion to what get_workflow does for the CWL workflow definition. As I'm missing this part, I'm also missing some of the fields in ro-crate-metadata.json.
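As a starting point for the TODO, here is a minimal sketch of reading a Galaxy workflow definition. It assumes the usual .ga layout: a JSON document with a top-level "name" and a "steps" object keyed by step index, where each step may carry "tool_id", "type" and "input_connections". The function names summarize_ga and read_ga are hypothetical and are not part of cwlprov_to_crate.py; how the result would map onto ro-crate-metadata.json fields is left open.

```python
import json


def summarize_ga(definition):
    """Return (workflow name, ordered step summaries) from a parsed .ga dict.

    Hypothetical helper: field names follow the common .ga layout and
    should be checked against the actual export.
    """
    steps = []
    raw_steps = definition.get("steps", {})
    # "steps" is keyed by the step index as a string, so sort numerically.
    for key in sorted(raw_steps, key=int):
        step = raw_steps[key]
        upstream = set()
        # Each connection is either a single {"id", "output_name"} object
        # or a list of them (for inputs that accept multiple datasets).
        for conn in (step.get("input_connections") or {}).values():
            for c in conn if isinstance(conn, list) else [conn]:
                upstream.add(c["id"])
        steps.append({
            "id": step.get("id", int(key)),
            "name": step.get("name"),
            "type": step.get("type"),
            "tool_id": step.get("tool_id"),
            "upstream_ids": sorted(upstream),
        })
    return definition.get("name"), steps


def read_ga(path):
    """Load a .ga file from disk and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_ga(json.load(handle))
```

The upstream_ids list gives the step-to-step links that a CWL definition would otherwise provide, which is probably what the missing metadata fields need.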
Please let me know if you have any suggestions.