ResearchObject / workflow-run-crate

Workflow Run RO-Crate profile
https://www.researchobject.org/workflow-run-crate/
Apache License 2.0

Proposed draft changes to allow for a galaxy history export to be accepted by cwlprov_to_crate.py #30

Closed pauldg closed 6 months ago

pauldg commented 2 years ago

Note: We use a history export for now, since Galaxy does not yet support exporting workflow runs, but we expect the metadata structure to be largely the same.

Proposed (working) changes to cwlprov_to_crate.py to accept a Galaxy history export (and a workflow definition derived from the Galaxy history) as input. This includes extracting a CWLProv document from the Galaxy history export metadata.

Changes:

  1. To differentiate between a Galaxy workflow run and a CWL workflow run, I added a workflow-type argument. This meant moving the definition of the global WORKFLOW_BASENAME into __init__. The two run types are handled differently in the following functions: __init__, build and add_action.

  2. To include the Galaxy workflow definition, I created the add_ga_workflow function.

  3. TODO: read in the Galaxy workflow definition in a similar fashion to what get_workflow does for the CWL workflow definition. As this part is missing, some of the fields in ro-crate-metadata.json are missing too.
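
A minimal sketch of how change 1 could look, for illustration only. It is not the code from this pull request: the `--workflow-type` flag spelling, the `WORKFLOW_BASENAMES` table, the file names in it, and the `CrateBuilder` class are all hypothetical names chosen for the example. The idea is that the basename is no longer a module-level constant but is chosen in `__init__` from the workflow type.

```python
import argparse

# Hypothetical mapping from workflow type to the workflow file's basename.
# Both the keys and the file names are assumptions made for this sketch.
WORKFLOW_BASENAMES = {
    "cwl": "packed.cwl",
    "galaxy": "workflow.ga",
}


def make_parser():
    """Build a command line parser with a workflow-type option."""
    parser = argparse.ArgumentParser(
        description="Convert a provenance export to an RO-Crate (sketch)"
    )
    parser.add_argument("root", help="top-level directory of the export")
    parser.add_argument(
        "--workflow-type",
        choices=sorted(WORKFLOW_BASENAMES),
        default="cwl",
        help="kind of workflow run contained in the export",
    )
    return parser


class CrateBuilder:
    """Hypothetical builder that picks its workflow basename per instance."""

    def __init__(self, root, workflow_type="cwl"):
        if workflow_type not in WORKFLOW_BASENAMES:
            raise ValueError(f"unsupported workflow type: {workflow_type!r}")
        self.root = root
        self.workflow_type = workflow_type
        # Set here rather than as a module-level global, so that it can
        # depend on the workflow type.
        self.workflow_basename = WORKFLOW_BASENAMES[workflow_type]
```

Methods such as build and add_action could then branch on `self.workflow_type` where the Galaxy and CWL cases differ.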

Please let me know any suggestions.

pauldg commented 1 year ago

Thank you for these changes, looks good! As for the issues you identified, I think most of those can indeed be resolved by parsing the Galaxy workflow definition.
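
As a rough illustration of what parsing the Galaxy workflow definition might involve, here is a sketch that assumes the definition is a JSON document with a top-level `name` and a `steps` mapping keyed by step index, where each step may carry `name`, `type` and `tool_id`. The function name `summarize_ga_workflow` and the example document are hypothetical, and the sketch does not reproduce what get_workflow does for CWL.

```python
import json

# Hypothetical, heavily trimmed workflow definition used only for this sketch.
EXAMPLE_GA = """
{
  "a_galaxy_workflow": "true",
  "name": "Example workflow",
  "steps": {
    "1": {"id": 1, "name": "Sort", "type": "tool", "tool_id": "sort1"},
    "0": {"id": 0, "name": "Input dataset", "type": "data_input", "tool_id": null}
  }
}
"""


def summarize_ga_workflow(ga_text):
    """Return the workflow name and a list of step summaries in step order."""
    doc = json.loads(ga_text)
    steps = doc.get("steps", {})
    summary = []
    # Step keys are strings holding integers, so sort them numerically.
    for key in sorted(steps, key=int):
        step = steps[key]
        summary.append(
            {
                "id": step.get("id", int(key)),
                "name": step.get("name"),
                "type": step.get("type"),
                "tool_id": step.get("tool_id"),
            }
        )
    return {"name": doc.get("name"), "steps": summary}


if __name__ == "__main__":
    print(summarize_ga_workflow(EXAMPLE_GA))
```

A summary like this could supply the step and tool details that the crate metadata needs.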

The workflow used comes from a Galaxy tutorial; I will credit it in the README.

stain commented 1 year ago

As discussed last week, we'll split out runcrate as a separate GitHub repo and then we can re-raise this pull request there. Leaving this issue open until then.

pauldg commented 1 year ago

Now that https://github.com/galaxyproject/galaxy/pull/15101 was merged, I'd like to integrate developments made there back into this work before re-raising this pull request.

stain commented 6 months ago

Closing, as we think this is now all in Galaxy's code.