BioDT / biodt-fair

FAIR guidelines, tutorials, and other materials for BioDT.
MIT License

Use Bioschemas' ComputationalWorkflow profile to model pDTs #6

Open jgrieb opened 6 months ago

jgrieb commented 6 months ago

At the second BioDT hackathon in Oslo we worked on the RO-Crate representation of the CWR use case data. We built an example for the output data here: https://github.com/jgrieb/CWR-Hackathon/blob/ro-crate-manual-example/ModGP/example-output/Lathyrus_aphaca/ro-crate-metadata.json and would like to link the data back to the tool via the RO-Crate's provenance. For the tool we created a separate RO-Crate example here: https://github.com/jgrieb/CWR-Hackathon/blob/ro-crate-manual-example/ModGP/tool-ro-crate-metadata.json . As you can see, we tried to comply with Bioschemas' ComputationalWorkflow profile, which seems to be designed for exactly this type of model code. We would therefore suggest adding it to the existing RO-Crate example for the other pDTs as well.
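To illustrate the linking idea, here is a minimal sketch in Python of output metadata that points back to the tool through a `CreateAction` whose `instrument` is the workflow entity. It is not copied from the linked crates: the tool identifier, file name, entity names, and the `#model-run` id are all hypothetical placeholders, and properties a complete crate would carry (license, dates, description, the profile's `conformsTo`) are left out for brevity.

```python
import json

# Hypothetical identifiers for illustration only; not taken from the linked crates.
TOOL_ID = "https://example.org/modgp-workflow"
OUTPUT_FILE = "example-output.tif"


def build_output_crate(tool_id: str, output_file: str) -> dict:
    """Return RO-Crate 1.1 style metadata linking an output file to its tool."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {   # Metadata descriptor that identifies the root dataset
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {   # Root dataset holding the model output
                "@id": "./",
                "@type": "Dataset",
                "name": "Example model output",
                "hasPart": [{"@id": output_file}],
                "mentions": [{"@id": "#model-run"}],
            },
            {"@id": output_file, "@type": "File"},
            {   # Provenance: the run that produced the file, and the tool it used
                "@id": "#model-run",
                "@type": "CreateAction",
                "instrument": {"@id": tool_id},
                "result": [{"@id": output_file}],
            },
            {   # The tool, typed as a ComputationalWorkflow
                "@id": tool_id,
                "@type": ["SoftwareSourceCode", "ComputationalWorkflow"],
                "name": "Example model workflow",
            },
        ],
    }


crate = build_output_crate(TOOL_ID, OUTPUT_FILE)
print(json.dumps(crate, indent=2))
```

Using the tool's own identifier (for example, its eventual WorkflowHub URL) as the `@id` would let the output crate reference the separately published tool crate without duplicating its metadata.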

See also some further notes on this topic here: https://github.com/uio-mana/CWR-Hackathon/issues/1#issuecomment-1909547333 - I would be happy to hear your opinion on the suggested repositories (WorkflowHub for the pDT model code and ROHub for the output) @juliancervos

juliancervos commented 5 months ago

I had a look at the links you shared, and they look very nice. I agree with your choice of the ComputationalWorkflow profile; it would also integrate well with the more comprehensive Workflow Run RO-Crate (in case we need it). I will include the example and develop the workflow part further in this repo.

Both WorkflowHub and ROHub sound good, but I can't say what the standard repositories in BioDT are going to be for things like model outputs. LUMI-O was suggested during an open WP3 meeting on 11 December. In any case, I would go with the options you suggest, since they integrate naturally with RO-Crate, and we can develop from there.

juliancervos commented 5 months ago

A small update on this: I've included the ModGP example in this repo, with some modifications (c4cc391).

I've also added a page on Workflows (#7) with some short descriptions and references to Bioschemas. It is pretty basic for now, but it would be interesting to develop this further, so @jgrieb let me know about any changes coming from the CWR pDT.