nipype / pydra

Pydra Dataflow Engine
https://nipype.github.io/pydra/

WIP - BEP 028 prov #750

Open yibeichan opened 5 months ago

yibeichan commented 5 months ago

This PR was made during the 2024 BIDS meeting in Seattle after discussions with @effigies. Some other discussions can be found at https://github.com/bids-standard/BEP028_BIDSprov/issues/129. For the BEP028_prov doc, see here.

Types of changes

context.json file

audit.py

Summary

The above changes are based on what we have in the prov doc. We probably need to dive deeper into audit.py and messenger.py (worth discussing in the next pydra meeting, @djarecka). @effigies suggested we collect the messages for a workflow to generate prov records. For example, we can collect all messages at the finalize_audit level into FileMessenger using collect_messages.
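To make the "collect messages for a workflow" idea concrete, here is a rough, standard-library-only sketch of merging per-message JSON-LD files into one record. It is not pydra's `collect_messages`. The helper name `gather_messages`, the `.jsonld` file extension, the `example.org` context URL, and the message fields are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def gather_messages(message_dir, context_url):
    """Merge every per-message JSON-LD file in message_dir into one document.

    Hypothetical helper for illustration; not part of pydra.
    """
    graph = []
    for path in sorted(Path(message_dir).glob("*.jsonld")):
        message = json.loads(path.read_text())
        # Drop the per-message context; a single shared one is attached below.
        message.pop("@context", None)
        graph.append(message)
    return {"@context": context_url, "@graph": graph}


# Usage with two made-up messages written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for i, label in enumerate(["task started", "task finished"]):
        msg = {
            "@context": "https://example.org/context.jsonld",  # placeholder URL
            "@id": f"urn:example:msg{i}",
            "label": label,
        }
        (Path(tmp) / f"{i:03d}.jsonld").write_text(json.dumps(msg))
    record = gather_messages(tmp, "https://example.org/context.jsonld")
```

The point of the sketch is only that one workflow-level record with a single shared context can be assembled from the individual messages once auditing is finalized.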


satra commented 5 months ago

@yibeichan - i don't think we should drop the prov context - we are still based on the prov model. that's more general and the bids context should still conform to prov.

yibeichan commented 5 months ago

@satra do you mean the original one, i.e. the openprov context, so that we don't have to directly use/cite the BEP028 prov context in pydra?

satra commented 5 months ago

essentially bids prov was a simplification to keep the keys readable to people and perhaps should stay. however, the openprov context should be included or referenced.

here is a linkml based context generated for prov: https://github.com/linkml/linkml-prov/blob/main/prov/jsonld/prov.model.context.jsonld

technically speaking we should be able to generate whatever we want as prov and then transform it to bep28, if we wanted, using the bids context. pydra doesn't have to generate bep28 directly, but its output could be converted to bep28. the following should work.

pydra jsonld + context -> expand -> compact/frame using bep28 context.
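A toy sketch of that expand/compact round trip, in plain Python. It only handles flat contexts that map a term directly to an IRI, so it is not a JSON-LD processor; a real pipeline would use a full implementation such as PyLD's `jsonld.expand` and `jsonld.compact`. Both contexts and all record values below are made up for illustration and are not the actual pydra or BEP028 contexts.

```python
PROV = "http://www.w3.org/ns/prov#"

# Hypothetical context that pydra might attach to its messages.
pydra_context = {
    "generated_by": PROV + "wasGeneratedBy",
    "started": PROV + "startedAtTime",
}

# Hypothetical BEP028-style context: same PROV IRIs, different readable keys.
bep28_context = {
    "GeneratedBy": PROV + "wasGeneratedBy",
    "StartedAtTime": PROV + "startedAtTime",
}


def expand(doc, context):
    """Replace each short key with the full IRI it maps to in the context."""
    return {context.get(k, k): v for k, v in doc.items() if k != "@context"}


def compact(expanded, context):
    """Replace each full IRI with the short key the target context defines."""
    iri_to_term = {iri: term for term, iri in context.items()}
    out = {iri_to_term.get(k, k): v for k, v in expanded.items()}
    out["@context"] = context
    return out


record = {
    "@context": pydra_context,
    "@id": "urn:example:output",
    "generated_by": "urn:example:task",
    "started": "2024-01-01T00:00:00",
}

expanded = expand(record, record["@context"])
bep28_record = compact(expanded, bep28_context)
```

Because both contexts resolve to the same PROV IRIs, the expanded form acts as the neutral intermediate, and only the key names change between the pydra and bep28 views.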

djarecka commented 5 months ago

@satra - what is the relation between the context from the openprov repo and the one from linkml? Do you suggest switching to the one from linkml? It seems to me that openprov doesn't have many useful objects, like wasGeneratedBy, etc., that are part of bids-prov.

When you say transformation to bids-prov, do you mean we would have to ignore some of the properties? I understand that they both point to the prov model, just the coverage is different.

satra commented 5 months ago

the linkml one looks more comprehensive and since chris did it, i have some trust in it as well. so i would lean towards using it and that also allows us to use the linkml model for other things.