yesworkflow-org / yw-prototypes

Research prototype with tutorial. Start here to learn about and try YesWorkflow.
http://yesworkflow.org/wiki

PROV and ProvONE Compatibility? #29

Open olyerickson opened 8 years ago

olyerickson commented 8 years ago

The next logical step is the generation of PROV-compatible RDF. Where is YW w.r.t. the following (from one of the YW papers):

"...Similarly, to improve YW interoperability within the DataONE infrastructure, PROV (Moreau & Missier, 2013) and ProvONE (Cuevas-Vicenttín et al., 2015) compatible vocabulary extensions may be used in YesWorkflow in the future..."

Thanks!

tmcphillips commented 8 years ago

We can definitely work towards exporting, as PROV-compatible RDF, the same information that is now exportable to the Prolog/Datalog facts files. The first step might be to put together some example RDF documents by hand showing exactly what this would look like. Some example applications or tests that illustrate how the RDF will be used would be helpful as well. It'd be great to work with you on this!
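To make "what this would look like" concrete, here is a minimal sketch that prints ProvONE-style Turtle for a single annotated block. It is only a strawman for discussion: the block and port names (`clean_data`, `raw_file`, `clean_file`), the `ex:` namespace, and the `block_to_turtle` helper are all hypothetical, and mapping a YW `@begin` block to `provone:Program` and its `@in`/`@out` annotations to `provone:Port` is an assumption, not a mapping YW defines. The ProvONE namespace IRI should be checked against the ProvONE specification.

```python
# Hypothetical sketch: emit ProvONE-style Turtle for one YW-annotated block.
# The mapping (@begin -> provone:Program, @in/@out -> provone:Port) is an
# assumption made for illustration, not something YesWorkflow specifies.

PREFIXES = {
    # ProvONE namespace as commonly published; verify against the spec.
    "provone": "http://purl.dataone.org/provone/2015/01/15/ontology#",
    # Placeholder namespace for our own resources (an assumption).
    "ex": "http://example.org/yw/",
}


def block_to_turtle(name, inputs, outputs):
    """Return a Turtle document describing one block and its ports."""
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in PREFIXES.items()]
    lines.append("")
    lines.append(f"ex:{name} a provone:Program .")
    for prop, ports in (("hasInPort", inputs), ("hasOutPort", outputs)):
        for port in ports:
            port_id = f"ex:{name}_{port}_port"
            lines.append(f"{port_id} a provone:Port .")
            lines.append(f"ex:{name} provone:{prop} {port_id} .")
    return "\n".join(lines) + "\n"


# Hypothetical block with one input and one output.
turtle = block_to_turtle("clean_data", ["raw_file"], ["clean_file"])
print(turtle)
```

Writing a few of these by hand first, and only then deciding what a generator should produce, keeps the vocabulary choices open to revision.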

olyerickson commented 8 years ago

Beware of over-simplifying the task of "exporting PROV-compatible RDF"! As you guys know better than most people, exemplars of prospective provenance for workflows of the type we're describing are few and far between. ProvONE provides some abstract examples, but nothing very "real." We're in essence creating some real examples with this work...

tmcphillips commented 8 years ago

I see this as a challenging task for the reason you cite, along with others. I am generally reluctant to generate files that are supposed to be "machine readable" when I'm not sure exactly what "machine" we're talking about. Having an initial set of example workflows, proposed RDF representations of these workflows, and provenance queries that employ the RDF (either directly or indirectly via a triplestore) is essential before starting work on an implementation that automatically produces RDF documents (while leaving open the option of changing everything as we go, of course).
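As a rough illustration of what "a provenance query that employs the RDF" could mean, the sketch below runs one query over a handful of triples held in memory. Everything in it is hypothetical: the triples, the resource names, and the `match` and `downstream_programs` helpers. It also simplifies ProvONE by letting two programs share a port, where a fuller model would presumably link ports through `provone:Channel`. A real setup would load the RDF into a triplestore and use SPARQL instead.

```python
# Hypothetical in-memory triples standing in for a triplestore.
# Simplification: two programs share one port directly; no provone:Channel.
TRIPLES = [
    ("ex:clean_data", "rdf:type", "provone:Program"),
    ("ex:clean_data", "provone:hasInPort", "ex:raw_file_port"),
    ("ex:clean_data", "provone:hasOutPort", "ex:clean_file_port"),
    ("ex:plot_data", "rdf:type", "provone:Program"),
    ("ex:plot_data", "provone:hasInPort", "ex:clean_file_port"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def downstream_programs(triples, program):
    """Programs that read from a port the given program writes to."""
    out_ports = {o for _, _, o in match(triples, s=program, p="provone:hasOutPort")}
    readers = {s for s, _, o in match(triples, p="provone:hasInPort") if o in out_ports}
    return sorted(readers)


result = downstream_programs(TRIPLES, "ex:clean_data")
print(result)
```

Queries like this one, written before any exporter exists, would tell us which triples the RDF actually needs to contain.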

What I can do right away is create a new git repo and Maven POM for this extension to YW, and make it depend on yw-prototypes and any Java libraries we need. We can then start adding to the repo the workflows and the proposed RDF files corresponding to them, along with anything we need to exercise the RDF documents.