NCEAS / open-science-codefest

Web site and planning materials for open science conference.
http://nceas.github.io/open-science-codefest

Enable provenance capture in R #14

Open gothub opened 10 years ago

gothub commented 10 years ago

Organizational Page: ProvenanceR
Category: Design
Title: Enable provenance capture in R
Proposed by: Peter Slaughter
Participants:
Summary:

Provenance describes the origin and processing history of a data product, visualization or other artifact associated with a computational process. The PROV Model, a W3C Recommendation, defines a standard for representing processing provenance. This session will evaluate PROV and design an R package that uses the PROV standard to capture processing provenance.

W3C PROV: http://www.w3.org/TR/2013/NOTE-prov-overview-20130430/
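For readers new to PROV, here is a minimal sketch of the idea in Python, not a design for the proposed R package. It records which inputs an activity used and which output it generated, using the PROV relation names `used` and `wasGeneratedBy`. The `ProvRecorder` class, its methods, and the file names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProvRecorder:
    """Toy in-memory recorder (hypothetical, not part of any PROV library)."""
    entities: set = field(default_factory=set)
    activities: set = field(default_factory=set)
    relations: list = field(default_factory=list)

    def run(self, activity, func, inputs, output_name):
        """Run func on the named inputs and record what was used and generated."""
        self.activities.add(activity)
        for name in inputs:
            self.entities.add(name)
            # PROV relation: the activity used this input entity
            self.relations.append(("used", activity, name))
        result = func(*inputs.values())
        self.entities.add(output_name)
        # PROV relation: the output entity was generated by the activity
        self.relations.append(("wasGeneratedBy", output_name, activity))
        return result

    def history(self, entity):
        """Return the input entities that an entity was directly produced from."""
        acts = [a for (rel, e, a) in self.relations
                if rel == "wasGeneratedBy" and e == entity]
        return sorted(n for (rel, a, n) in self.relations
                      if rel == "used" and a in acts)


rec = ProvRecorder()
mean = rec.run(
    "compute_mean",                      # hypothetical activity name
    lambda xs: sum(xs) / len(xs),
    {"raw.csv": [1.0, 2.0, 3.0]},        # hypothetical input entity
    "mean.txt",                          # hypothetical output entity
)
print(rec.history("mean.txt"))
```

A real package would presumably capture these records automatically and serialize them in a PROV format rather than keep tuples in memory; that design question is what the session is meant to explore.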

gothub commented 10 years ago

Add appropriate labels

naupaka commented 10 years ago

I'd definitely second this one. It's a key thing that I think needs to get added to the default computational science workflow...

laurenwalker commented 10 years ago

Nice idea, Peter. I'm interested in attending this one. I am working on inserting provenance information into DataONE using the DataONE R Client, so it would be nice to see this provenance capture work seamlessly with that.

emhart commented 10 years ago

@cboettig and I have had some discussions about developing this package under the aegis of ropensci. I think this would be a great place to begin having some good discussions about the standard, how to implement it in R seamlessly with a variety of packages (thinking eml, but other data access packages too). I'm definitely interested in this.

mbjones commented 10 years ago

That's great, @emhart. Lauren, Ben and I have been prototyping a PROV-O based annotation model, and Lauren has started implementing an annotation API in the rdataone package. It would be great to work more on this at the Codefest. We're planning on implementing the same API in other analytical tools, particularly MATLAB, Python, and some workflow systems like Kepler and VisTrails. So it will be nice to have a model that works across these various languages and systems.

emhart commented 10 years ago

I know that @karthik is working on this package for R already actually, and the project is in a bit of a hiatus. He's agreed to discuss with his collaborators how we could work on it at the codefest. He'll get back to us in the next couple of weeks with how best to proceed.

dlebauer commented 10 years ago

Another project along these lines is Aaron Ellison's Analytic Web (@amellison17) (presentation)

naupaka commented 10 years ago

Been poking around in https://github.com/blernermhc/RDataTracker (h/t @dlebauer's comment). Seems like there might indeed be productive overlap in the two projects...

mbjones commented 10 years ago

Thanks, @naupaka, that looks really useful.

amellison17 commented 10 years ago

Thanks Matt and Naupaka; you all should bring Barbara Lerner (blerner@mtholyoke.edu) and Emery Boose (boose@fas.harvard.edu) into this thread... They're the prime developers of RDataTracker.

Best, Aaron

From: Matt Jones [mailto:notifications@github.com] Sent: Sunday, September 14, 2014 1:01 AM To: NCEAS/open-science-codefest Cc: Ellison, Aaron Subject: Re: [open-science-codefest] Enable provenance capture in R (#14)

Thanks, @naupaka, that looks really useful.


naupaka commented 10 years ago

cc @erboose and @blernermhc for their input on this