INCF / ProvenanceLibrary


Core library discussion #1

Open satra opened 12 years ago

satra commented 12 years ago

Prototype implementation of the W3C Provenance Data Model

The src directory contains a C prototype library for implementing the W3C provenance schema. Please note that the schema is a work in progress. We will continue to monitor and improve the implementation.

A demo program in C is available.

W3C Prov Working Group descriptions

Click on raw to get rendered html

Data model


http://dvcs.w3.org/hg/prov/file/tip/model/ProvenanceModel.html

Ontology

http://dvcs.w3.org/hg/prov/file/tip/ontology/ProvenanceFormalModel.html

Upstream schema


https://github.com/lucmoreau/ProvToolbox/blob/master/xml/src/main/resources/prov-20111110.xsd

SyamGadde commented 12 years ago

I have a suggestion. I understand what the add_child_element and add_attribute functions in provenance.h do to the output XML. They seem to be used (overloaded) for adding what the PROV data model calls an "attribute": an arbitrary name/string-value pair that can be added to most (all?) records. Attributes are not explicitly mapped in the latest PROV XML schema, but I assume they are the reason for the entries that look like this:

<xs:any namespace="##other" maxOccurs="unbounded" minOccurs="0"/>

Since the PROV data model only allows for string attribute values, would you consider the following function:

int addAttribute(RecordPtr p_record, const char* prefix, const char* localName, const char* value);

And then using that as a replacement for any calls to add_child_element that are actually for adding PROV attributes? (like all the add_child_element() calls in testprov.c and testneuroprov.c, and some in provenance.c and neuroprovenance.c). That way add_child_element() and add_attribute can be completely internal to the library and therefore no one can add data that doesn't map directly to the PROV data model. Plus the namespace prefix can be explicitly separated which would be convenient for users. If the prefix is NULL, it is assumed to be the default namespace, and if prefix is not NULL, it would check to make sure that that prefix has been previously declared using addNamespace().

I would then get rid of the add_attribute() function, which is not used anyway. What do you think?
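To make the proposed prefix rule concrete, here is a minimal sketch in plain C with no libxml2 dependency. Everything in it is hypothetical: SketchRecord, sketch_add_namespace and sketch_add_attribute are stand-ins invented for illustration, not the library's RecordPtr, addNamespace() or addAttribute(). It only shows the behaviour described above: a NULL prefix means the default namespace, and a non-NULL prefix is rejected unless it was declared first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NS    8    /* arbitrary limits, chosen only for this sketch */
#define MAX_ATTRS 16
#define MAX_NAME  64

/* Hypothetical stand-in for a record; NOT the library's RecordPtr. */
typedef struct {
    const char *prefixes[MAX_NS];      /* prefixes declared so far */
    int n_prefixes;
    char names[MAX_ATTRS][MAX_NAME];   /* qualified attribute names */
    const char *values[MAX_ATTRS];     /* string values (not copied) */
    int n_attrs;
} SketchRecord;

/* Stand-in for addNamespace(): remember that a prefix was declared. */
static int sketch_add_namespace(SketchRecord *rec, const char *prefix)
{
    if (rec == NULL || prefix == NULL || rec->n_prefixes >= MAX_NS)
        return -1;
    rec->prefixes[rec->n_prefixes++] = prefix;
    return 0;
}

/* Return 1 if the prefix has been declared on this record, else 0. */
static int prefix_is_declared(const SketchRecord *rec, const char *prefix)
{
    assert(rec != NULL && prefix != NULL);
    for (int i = 0; i < rec->n_prefixes; i++) {
        if (strcmp(rec->prefixes[i], prefix) == 0)
            return 1;
    }
    return 0;
}

/*
 * Stand-in for the proposed addAttribute().
 * prefix == NULL : the attribute goes in the default namespace.
 * prefix != NULL : the prefix must have been declared beforehand.
 * Returns 0 on success, -1 on bad arguments, an undeclared prefix,
 * a full record, or a name that does not fit in MAX_NAME.
 */
static int sketch_add_attribute(SketchRecord *rec, const char *prefix,
                                const char *localName, const char *value)
{
    if (rec == NULL || localName == NULL || value == NULL)
        return -1;
    if (rec->n_attrs >= MAX_ATTRS)
        return -1;
    if (prefix != NULL && !prefix_is_declared(rec, prefix))
        return -1;

    int written;
    if (prefix != NULL)
        written = snprintf(rec->names[rec->n_attrs], MAX_NAME,
                           "%s:%s", prefix, localName);
    else
        written = snprintf(rec->names[rec->n_attrs], MAX_NAME,
                           "%s", localName);
    if (written < 0 || written >= MAX_NAME)
        return -1;

    rec->values[rec->n_attrs] = value;
    rec->n_attrs++;
    return 0;
}
```

With this shape, a caller who forgets to declare a prefix gets an error back instead of silently producing XML with an unbound prefix. In the real library the check would presumably be done against the namespaces declared on the XML document rather than a fixed array.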

satra commented 12 years ago

thanks @SyamGadde. done. i also added newAccount to the base library as that was missing. i need to add a few more things related to annotation, reading in provenance docs. will be pushing in a bit.

satra commented 12 years ago

@SyamGadde pushed both neuroprov and provenance. if you can take a look at the examples and provide some feedback that will be great.

SyamGadde commented 12 years ago

Thanks Satra. I think it looks good, though all the main elements should probably have the PROV XML namespace, which might be fixed with something like:

diff --git a/src/provenance.c b/src/provenance.c
index 776fe44..f773e73 100644
--- a/src/provenance.c
+++ b/src/provenance.c
@@ -53,6 +53,12 @@ static XPathQueryPtr query_xpath(const xmlDocPtr doc, const xmlChar* xpathExpr)
         fprintf(stderr,"Error: unable to create new XPath context\n");
         return(NULL);
     }
+    xpathCtx->namespaces = xmlGetNsList(doc, xmlDocGetRootElement(doc));
+    xpathCtx->nsNr = 0;
+    if (xpathCtx->namespaces != NULL) {
+   while (xpathCtx->namespaces[xpathCtx->nsNr] != NULL)
+       xpathCtx->nsNr++;
+    }

     /* Evaluate xpath expression */
     xpathObj = xmlXPathEvalExpression(xpathExpr, xpathCtx);
@@ -129,7 +135,7 @@ static RecordPtr newRecord(ProvPtr p_prov)
 {
     assert(p_prov);
     PrivateProvPtr p_priv = (PrivateProvPtr)(p_prov->private);
-    XPathQueryPtr p_xpquery = query_xpath(p_priv->doc, "/container");
+    XPathQueryPtr p_xpquery = query_xpath(p_priv->doc, "/prov:container");

     int size = (p_xpquery->xpathObj->nodesetval) ? p_xpquery->xpathObj->nodesetval->nodeNr : 0;
     if (size != 1){
@@ -169,6 +175,10 @@ ProvPtr newProvenanceFactory(const char* id)
     xmlNewNs(root_node, "http://www.w3.org/2001/XMLSchema", "xsd");
     xmlNewNs(root_node, "http://openprovenance.org/prov-xml#", "prov");
     xmlNewNs(root_node, "http://incf.org/incf-schema", "incf");
+    {
+   xmlNsPtr provns = xmlNewNs(root_node, "http://openprovenance.org/prov-xml#", NULL);
+   root_node->ns = provns;
+    }

     //fprintf(stdout, "Creating provenance object [END]\n");
     p_prov->p_record = (void *)newRecord(p_prov);
@@ -197,7 +207,7 @@ ProvPtr newProvenanceFactoryFromFile(const char* filename)
     }
     xmlNodePtr root_node = xmlDocGetRootElement(p_priv->doc);
     p_prov->id = strdup(xmlGetProp(root_node, BAD_CAST "id"));
-    XPathQueryPtr p_xpquery = query_xpath(p_priv->doc, "/container/records");
+    XPathQueryPtr p_xpquery = query_xpath(p_priv->doc, "/prov:container/prov:records");
     int size = (p_xpquery->xpathObj->nodesetval) ? p_xpquery->xpathObj->nodesetval->nodeNr : 0;
     if (size != 1){
         fprintf(stderr, "No records element found\n");
@@ -238,7 +248,7 @@ ProvPtr newProvenanceFactoryFromMemoryBuffer(const char* buffer, int bufferSize)
     xmlChar* p_id = xmlGetProp(root_node, BAD_CAST "id");
     assert(p_id);
     p_prov->id = strdup(p_id);
-    XPathQueryPtr p_xpquery = query_xpath(p_priv->doc, "/container/records");
+    XPathQueryPtr p_xpquery = query_xpath(p_priv->doc, "/prov:container/prov:records");
     int size = (p_xpquery->xpathObj->nodesetval) ? p_xpquery->xpathObj->nodesetval->nodeNr : 0;
     if (size != 1){
         fprintf(stderr, "%d records element found in container id[%s] \n", size, p_prov->id);

Also, I still think addAttribute needs a way for us to specify the namespace prefix of your attributes so you can distinguish between PROV standard attributes and your new attributes. I haven't worked on that but if you are willing I can try something.

satra commented 12 years ago

that will be great. do you want to send a pull request with the above patch?

SyamGadde commented 12 years ago

Sure, let me research how to get git to do that. :-)

-syam

On 01/18/2012 06:32 PM, Satrajit Ghosh wrote:

that will be great. do you want to send a pull request with the above patch?

