Each provenance data point reflects how a particular data value was created or captured. It records all the input data points that contributed to its creation, along with the type of each contribution. The list of input data points can be null when a data value is captured directly from the sensor.
Along with the input data points, a provenance data point also carries information about the context in which the data value was created. This context information includes the following metrics:
Line of code
The line number of the code that generated the data value.
Physical location
The physical location of the node where the data value is created.
Class name
The class responsible for creating the data value.
Application name
The application responsible for creating the data value.
Send time
The time at which the data value is forwarded to the next nodes.
Receive time
The time at which the data value is received from the previous node or captured from the sensor.
Creation time
The time at which the data value is generated, when a node performs some aggregation rather than simply forwarding data values.
Node identifier
The identifier of the physical node.
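The context metrics listed above can be grouped into a single record. The following is a minimal sketch in Python, not the implementation described in this work. The class name `ProvenanceContext`, the field names, the field types (timestamps as floats, location as a free-form string), and the choice to make the three timestamps optional are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceContext:
    """Hypothetical container for the context metrics of one provenance data point."""
    line_of_code: int        # line number of the code that generated the data value
    physical_location: str   # location of the node (string format is an assumption)
    class_name: str          # class responsible for creating the data value
    application_name: str    # application responsible for creating the data value
    node_id: str             # identifier of the physical node
    receive_time: Optional[float] = None   # when the value was received or captured
    creation_time: Optional[float] = None  # set only when the node aggregates (assumption)
    send_time: Optional[float] = None      # when the value was forwarded


# Example: a node that simply forwards a sensor reading, so no creation time is set.
# All values below are made up for illustration.
ctx = ProvenanceContext(
    line_of_code=42,
    physical_location="room-101",
    class_name="TemperatureReader",
    application_name="ClimateMonitor",
    node_id="node-7",
    receive_time=1000.0,
    send_time=1000.5,
)
print(ctx.creation_time)  # None
```

Marking the record as frozen reflects the idea that a provenance data point describes a past event and should not change after it is recorded; this is a design choice of the sketch rather than something the text prescribes.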
In the diagram above, each circle represents a particular provenance data point. The oval inside the circle represents the context in which the data value was created, and the inner rectangle shows the set of data values, along with the type of transformation or aggregation, that led to the creation of the new data value.
Each provenance record is essentially a tuple associating the identity of the data value, the identities of its input data values, and the specific context information. A provenance function prov can therefore be defined as follows:
prov(dataValue) = <id(dataValue), {prov(inputDataValue) | inputDataValue ∈ input(dataValue)}, context(dataValue)>
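The recursive structure of this definition can be illustrated with a short Python sketch. It is only one possible reading of the formula, under several assumptions: the lookup tables `INPUTS` and `CONTEXT` and their contents are hypothetical, the identifier of a data value is taken to be its string key, the set of input provenance records is represented as a list (tuples holding dictionaries are not hashable in Python), and the input relation is assumed to be acyclic so that the recursion terminates.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical in-memory stores standing in for input(dataValue) and context(dataValue).
INPUTS: Dict[str, List[str]] = {
    "avg": ["t1", "t2"],  # "avg" was aggregated from two sensor readings
    "t1": [],             # captured directly from a sensor: no inputs
    "t2": [],
}
CONTEXT: Dict[str, Dict[str, Any]] = {
    "avg": {"node": "n3", "class": "Averager"},
    "t1": {"node": "n1", "class": "Sensor"},
    "t2": {"node": "n2", "class": "Sensor"},
}


def prov(data_value: str) -> Tuple[str, list, Dict[str, Any]]:
    """Return (id, provenance of each input, context) for a data value."""
    return (
        data_value,                              # id(dataValue)
        [prov(i) for i in INPUTS[data_value]],   # prov of every input data value
        CONTEXT[data_value],                     # context(dataValue)
    )


record = prov("avg")
print(record[0])                  # avg
print([p[0] for p in record[1]])  # ['t1', 't2']
print(prov("t1")[1])              # []
```

A value captured directly from a sensor has no inputs, so its provenance record carries an empty collection and the recursion stops there.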