ATTX-project / graph-component

Graph Manager component that handles state of the internal graph/data.

Provenance service Implementation derived from GM-API #28

Closed blankdots closed 7 years ago

blankdots commented 7 years ago

Description

Moving to architecture v2.0 requires changes to the GM-API and adding new services to the Semantic Broker.

DoD

Changes to GM-API and first version of the provenance service as illustrated in https://attx-project.github.io/ATTX-Architecture-Overview.html

Testing

Unit tests and Feature tests.

blankdots commented 7 years ago

@jkesanie Any opinions on the examples below? The first is an example of the information received after (or during) the execution of a UV pipeline, showing how the request is structured. The IDs map as follows: activityID = executionID, workflowID = pipelineID, stepID = dpuID.

{   
    "id": {
        "activityID":134,
        "workflowID": 135,
        "agentID": "UV"
    }, 
    "agent": {
        "prov:hadRole": "ETL",
        "name": "UnifiedViews",
        "dcterms:source": "http://unifiedviews.eu"
    },
    "activity": {
        "executedSuccessful": true,
        "prov:startedAtTime": "",
        "prov:endedAtTime": "",
        "prov:used": "",
        "prov:generated": ""
    },
    "workflow": {
        "private": false,
        "dcterms:title": "title of the workflow",
        "dcterms:description": "description of the workflow",
        "steps": [
            {
                "stepID": 12515,
                "isFirstStep": true,
                "dcterms:title": "",
                "dcterms:description": "",
                "pwo:hasNextStep": 12516
            },
            {
                "stepID": 12516,
                "isFirstStep": false,
                "dcterms:title": "",
                "dcterms:description": "",
                "pwo:hasNextStep": 12517
            },
            {
                "stepID": 12517,
                "isFirstStep": false,
                "dcterms:title": "",
                "dcterms:description": ""
            }
        ]
    }
}
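
The ID mapping described above (executionID to activityID, pipelineID to workflowID, dpuID to stepID) could be sketched roughly as follows. This is only an illustration: `build_prov_request` and its parameters are hypothetical names, not part of GM-API or UnifiedViews, and the agent and activity sections are left out for brevity.

```python
import json


def build_prov_request(execution_id, pipeline_id, dpu_ids, title="", description=""):
    """Hypothetical helper: map UnifiedViews IDs onto the request structure above."""
    steps = []
    for position, dpu_id in enumerate(dpu_ids):
        step = {
            "stepID": dpu_id,                  # stepID = dpuID
            "isFirstStep": position == 0,
            "dcterms:title": "",
            "dcterms:description": "",
        }
        # Every step except the last one points at its successor.
        if position + 1 < len(dpu_ids):
            step["pwo:hasNextStep"] = dpu_ids[position + 1]
        steps.append(step)
    return {
        "id": {
            "activityID": execution_id,        # activityID = executionID
            "workflowID": pipeline_id,         # workflowID = pipelineID
            "agentID": "UV",
        },
        "workflow": {
            "private": False,
            "dcterms:title": title,
            "dcterms:description": description,
            "steps": steps,
        },
    }


request = build_prov_request(134, 135, [12515, 12516, 12517])
print(json.dumps(request, indent=4))
```

Deriving `pwo:hasNextStep` from the order of the DPU list assumes the pipeline is linear; a branching pipeline would need the actual DPU graph.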

Example of a UVProv provenance response. The activityID and workflowID will be extracted from UV in order to match them with any existing activities or workflows. There will most likely be only one step, so the stepID will be static; if we plan to include configuration, it will instead be a hash based on the time of the response.

{   
    "id": {
        "activityID":134,
        "workflowID": 135,
        "agentID": "UVProv"
    }, 
    "agent": {
        "prov:hadRole": "API",
        "name": "UVProvenanceAPI",
        "dcterms:source": "http://unifiedviews.eu",
        "prov:actedOnBehalfOf": "ProvenanceService"
    },
    "activity": {
        "executedSuccessful": true,
        "prov:startedAtTime": "",
        "prov:endedAtTime": "",
        "prov:used": "",
        "prov:generated": ""
    },
    "workflow": {
        "private": false,
        "dcterms:title": "title of the workflow",
        "dcterms:description": "description of the workflow",
        "steps": [
            {
                "stepID": 12515,
                "isFirstStep": true,
                "dcterms:title": "",
                "dcterms:description": ""
            }
        ]
    }
}
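
One possible reading of the stepID rule for UVProv (static by default, a time-based hash once configuration is included) is sketched below. The constant `STATIC_STEP_ID`, the function `uvprov_step_id`, the choice of SHA-256 and the 16-character truncation are all assumptions made for illustration; the issue does not specify any of them.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical placeholder for the single, static UVProv step.
STATIC_STEP_ID = "uvprov-step"


def uvprov_step_id(response_time=None, include_config=False):
    """Return the static stepID, or a hash of the response time when configuration is included."""
    if not include_config:
        return STATIC_STEP_ID
    if response_time is None:
        response_time = datetime.now(timezone.utc)
    digest = hashlib.sha256(response_time.isoformat().encode("utf-8")).hexdigest()
    return digest[:16]  # shortened only for readability


fixed_time = datetime(2017, 1, 1, tzinfo=timezone.utc)
print(uvprov_step_id())
print(uvprov_step_id(fixed_time, include_config=True))
```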
jkesanie commented 7 years ago

Service registry can be used to get part of the Provenance data.

[Image: ServiceRegistry]

For example:

[Image: GMApi]

jkesanie commented 7 years ago

ActiveMQ- and Camel-based proposal for recording the communication between components, with minimal provenance-related data in the messages.


Version 2


TODO:

M0 - step started

{
    "provContext": {
        "workflowID": 1,
        "activityID": 1,
        "stepID": 1
    },
    "activity": {
        "name": "Transform RDF",
        "startTime": "X",
        "used": "attx:dataset1"
    }
}

-->

attx:workflow1_activity1_step1 a prov:Activity, attxonto:Step ; # from stepID
    dcterms:title "Transform RDF" ;
    prov:startedAtTime "X"^^xsd:dateTime ;
    prov:qualifiedAssociation [
        a prov:Association ;
        prov:agent attx:UV ;
        prov:hadRole attx:ETL ;
    ] ;
    prov:used attx:dataset1 .

M0-2 - step ended

{
    "provContext": {
        "workflowID": 1,
        "activityID": 1,
        "stepID": 1
    },
    "activity": {
        "name": "Transform RDF",
        "startTime": "Y",
        "generated": "file://file.with/json-result.json"
    }
}

-->

attx:workflow1_activity1_step1 prov:generated <file://file.with/json-result.json> .
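
As a rough sketch of how the two M0 messages could be turned into the triples shown, the snippet below builds the Turtle text with plain string formatting. The function names are hypothetical, the agent and role default to the `attx:UV` and `attx:ETL` values from the example, and no RDF library or literal escaping is involved.

```python
def step_uri(prov_context):
    """Build the step activity URI from the provContext IDs, following the naming in the example."""
    return "attx:workflow{workflowID}_activity{activityID}_step{stepID}".format(**prov_context)


def step_started_turtle(message, agent="attx:UV", role="attx:ETL"):
    """Turtle for a 'step started' message (hypothetical helper)."""
    subject = step_uri(message["provContext"])
    activity = message["activity"]
    name, started, used = activity["name"], activity["startTime"], activity["used"]
    return "\n".join([
        f"{subject} a prov:Activity, attxonto:Step ;",
        f'    dcterms:title "{name}" ;',
        f'    prov:startedAtTime "{started}"^^xsd:dateTime ;',
        f"    prov:qualifiedAssociation [ a prov:Association ; prov:agent {agent} ; prov:hadRole {role} ] ;",
        f"    prov:used {used} .",
    ])


def step_ended_turtle(message):
    """Turtle for a 'step ended' message: only the generated resource is added."""
    subject = step_uri(message["provContext"])
    generated = message["activity"]["generated"]
    return f"{subject} prov:generated <{generated}> ."


context = {"workflowID": 1, "activityID": 1, "stepID": 1}
started = {
    "provContext": context,
    "activity": {"name": "Transform RDF", "startTime": "X", "used": "attx:dataset1"},
}
ended = {
    "provContext": context,
    "activity": {"name": "Transform RDF", "startTime": "Y",
                 "generated": "file://file.with/json-result.json"},
}
print(step_started_turtle(started))
print(step_ended_turtle(ended))
```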

M1

{
    "provContext": {
        "workflowID": 1,
        "activityID": 1,
        "stepID": 1
    },
    "messageContext": {
        "sender": "",
        "receiver": "Framer",
        "messageID": "message1"
    },
    "payload": {
        "inputgraphs": "attx:dataset1",
        "frame": "json content"
    }
}

-->

# sender had communication - if no explicit activity --> Activity with Communication
attx:workflow1_activity1_step1 # from provContext
    prov:qualifiedCommunication [
        a prov:Communication ;
        prov:activity attx:workflow1_activity1_step1_Framer ; # from destination queue
    ] .

# receiver used something
attx:workflow1_activity1_step1_Framer # {workflowID}_{activityID}_{stepID}_{receiver}_{messageID}
    a prov:Activity ;
    prov:used attx:workflow1_activity1_step1_Framer_used_frame ,
        attx:workflow1_activity1_step1_Framer_used_inputgraphs .

# from the payload
attx:workflow1_activity1_step1_Framer_used_frame a prov:Entity ;
    dcterms:source "json content" .

attx:workflow1_activity1_step1_Framer_used_inputgraphs a prov:Entity ;
    dcterms:source attx:dataset1 .
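
The M1 mapping could be expressed as a small function that emits (subject, predicate, object) tuples: one qualified communication from the sender activity to the receiver activity, plus one used entity per payload key. This is a sketch under assumptions: `communication_triples` is a hypothetical name, and the blank node label derived from the messageID is an illustrative choice, not something the proposal defines.

```python
def communication_triples(message):
    """Derive provenance triples from a request message such as M1 (hypothetical helper)."""
    ctx = message["provContext"]
    msg_ctx = message["messageContext"]
    sender = "attx:workflow{workflowID}_activity{activityID}_step{stepID}".format(**ctx)
    receiver = f"{sender}_{msg_ctx['receiver']}"
    comm = f"_:comm_{msg_ctx['messageID']}"  # assumed blank node label

    triples = [
        (sender, "prov:qualifiedCommunication", comm),
        (comm, "a", "prov:Communication"),
        (comm, "prov:activity", receiver),
        (receiver, "a", "prov:Activity"),
    ]
    # Each payload key becomes an entity used by the receiver.
    for key, value in message["payload"].items():
        entity = f"{receiver}_used_{key}"
        triples.append((receiver, "prov:used", entity))
        triples.append((entity, "a", "prov:Entity"))
        triples.append((entity, "dcterms:source", value))
    return triples


m1 = {
    "provContext": {"workflowID": 1, "activityID": 1, "stepID": 1},
    "messageContext": {"sender": "", "receiver": "Framer", "messageID": "message1"},
    "payload": {"inputgraphs": "attx:dataset1", "frame": "json content"},
}
for triple in communication_triples(m1):
    print(triple)
```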

M2

{
    "provContext": {
        "workflowID": 1,
        "activityID": 1,
        "stepID": 1
    },
    "agent": "Framer",
    "receiver": "GMAPI",
    "messageID": "message2",
    "payload": {
        "inputgraphs": "attx:dataset1"
    }
}

QUEUE: GMAPI

attx:workflow1_activity1_step1_Framer
    prov:qualifiedCommunication [
        a prov:Communication ;
        prov:activity attx:workflow1_activity1_step1_GMAPI ;
    ] .

# receiver used something
attx:workflow1_activity1_step1_GMAPI a prov:Activity ;
    prov:used attx:workflow1_activity1_step1_GMAPI_used_inputgraphs .

attx:workflow1_activity1_step1_GMAPI_used_inputgraphs a prov:Entity ;
    dcterms:source attx:dataset1 .

M3

{
    "provContext": {
        "workflowID": 1,
        "activityID": 1,
        "stepID": 1
    },
    "agent": "GMAPI",
    "receiver": "tempQueue",
    "payload": {
        "graph_content_URI": "file://file.with/graph-content.ttl"
    }
}

QUEUE: tempQueue - reply to message2

# no communication because it is a reply?

# sender generated something because it is a reply/response
attx:workflow1_activity1_step1_GMAPI a prov:Activity ;
    prov:generated attx:workflow1_activity1_step1_GMAPI_gen_graph_content_URI .

attx:workflow1_activity1_step1_GMAPI_gen_graph_content_URI a prov:Entity ;
    dcterms:source <file://file.with/graph-content.ttl> .

M4

{
    "provContext": {
        "workflowID": 1,
        "activityID": 1,
        "stepID": 1
    },
    "sender": "Framer",
    "receiver": "tempQueue",
    "payload": {
        "graph_content_URI": "file://file.with/graph-content.ttl"
    }
}

QUEUE: tempQueue - reply to message1

attx:workflow1_activity1_step1_Framer a prov:Activity ;
    prov:generated attx:workflow1_activity1_step1_Framer_gen_graph_content_URI .

attx:workflow1_activity1_step1_Framer_gen_graph_content_URI a prov:Entity ;
    dcterms:source <file://file.with/graph-content.ttl> .
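
For the reply messages (M3 and M4), the rule appears to be that no communication is recorded and the replying component's activity generated one entity per payload key. A minimal sketch of that rule follows; `reply_triples` is a hypothetical name, and accepting either `sender` or `agent` as the replying component is an assumption made because the two examples use different keys.

```python
def reply_triples(message):
    """Derive provenance triples from a reply message such as M3 or M4 (hypothetical helper)."""
    ctx = message["provContext"]
    # M4 names the replying component "sender", M3 names it "agent"; accept either.
    component = message.get("sender") or message.get("agent")
    step = "attx:workflow{workflowID}_activity{activityID}_step{stepID}".format(**ctx)
    activity = f"{step}_{component}"

    triples = [(activity, "a", "prov:Activity")]
    # Each payload key becomes an entity generated by the replying activity.
    for key, value in message["payload"].items():
        entity = f"{activity}_gen_{key}"
        triples.append((activity, "prov:generated", entity))
        triples.append((entity, "a", "prov:Entity"))
        triples.append((entity, "dcterms:source", value))
    return triples


m3 = {
    "provContext": {"workflowID": 1, "activityID": 1, "stepID": 1},
    "agent": "GMAPI",
    "receiver": "tempQueue",
    "payload": {"graph_content_URI": "file://file.with/graph-content.ttl"},
}
for triple in reply_triples(m3):
    print(triple)
```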