ATTX-project / graph-component

Graph Manager component that handles state of the internal graph/data.

Implement ID clustering in Graph Manager #10

Closed by blankdots 7 years ago

blankdots commented 7 years ago

Description

The current implementation of ID clustering lives in a UnifiedViews DPU. Move that implementation to the Graph Manager. The DPU implementation uses hardcoded data graphs; the new implementation should work on all graphs. The list of graphs can be queried from the provenance graph.

Add this functionality to the GM API.
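
A minimal sketch of what "work on all graphs" could look like, assuming that clustering means copying identifier triples from every data graph into a single IDs graph. The function name `build_cluster_update`, the example graph URIs, and the choice of `dcterms:identifier` as the ID property are all hypothetical; the real list of graphs would come from the provenance graph rather than being passed in by hand.

```python
def build_cluster_update(source_graphs, ids_graph, id_properties):
    """Build a SPARQL 1.1 Update that copies ID triples into one graph.

    source_graphs: graph URIs to read from (assumed to be the result of
                   a provenance graph query, not hardcoded).
    ids_graph:     URI of the graph that collects the IDs (hypothetical).
    id_properties: predicate URIs treated as identifiers (hypothetical).
    """
    if not source_graphs or not id_properties:
        raise ValueError("need at least one source graph and one ID property")
    # One USING clause per source graph: the WHERE pattern is then
    # evaluated against the merge of all listed graphs.
    using = "\n".join("USING <%s>" % g for g in source_graphs)
    values = " ".join("<%s>" % p for p in id_properties)
    return (
        "INSERT { GRAPH <%s> { ?s ?p ?id } }\n"
        "%s\n"
        "WHERE { VALUES ?p { %s } ?s ?p ?id }" % (ids_graph, using, values)
    )


# Example with placeholder URIs.
print(build_cluster_update(
    ["http://example.org/work/1", "http://example.org/work/2"],
    "http://example.org/ids",
    ["http://purl.org/dc/terms/identifier"],
))
```

Building the update from a list keeps the graph names out of the code, which is the main difference from the DPU version described above.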

DoD

The content of the IDs graph is updated automatically ~on a schedule~ on API call (scheduling will be done from UnifiedViews), i.e. by extending the Graph Manager.

Testing

Unit test and/or BDD tests.

blankdots commented 7 years ago

The unit tests execute properly only if Fuseki and Elasticsearch are running.
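
One way to make that dependency explicit is to skip the tests when the services are unreachable instead of letting them fail. This is only a sketch: `service_available` and `ClusterIdsTestCase` are hypothetical names, and the host and ports assume a local setup with the default Fuseki (3030) and Elasticsearch (9200) ports.

```python
import socket
import unittest


def service_available(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


# Assumption: both services run locally on their default ports.
SERVICES_UP = (service_available("localhost", 3030)
               and service_available("localhost", 9200))


@unittest.skipUnless(SERVICES_UP, "Fuseki and Elasticsearch must be running")
class ClusterIdsTestCase(unittest.TestCase):
    """Placeholder test case; real tests would call the GM API."""

    def test_services_reachable(self):
        self.assertTrue(SERVICES_UP)
```

A TCP check only shows that something is listening on the port, not that the service is healthy, so it is a cheap guard rather than a full readiness probe.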

--------------- coverage: platform linux2, python 2.7.6-final-0 ----------------
Name                             Stmts   Miss  Cover
----------------------------------------------------
src/gm_api/__init__                  0      0   100%
src/gm_api/api/__init__              0      0   100%
src/gm_api/api/clusterids           14      0   100%
src/gm_api/api/entitieslinks        19      0   100%
src/gm_api/api/mapping              32      1    97%
src/gm_api/app                      19      1    95%
src/gm_api/gmapi                    29      6    79%
src/gm_api/lib/__init__              0      0   100%
src/gm_api/lib/construct_ids        65     12    82%
src/gm_api/lib/construct_links       0      0   100%
src/gm_api/lib/construct_map       131     33    75%
src/gm_api/resources/__init__        8      0   100%
src/gm_api/utils/__init__            0      0   100%
src/gm_api/utils/db                 13      3    77%
src/gm_api/utils/logs                7      0   100%
src/gm_api/utils/prefixes           14      0   100%
src/gm_api/utils/validate           21      2    90%
----------------------------------------------------
TOTAL                              372     58    84%

blankdots commented 7 years ago

The BDD tests are not perfect and require the applications to be running, but they are in place.

      Scenario: Add map and get its status          # features/cluster.feature:2
        Given graph API and Graph Store are running # ClusterIDs.java:23
        When I run a clusterids job                 # ClusterIDs.java:41
        Then I should get the status processed.     # ClusterIDs.java:58
    Feature: Handle mapping and indexing from the graph to Elasticsearch

org.uh.attx.gc.graphcomponent.test.stepdefinitions.TestRunner > Scenario: Retrieve mapping results.classMethod STANDARD_OUT

      Scenario: Add map and get its status                         # features/gmapi.feature:2
        Given graph API, Elasticsearch and Graph Store are running # GraphAPI.java:32
        When I post a mapping                                      # GraphAPI.java:57
        Then I should be able to retrieve status of mapping.       # GraphAPI.java:89

org.uh.attx.gc.graphcomponent.test.stepdefinitions.TestRunner > Scenario: Delete map results.classMethod STANDARD_OUT

      Scenario: Retrieve mapping results                            # features/gmapi.feature:7
        Given graph API, Elasticsearch and Graph Store are running  # GraphAPI.java:32
        When I post a new mapping                                   # GraphAPI.java:73
        And I retrieve that mapping                                 # GraphAPI.java:134
        Then I should be able to see the resource in Elasticsearch. # GraphAPI.java:119

org.uh.attx.gc.graphcomponent.test.stepdefinitions.TestRunner STANDARD_OUT

      #        TBD when we have a proper end to end workflow
      #        Then I should get indexed data in JSON-LD from the Elasticsearch.
      Scenario: Delete map results                             # features/gmapi.feature:15
        Given graph API is running                             # GraphAPI.java:105
        When I delete a mapping                                # GraphAPI.java:148
        But I should not be able to retrieve that mapping      # GraphAPI.java:170
        Then the mapping result still exists in Elasticsearch. # GraphAPI.java:183

    4 Scenarios (4 passed)
    14 Steps (14 passed)
    0m20.074s