carnival-data / carnival

JVM property graph data unification framework
https://carnival-data.github.io/carnival/
GNU General Public License v3.0

Clarity rename and re-factor #59

Open augustearth opened 3 years ago

augustearth commented 3 years ago

Carnival 2 was a major step in the right direction. The API has come a long way, and functionality has mostly been grouped intelligently. However, there is still some confusing nomenclature, along with some awkward constructions. I propose the following mostly cosmetic refactor to increase overall clarity.

CoreGraph -> Carnival

CoreGraph contains a schema, a validator, and a TinkerPop graph. Why not take the opportunity to brand a "CoreGraph" as a "Carnival"? It is as good a self-contained unit representing Carnival as anything we have in the API. Rather than pass around TinkerPop graphs for Carnival functionality, we can pass around the Carnival itself. Code would look something like the following:

def c = Carnival.create(NEO4J)
MyClass.method('MyGraphMethod')
    .arguments(a1:v1, a2:v2)
    .ensure(c)

Carnival snapshots

The functionality that already exists in CoreGraph is enough to implement snapshotting, that is, making a copy of a graph at any point in time. I have used it in local code to build a mechanism that creates named snapshots; a top-level script then picks up where things left off based on which snapshots are present. It has proven very useful. I propose making this core functionality. Something like:

Normal Operation

def c = Carnival.create(NEO4J)
def s1 = c.snapshot()
MyClass.method('MyGraphMethod')
    .arguments(a1:v1, a2:v2)
    .ensure(c)
if (/*we do not like what we see*/) c.revert(s1)

Expensive Data Pipelines

def c = Carnival.create(NEO4J)
c.snapshot('stage1') {
    // load some graph data
}

c.snapshot('stage2') {
    // load some more graph data
}

c.snapshot('stage3') {
    // load the rest of the graph data
}

// do other stuff

Suppose the developer realizes something went wrong with stage3. The snapshot directory for stage3 can be deleted, the code updated, and the pipeline rerun. Since the snapshots for stages 1 and 2 still exist, their expensive closures will not be rerun.
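To make the skip-if-present behaviour concrete, here is a minimal sketch of the idea, not a proposed implementation. The `SnapshotStages` class and its `baseDir` field are hypothetical names invented for illustration, and the "snapshot" is only an empty marker directory; a real version would copy the graph data there.

```groovy
import java.nio.file.Files
import java.nio.file.Path

// Hypothetical helper, not part of the Carnival API: runs a named stage only
// when no snapshot directory for that stage exists yet.
class SnapshotStages {
    final Path baseDir

    SnapshotStages(Path baseDir) {
        this.baseDir = baseDir
    }

    // Returns true if the closure ran, false if an existing snapshot was found.
    boolean snapshot(String name, Closure stage) {
        Path dir = baseDir.resolve(name)
        if (Files.isDirectory(dir)) return false  // snapshot present: skip the expensive work
        stage.call()
        // Assumption: a real implementation would copy the graph data here.
        // This sketch only creates the directory as a marker. If the closure
        // throws, no directory is created, so the stage reruns next time.
        Files.createDirectories(dir)
        return true
    }
}

def stages = new SnapshotStages(Files.createTempDirectory('carnival-snapshots'))
def ran = []

// First pass: no snapshots exist, so both closures run.
stages.snapshot('stage1') { ran << 'stage1' }
stages.snapshot('stage2') { ran << 'stage2' }

// Second pass: both snapshots exist, so neither closure runs again.
stages.snapshot('stage1') { ran << 'stage1 again' }
stages.snapshot('stage2') { ran << 'stage2 again' }

// Deleting a snapshot directory forces that stage to rerun.
Files.delete(stages.baseDir.resolve('stage2'))
stages.snapshot('stage2') { ran << 'stage2 rerun' }
```

Because the marker is written only after the closure completes, a stage that fails partway leaves no snapshot behind and is retried on the next run.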

Vine -> Rodeo

The vine functionality is really quite separate from the Carnival graph elements. We should probably make that clearer in the API. Further, the "vine" nomenclature is fine, but we have not kept up with the theme. I propose factoring the vine functionality into a new component, 'carnival-rodeo', and moving away from the vine theme.

class MyApiRodeo implements Rodeo {
    class WrangleHorses extends JsonLasso<Horse> {
        Horse fetch(Map args = [:]) {
            ...
        }
    }
}

def myApiRodeo = new MyApiRodeo()
def horses = myApiRodeo.lasso('WrangleHorses')
    .cacheMode(IGNORE)
    .toss()