snowplow / enrich

Snowplow Enrichment jobs and library
https://snowplowanalytics.com

Common: explore graph extensions to Avro enriched event #147

Open chuwy opened 4 years ago

chuwy commented 4 years ago

We may want to leap straight to this approach. The basic idea would be to express relationships between entities in the event as typed edges.

Need to review the chapter on modeling graphs in Thrift in Nathan Marz's book Big Data.

Here is a simple example (using self-describing JSON instead of Avro for simplicity):

{
  "nodes": {
    "n0": {
      "schema": "iglu:com.acme/player/jsonschema/1-0-0",
      "data": {
        "username": "bob",
        "alliance": "pirates"
      }
    },
    "n1": {
      "schema": "iglu:com.acme/player/jsonschema/1-0-0",
      "data": {
        "username": "sue",
        "alliance": "zombies"
      }
    },
    "n2": {
      "schema": "iglu:com.acme/kills/jsonschema/1-0-0",
      "data": {}
    },
    "n3": {
      "schema": "iglu:com.acme/level/jsonschema/1-0-0",
      "data": {
        "name": "blood gulch",
        "difficulty": 23
      }
    }
  },
  "edges": {
    "e0": {
      "schema": "iglu:com.snowplowanalytics.snowplow/verb-subject/jsonschema/1-0-0",
      "source": "n0",
      "target": "n1"
    },
    "e1": {
      "schema": "iglu:com.snowplowanalytics.snowplow/verb-direct-object/jsonschema/1-0-0",
      "source": "n1",
      "target": "n2"
    },
    "e2": {
      "schema": "iglu:com.acme/at-geographic/jsonschema/1-0-0",
      "source": "n2",
      "target": "n3"
    }
  }
}
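
One property a consumer of this shape would probably want to check is that every edge's source and target point at a declared node. Below is a rough Python sketch of that check, plus a helper that resolves each edge into a (source schema, edge schema, target schema) triple. The function names `dangling_edges` and `edge_triples` are made up for illustration and are not part of Snowplow or Iglu; the sample graph is a trimmed copy of the example above.

```python
import json

# Hypothetical helpers for illustration only; not part of Snowplow or Iglu.

def dangling_edges(graph):
    """Return the ids of edges whose source or target is not a declared node."""
    nodes = graph["nodes"]
    return [
        edge_id
        for edge_id, edge in graph["edges"].items()
        if edge["source"] not in nodes or edge["target"] not in nodes
    ]


def edge_triples(graph):
    """Yield (source schema, edge schema, target schema) for each edge.

    Assumes the graph has no dangling edges.
    """
    nodes = graph["nodes"]
    for edge in graph["edges"].values():
        yield (
            nodes[edge["source"]]["schema"],
            edge["schema"],
            nodes[edge["target"]]["schema"],
        )


# Trimmed copy of the example graph: two player nodes and one typed edge.
event = json.loads("""
{
  "nodes": {
    "n0": {"schema": "iglu:com.acme/player/jsonschema/1-0-0",
           "data": {"username": "bob", "alliance": "pirates"}},
    "n1": {"schema": "iglu:com.acme/player/jsonschema/1-0-0",
           "data": {"username": "sue", "alliance": "zombies"}}
  },
  "edges": {
    "e0": {"schema": "iglu:com.snowplowanalytics.snowplow/verb-subject/jsonschema/1-0-0",
           "source": "n0",
           "target": "n1"}
  }
}
""")

if __name__ == "__main__":
    print(dangling_edges(event))
    for triple in edge_triples(event):
        print(triple)
```

Keeping nodes and edges in maps keyed by id, as in the example, makes this reference check a plain dictionary lookup. Whether an Avro encoding would keep string ids or use something else is left open here.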

chuwy commented 4 years ago

Migrated from https://github.com/snowplow/snowplow/issues/1639 (comments are auto-generated)