unipop-graph / unipop

Data Integration Graph
Apache License 2.0

Elasticsearch - Ids of nested edges and vertices #136

Open andreyxebialabs opened 5 years ago

andreyxebialabs commented 5 years ago

Unipop supports nested edges and nested vertices, but it requires some field to be explicitly designated as the id. The document's _id field is not accessible from the nested context, and it does not work anyway. Specifying non-unique ids leads to weird behavior: it seems that documents with the same id collide and only one survives.

How can we solve that?

I'd expect the ids of nested relations to be generated from the document _id plus some suffix, which could include either the nested document's index or some of the nested document's fields. So if my document has _id = 'asdf', a subdocument might get an id like 'asdf-relations[0]' or 'asdf-relations[myRelationType-myOtherNode]'.

Examples:

Unipop Elastic mapping:

{
  "class": "org.unipop.elastic.ElasticSourceProvider",
  "addresses": "http://localhost:9200",
  "vertices": [
    {
      "index": "nodes",
      "id": "@_id",
      "label": "node",
      "properties": { },
      "edges": [
        {
          "index": "nodes",
          "path": "relations",
          "id": "@relatedNodeId", // this is not unique
          "label": "relation",
          "direction": "OUT",
          "properties": { },
          "vertex": {
            "ref": true,
            "id": "@relatedNodeId",
            "label": "node"
          }
        }
      ]
    }
  ]
}

Document:

"_source": {
  "relations": [
    {
      "relationType": "myRelationType",
      "relatedNodeId": "myOtherNode"
    }
  ]
}
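To make the id scheme I have in mind concrete, here is a minimal sketch in Python applied to the document above. The function `nested_ids` and its parameters are hypothetical names used only for illustration; nothing here is existing unipop API, and the document _id 'asdf' is assumed.

```python
def nested_ids(doc_id, source, path, key_fields=None):
    """Build one synthetic id per nested object found under `path`.

    Hypothetical helper for illustration only (not part of unipop).
    Without key_fields the suffix is the array index of the nested
    object; with key_fields it is the joined values of those fields.
    """
    ids = []
    for index, nested in enumerate(source.get(path, [])):
        if key_fields:
            suffix = "-".join(str(nested[field]) for field in key_fields)
        else:
            suffix = str(index)
        ids.append(f"{doc_id}-{path}[{suffix}]")
    return ids


# The sample document from this issue, with an assumed _id of 'asdf'.
source = {
    "relations": [
        {"relationType": "myRelationType", "relatedNodeId": "myOtherNode"}
    ]
}

print(nested_ids("asdf", source, "relations"))
# ['asdf-relations[0]']
print(nested_ids("asdf", source, "relations", ["relationType", "relatedNodeId"]))
# ['asdf-relations[myRelationType-myOtherNode]']
```

The index-based suffix is always unique within a document but changes if the array is reordered. The field-based suffix is stable but only unique if the chosen fields are unique within the document.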

Thank you.