Open tenzin3 opened 2 months ago
TerminusDB was our first choice because it is a graph database with built-in data versioning. However, its team has shifted focus to other projects and its community is small, so we decided to move away from it.
There are many graph databases available, but very few offer a free community edition. One that does is Neo4j, which also has the largest community among graph databases.
Another interesting option is Memgraph, which has the following features:
Cypher Language
Necessary queries for Memgraph Lab
Show all entities: MATCH (n) RETURN n;
Show all entities with relation: MATCH (n)-[r]->(m) RETURN n, r, m;
Delete all data: MATCH (n) DETACH DELETE n;
from neo4j import GraphDatabase

URI = "bolt://localhost:7687"
AUTH = ("", "")

def insert_triplets(triplets):
    with GraphDatabase.driver(URI, auth=AUTH) as driver:
        with driver.session() as session:
            for head, relation, tail in triplets:
                # MERGE avoids duplicate nodes and edges on repeated runs.
                # The relation name is interpolated into the query text,
                # so it must come from a trusted source.
                session.run(
                    "MERGE (h:Entity {name: $head}) "
                    "MERGE (t:Entity {name: $tail}) "
                    f"MERGE (h)-[:{relation}]->(t)",
                    head=head, tail=tail
                )

triplets = [
    ("DalaiLama", "WasBornIn", "Taktser"),
    ("Taktser", "isLocatedIn", "Dokham"),
    ("Dokham", "isPartOf", "Tibet"),
    ("Khampa", "LivesIn", "Dokham"),
    ("Dokham", "DescendsTo", "China"),
    ("DalaiLama", "WasBornIn", "WoodHogYear"),
    ("AmiChiri", "IsSouthOf", "Taktser"),
]

insert_triplets(triplets)
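As far as I know, classic Cypher does not accept a relationship type as a query parameter, which is presumably why the insert code above builds the type into the query string. A minimal sketch of a guard for that step follows, assuming relation names should be plain identifiers. The helper name `safe_relation_type` and the allowed pattern are my own assumptions, not part of the original code.

```python
import re

# Assumption: relation names are plain identifiers (letters, digits, underscore).
_REL_TYPE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def safe_relation_type(relation: str) -> str:
    """Return the relation unchanged if it is a plain identifier, else raise."""
    if not _REL_TYPE.fullmatch(relation):
        raise ValueError(f"unsafe relation type: {relation!r}")
    return relation
```

It could be called right before the query is built, for example `f"MERGE (h)-[:{safe_relation_type(relation)}]->(t)"`, so a malformed relation fails loudly instead of changing the query.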
from neo4j import GraphDatabase

URI = "bolt://localhost:7687"
AUTH = ("", "")

def fetch_data():
    with GraphDatabase.driver(URI, auth=AUTH) as driver:
        with driver.session() as session:
            result = session.run("MATCH (h)-[r]->(t) RETURN h.name, type(r), t.name")
            for record in result:
                print(record["h.name"], record["type(r)"], record["t.name"])

fetch_data()
Data is from here
import json

from neo4j import GraphDatabase

URI = "bolt://localhost:7687"
AUTH = ("", "")

def insert_triplets(data):
    with GraphDatabase.driver(URI, auth=AUTH) as driver:
        with driver.session() as session:
            # Insert nodes
            for node in data['nodes']:
                entity_type = node["type"]
                # Copy the attributes so the input data is not modified
                properties = dict(node.get('attributes', {}))
                properties['name'] = node['label']
                session.run(f"CREATE (n:{entity_type} $props)", {'props': properties})
            # Insert edges; nodes are looked up by their name property
            for edge in data['edges']:
                source = edge['source']
                target = edge['target']
                relation = edge['relation']
                session.run(
                    "MATCH (a {name: $source}), (b {name: $target}) "
                    f"CREATE (a)-[:{relation}]->(b)",
                    {'source': source, 'target': target}
                )

with open('kg_data.json', 'r') as file:
    data = json.load(file)

insert_triplets(data)
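In the edge insert above, the `MATCH` finds nodes by `name`, so an edge whose `source` or `target` does not equal any node `label` matches nothing and is skipped without an error. A small sketch of a pre-check follows, assuming `kg_data.json` has the shape the code reads (`nodes` with `label`, `edges` with `source`, `target`, `relation`). The helper name `find_dangling_edges` and the sample data are hypothetical.

```python
def find_dangling_edges(data):
    """Return edges whose source or target is not the label of any node."""
    known = {node["label"] for node in data.get("nodes", [])}
    return [
        edge for edge in data.get("edges", [])
        if edge["source"] not in known or edge["target"] not in known
    ]

# Hypothetical sample in the assumed kg_data.json shape
sample = {
    "nodes": [
        {"type": "Person", "label": "DalaiLama"},
        {"type": "Place", "label": "Taktser"},
    ],
    "edges": [
        {"source": "DalaiLama", "target": "Taktser", "relation": "WasBornIn"},
        {"source": "Taktser", "target": "Dokham", "relation": "isLocatedIn"},
    ],
}
dangling = find_dangling_edges(sample)
```

Running it before the insert makes it possible to log or fix edges that would otherwise be dropped silently.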
@teny19 suggestions:
Methods to clean the knowledge graph
Test the methods on 3-5 pages first. If the results are satisfactory, move on to one chapter and then to the whole book.
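One possible cleaning step, sketched here only as an illustration and not necessarily the method meant above, is to trim stray whitespace, drop triplets with an empty part, and remove exact duplicates before inserting. The function name `clean_triplets` is an assumption.

```python
def clean_triplets(triplets):
    """Strip whitespace, drop incomplete triplets, and remove duplicates (order kept)."""
    seen = set()
    cleaned = []
    for head, relation, tail in triplets:
        triple = (head.strip(), relation.strip(), tail.strip())
        if not all(triple):
            continue  # skip triplets with an empty head, relation, or tail
        if triple in seen:
            continue  # skip exact duplicates
        seen.add(triple)
        cleaned.append(triple)
    return cleaned
```

A step like this is cheap to try on the first 3-5 pages, since the input and output can be compared by hand.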
Description
This project involves populating a knowledge graph in a graph database. The aim is to store triples and structured data, which represent entities and their relationships, in the graph database.
Expected Output
Implementation Plan