RedisGraph / redisgraph-py

RedisGraph python client
https://redisgraph.io
BSD 3-Clause "New" or "Revised" License

Hanging Commit #13

Open stevencox opened 5 years ago

stevencox commented 5 years ago

@swilly22 - this code hangs on the commit():

        id2node = {}
        for n in in_graph.nodes (data=True):
            #print (f"{n}")
            i = n[0]
            attr = n[1]['attr_dict']
            id2node[i] = out_graph.add_node (label=attr['type'], props=attr)
        for e in in_graph.edges (data=True):
            #print (f"{e}")
            attr = e[2]['attr_dict']
            subj = id2node[e[0]]
            pred = attr['type']
            obj = id2node[e[1]]
            print (f"subj: {subj}")
            print (f"  pred: {pred}")
            print (f"    obj: {obj}")
            out_graph.add_edge (subj=id2node[e[0]],
                                pred=attr['type'],
                                obj=id2node[e[1]],
                                props=attr)
        print (f"---------> c2")
        out_graph.commit ()

where out_graph is an instance of this:

import redis
from redisgraph import Node, Edge, Graph

class KnowledgeGraph:
    ''' Encapsulates a knowledge graph. '''
    def __init__(self, graph, graph_name, host='localhost', port=6379):
        ''' Connect to Redis. '''
        self.redis = redis.Redis(host=host, port=port)
        self.graph = Graph(graph_name, self.redis)
    def add_node (self, label, props):
        ''' Add a node to the graph. '''
        n = Node(label=label, properties=props)
        self.graph.add_node (n)
        return n
    def add_edge (self, subj, pred, obj, props):
        ''' Add an edge. '''
        e = Edge(subj, pred, obj, properties=props)
        self.graph.add_edge (e)
        return e
    def commit (self):
        ''' Commit changes. '''
        self.graph.commit ()
    def query (self, query):
        ''' Query the graph. '''
        return self.graph.query (query)
    def delete (self):
        ''' Delete the entire graph. '''
        self.graph.delete ()

So, am I doing something wrong with regard to the usage of commit()?

The code above prints a bunch of nodes and edges, then:

subj: (ymmkesnfok:protein{description:"The liver X receptors, LXRA (NR1H3; MIM 602423) and LXRB, form a subfamily of the nuclear receptor superfamily and are key regulators of macrophage function, controlling transcriptional programs involved in lipid homeostasis and inflammation. The inducible LXRA is highly expressed in liver, adrenal gland, intestine, adipose tissue, macrophages, lung, and kidney, whereas LXRB is ubiquitously expressed. Ligand-activated LXRs form obligate heterodimers with retinoid X receptors (RXRs; see MIM 180245) and regulate expression of target genes containing LXR response elements (summary by Korf et al., 2009 [PubMed 19436111]).[supplied by OMIM, Jan 2010].",id:36,name:"UniProtKB:P55055",type:"protein",uri:"http://identifiers.org/uniprot/P55055"})
  pred: physically_interacts_with
    obj: (ftymebehpu:chemical_substance{description:"A fluorocarbon that is propane in which all of the hydrogens have been replaced by fluorines.",id:37,name:"CHEMBL.COMPOUND:CHEMBL1663",node_attributes:{'annotate': {'common_side_effects': None, 'approval': 'Yes', 'indication': 'Echocardiography', 'EPC': 'Contrast Agent for Ultrasound Imaging'}},type:"chemical_substance",uri:"https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1663"})
---------> c2

where it hangs indefinitely.

FWIW, I'm using the docker container version of redisgraph on OSX.

On a related note, the first snippet is a nearly general insert for NetworkX, related to issue #4. It would be pretty great if redisgraph-py supported NetworkX natively in both directions.

swilly22 commented 5 years ago

Hi @stevencox, could you estimate the number of nodes and edges you're trying to create within a single call to commit? I would start off with a small graph and see if it still hangs.

Also, it might be useful to use the Redis MONITOR command to see the exact query that gets sent by redisgraph-py.
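For reference, a minimal sketch of capturing that output from Python rather than from `redis-cli monitor`. It assumes a redis-py version that provides `Redis.monitor()`, and that each event yielded by `listen()` is a dict with `time`, `db`, and `command` keys. The helper names `format_monitor_event` and `watch_commands` are made up for illustration.

```python
def format_monitor_event(event):
    """Render one MONITOR event dict as a single printable line."""
    return f"{event['time']:.6f} db={event['db']} {event['command']}"


def watch_commands(host='localhost', port=6379):
    """Print every command the server receives until interrupted."""
    # Imported here so the formatting helper works without redis-py installed.
    import redis

    client = redis.Redis(host=host, port=port)
    # Assumption: Redis.monitor() returns a context manager whose listen()
    # yields one dict per command seen by the server.
    with client.monitor() as monitor:
        for event in monitor.listen():
            print(format_monitor_event(event))
```

Running `watch_commands()` in a second terminal while the failing script runs should show the last GRAPH.QUERY the server received before the hang.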

I'm not familiar with NetworkX; I would need to find some time to see what it's all about.

stevencox commented 5 years ago

Hi @swilly22, Thanks for getting back to me.

Graph size:

nodes 988
edges 888

With incremental commits at a batch size of 10, it seems to hang on the first commit:

    def to_knowledge_graph (self, in_graph, out_graph):
        ''' Convert a networkx graph to Ros KnowledgeGraph. '''
        id2node = {}
        print (f"---------> c0")
        for j, n in enumerate(in_graph.nodes (data=True)):
            i = n[0]
            attr = n[1]['attr_dict']
            id2node[i] = out_graph.add_node (label=attr['type'], props=attr)
            if j % 10 == 0:
                print ("commit")
                out_graph.commit ()
        out_graph.commit ()
        print (f"---------> c1")
        for i, e in enumerate(in_graph.edges (data=True)):
            attr = e[2]['attr_dict']
            subj = id2node[e[0]]
            pred = attr['type']
            obj = id2node[e[1]]
            print (f"subj: {subj}")
            print (f"  pred: {pred}")
            print (f"    obj: {obj}")
            out_graph.add_edge (subj=id2node[e[0]],
                                pred=attr['type'],
                                obj=id2node[e[1]],
                                props=attr)
            if i % 10 == 0:
                out_graph.commit ()
        out_graph.commit ()
        print (f"---------> c2")

Playing with the batch size, though, it seems to hang after around 6 objects, i.e., it will do five or so commits of single objects, or two commits with a batch size of 3, for example.

I restarted the docker container to be sure it wasn't hung in some way.

Any suggestions welcome.
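One detail about the loop above: `j % 10 == 0` is already true at `j = 0`, so the first commit happens after a single node rather than after ten. A small sketch of a chunking helper that yields full batches instead (the name `batches` is hypothetical, not part of redisgraph-py):

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Intended use with the code above (not executed here):
# for batch in batches(in_graph.nodes(data=True), 10):
#     for node_id, data in batch:
#         attr = data['attr_dict']
#         id2node[node_id] = out_graph.add_node(label=attr['type'], props=attr)
#     out_graph.commit()
```

This only changes where the batch boundaries fall. It makes no assumption about whether `commit()` clears previously added objects.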

swilly22 commented 5 years ago

Alright, so the graph is quite small. The fact that you're able to commit in batches suggests there might be an escaping issue, so if you could provide me with either the Redis MONITOR output or the actual raw data, I'll be able to help further.
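One way to test the escaping hypothesis in the meantime: in the printed `obj` above, `node_attributes` is rendered as a raw Python dict with single quotes, while the other string properties are double-quoted. Whether that is what makes the commit hang is not confirmed. The sketch below flattens nested dicts into scalar properties before they are passed to `add_node`, so the comparison can be made. `flatten_props` is a hypothetical helper, not part of redisgraph-py.

```python
def flatten_props(props, prefix=''):
    """Flatten nested dicts into one level, joining keys with underscores.

    None values are dropped; every other value is passed through unchanged.
    """
    flat = {}
    for key, value in props.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse, carrying the parent key forward as a prefix.
            flat.update(flatten_props(value, prefix=f"{name}_"))
        elif value is not None:
            flat[name] = value
    return flat


# Intended use (not executed here):
# out_graph.add_node(label=attr['type'], props=flatten_props(attr))
```

If the commit succeeds with flattened properties, that would point at the nested value as the culprit.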