DerwenAI / kglab

Graph Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, NetworkX, RAPIDS, RDFlib, pySHACL, PyVis, morph-kgc, pslpython, pyarrow, etc.
https://derwen.ai/docs/kgl/
MIT License

kglab.KnowledgeGraph().visualize_query breaks when query contains LIMIT 1000 #280

Closed dmoore247 closed 1 year ago

dmoore247 commented 1 year ago

I'm submitting a

Current Behaviour:

kg.query_as_df(sparql) works as expected, but kg.visualize_query(sparql) throws an exception when 'LIMIT 1000' is in the query text. Removing the LIMIT 1000 clause avoids the error, and the visual is produced.

Note that in the second-to-last frame of the stack traceback, at result.extend(self._find_triples(algebra[key])), the key value is 'start'.
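A minimal sketch of how a key like 'start' could lead to this error, assuming the parsed query wraps the graph pattern in a slice-like node whose 'start' and 'length' fields are plain integers. Plain dicts stand in for the real algebra nodes here, and both walker functions are hypothetical illustrations, not the code in gpviz.py and not the fix that was eventually merged.

```python
from collections.abc import Mapping

# Plain dicts standing in for the parsed algebra tree (an assumption for
# illustration; the real nodes are dict-like objects built by the parser).
algebra = {
    "p": {                      # slice-like node produced by LIMIT/OFFSET
        "start": 0,             # offset, a plain int
        "length": 1000,         # limit, a plain int
        "p": {                  # projection node
            "p": {"triples": [("?s", "?p", "?o")]},  # basic graph pattern
        },
    },
}


def find_triples_naive(node):
    """Recurse into every value, assuming each one is dict-like."""
    result = []
    for key, value in dict(node).items():  # dict(0) raises TypeError
        if key == "triples":
            result.extend(value)
        else:
            result.extend(find_triples_naive(value))
    return result


def find_triples_guarded(node):
    """Same walk, but skip scalars such as the ints under 'start'."""
    result = []
    if not isinstance(node, Mapping):
        return result
    for key, value in node.items():
        if key == "triples":
            result.extend(value)
        else:
            result.extend(find_triples_guarded(value))
    return result


try:
    find_triples_naive(algebra)
    naive_error = None
except TypeError as err:
    naive_error = str(err)      # the walk tripped over an int value

triples = find_triples_guarded(algebra)
print(naive_error)
print(triples)                  # [('?s', '?p', '?o')]
```

The guarded version simply ignores non-mapping values, which is one way a recursive walker can tolerate integer fields without special-casing each key name.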

Exception

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<command-2674171610427751> in <cell line: 1>()
----> 1 pviz = kg.visualize_query(sparql)
      2 displayHTML(pviz.generate_html())

/local_disk0/.ephemeral_nfs/envs/pythonEnv-1e893254-e5e7-4ca5-be10-19833a70c739/lib/python3.9/site-packages/kglab/kglab.py in visualize_query(self, sparql, notebook)
   1131 PyVis network object, to be rendered
   1132         """
-> 1133         return GPViz(sparql, self._ns).visualize_query(notebook=notebook)
   1134 
   1135 

/local_disk0/.ephemeral_nfs/envs/pythonEnv-1e893254-e5e7-4ca5-be10-19833a70c739/lib/python3.9/site-packages/kglab/gpviz.py in __init__(self, sparql, namespaces)
     82         self.blank_nodes: typing.List[str] = []
     83         self.values: typing.Dict[str, list] = collections.defaultdict(list)
---> 84         self.triples: list = self._find_triples(pq.algebra)
     85 
     86 

/local_disk0/.ephemeral_nfs/envs/pythonEnv-1e893254-e5e7-4ca5-be10-19833a70c739/lib/python3.9/site-packages/kglab/gpviz.py in _find_triples(self, algebra)
    164         for algebra_node in akg:
    165             for key in dict(algebra_node).keys():
--> 166                 self._find_triples_node(algebra_node, key, result)
    167 
    168         return result

/local_disk0/.ephemeral_nfs/envs/pythonEnv-1e893254-e5e7-4ca5-be10-19833a70c739/lib/python3.9/site-packages/kglab/gpviz.py in _find_triples_node(self, algebra, key, result)
    139                 result.extend([ algebra.triples ])
    140             else:
--> 141                 result.extend(self._find_triples(algebra[key]))
    142 
    143 

/local_disk0/.ephemeral_nfs/envs/pythonEnv-1e893254-e5e7-4ca5-be10-19833a70c739/lib/python3.9/site-packages/kglab/gpviz.py in _find_triples(self, algebra)
    164         for algebra_node in akg:
    165             for key in dict(algebra_node).keys():
--> 166                 self._find_triples_node(algebra_node, key, result)
    167 
    168         return result

/local_disk0/.ephemeral_nfs/envs/pythonEnv-1e893254-e5e7-4ca5-be10-19833a70c739/lib/python3.9/site-packages/kglab/gpviz.py in _find_triples_node(self, algebra, key, result)
    139                 result.extend([ algebra.triples ])
    140             else:
--> 141                 result.extend(self._find_triples(algebra[key]))
    142 
    143 

/local_disk0/.ephemeral_nfs/envs/pythonEnv-1e893254-e5e7-4ca5-be10-19833a70c739/lib/python3.9/site-packages/kglab/gpviz.py in _find_triples(self, algebra)
    163 
    164         for algebra_node in akg:
--> 165             for key in dict(algebra_node).keys():
    166                 self._find_triples_node(algebra_node, key, result)
    167 

TypeError: 'int' object is not iterable


Expected Behaviour:

kg.visualize_query(sparql) generates a small visual of the query, with or without a LIMIT clause.

Steps to reproduce:

import kglab

# load any RDF file; "myfile.rdf" is a placeholder
kg = kglab.KnowledgeGraph().load_rdf("myfile.rdf")

sparql = """
SELECT  ?s ?p ?o
WHERE {
    ?s ?p ?o
}
LIMIT 1000
"""

kg.visualize_query(sparql)
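Since removing the LIMIT clause avoids the error, a possible stopgap on the affected version is to strip trailing LIMIT/OFFSET modifiers from the query text before visualizing it. The helper name strip_limit_offset is hypothetical, and the regex only handles modifiers at the very end of the query, so it is a rough sketch and not a SPARQL-aware rewrite.

```python
import re

# Matches one or more trailing "LIMIT n" / "OFFSET n" modifiers at the end
# of the query text. Assumption: the modifiers are the last tokens.
_TRAILING_SLICE = re.compile(
    r"(?:\s*\b(?:LIMIT|OFFSET)\s+\d+)+\s*$",
    re.IGNORECASE,
)


def strip_limit_offset(sparql: str) -> str:
    """Return the query text without trailing LIMIT/OFFSET modifiers."""
    return _TRAILING_SLICE.sub("", sparql)


sparql = """
SELECT  ?s ?p ?o
WHERE {
    ?s ?p ?o
}
LIMIT 1000
"""

viz_sparql = strip_limit_offset(sparql)
print(viz_sparql)

# With a graph loaded as in the steps above, the stripped text could then be
# passed along while the original query is still used for results:
#   kg.visualize_query(viz_sparql)
#   kg.query_as_df(sparql)
```

The visualization depicts the query pattern, not the result set, so dropping the modifiers should not change what is drawn.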

Environment:

* `kglab.version.__version__`: '0.6.1'
* python version: `'3.9.5 (default, Nov 23 2021, 15:27:38) \n[GCC 9.3.0]'`
* pip version: `pip 21.2.4`
* OS details: `Linux xxxxx 5.4.0-1086-aws #93~18.04.1-Ubuntu.` (Databricks DBR 11.3 LTS ML)

Mec-iS commented 1 year ago

Thanks for reporting this.

Mec-iS commented 1 year ago

This should be fixed by #282.

ceteri commented 1 year ago

Nice work, @Mec-iS. @dmoore247, does this work well for you now?