Closed steph-lion closed 2 years ago
Hi @steph-lion , interesting!
I don't think we've seen virtual nodes/relationships so far, so I'm not exactly surprised they didn't work out of the box. If that's the cause, it should be a surprisingly low lift, as our neo4j_bolt_graph->dataframe conversion portion is only ~100 loc, so happy to help figure it out, see below.
Also, just to help prioritize here, feel free to ping in Slack with any more sensitive info etc
label_name=input()
query="..."
g = graphistry.cypher(query)
print(g._nodes)
print(g._edges)
The cypher() call invokes the driver and then converts the bolt-format results to _nodes and _edges pandas/arrow data frames: https://github.com/graphistry/pygraphistry/blob/bf99c1827510e98ea15cbd745f3b6755feee3ac3/graphistry/PlotterBase.py#L1802
Something like
import graphistry
from neo4j import GraphDatabase, Driver
from local_copy_of_those_snippets import bolt_graph_to_edges_dataframe, bolt_graph_to_nodes_dataframe

driver = GraphDatabase.driver(...)
with driver.session() as session:
    # run the query and pull the bolt-format graph out of the result
    bolt_statement = session.run(query, **params)
    graph = bolt_statement.graph()
    # convert the bolt graph with the copied helper snippets
    edges = bolt_graph_to_edges_dataframe(graph)
    nodes = bolt_graph_to_nodes_dataframe(graph)

# inspect / fix up the converted frames as needed
nodes_df = ...
edges_df = ...
g = graphistry.nodes(nodes_df, 'my_id_col').edges(edges_df, 'my_src_col', 'my_dest_col')
g.plot()
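To make the conversion step more concrete, here is a minimal sketch of flattening a graph-like object into node and edge rows. It is not the pygraphistry implementation: `graph_to_rows` is a hypothetical helper, and the `element_id`, `labels`, `type`, `start_node`, `end_node`, and `properties` attributes are stand-ins modeled loosely on driver graph objects, so check them against the real objects you get back.

```python
from types import SimpleNamespace


def graph_to_rows(graph):
    """Flatten a graph-like object into (node_rows, edge_rows) lists of dicts.

    Hypothetical helper: attribute names are assumptions, not the driver's API.
    """
    node_rows = [
        # properties first, so the reserved "id"/"labels" keys always win
        {**dict(n.properties), "id": n.element_id, "labels": sorted(n.labels)}
        for n in graph.nodes
    ]
    edge_rows = [
        {
            **dict(r.properties),
            "src": r.start_node.element_id,
            "dst": r.end_node.element_id,
            "type": r.type,
        }
        for r in graph.relationships
    ]
    return node_rows, edge_rows


# Stand-in objects for illustration only; not real driver types.
n1 = SimpleNamespace(element_id="-1", labels={"Greeting"}, properties={"greeting": "Hello!"})
n2 = SimpleNamespace(element_id="-2", labels={"Greeting"}, properties={"greeting": "Hi!"})
rel = SimpleNamespace(start_node=n1, end_node=n2, type="SIMILAR_TO", properties={})
graph = SimpleNamespace(nodes=[n1, n2], relationships=[rel])

node_rows, edge_rows = graph_to_rows(graph)
```

The row lists can then be turned into data frames (for example with `pd.DataFrame(node_rows)`) and bound with `'id'`, `'src'`, and `'dst'` as the id, source, and destination columns.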
But of course it'd be better to natively support, so I'm curious on 1-3, or if you can get a flow working for 4 :)
(For tracking: if/when we confirm it's virtual types, will file a parallel ticket for tracking an enhancement to support virtual types)
Hi, thanks for the answer. This is also my first GitHub issue, and it seems I got a good one!
I'm trying what you said.
graphistry.cypher("call apoc.create.vNode(['Greeting'],{greeting:'Hello!'}) yield node with node as n1 call apoc.create.vNode(['Greeting'],{greeting:'Hi!'}) yield node with n1, node as n2 return apoc.create.vRelationship(n1,'SIMILAR_TO',{},n2) as rel,n1,n2").plot()
https://hub.graphistry.com/graph/graph.html?dataset=3f0046843aae4681a3d8b57efbeed634
:)
Ok this is helping a lot, thank you!
Also, I just realized I misread the issue. I think the initial main issue is more of a visual settings thing, where you wanted to be zoomed in more so you could see the edges. You can bring the elements closer together and it'll autozoom:
g = (
    graphistry
    .cypher(query)
    .settings(url_params={
        "strongGravity": "true"
    })
)
g.plot()
More options at: https://hub.graphistry.com/docs/api/1/rest/url/
Of interest are also gravity and pointSize. These correspond to values in the settings UI panel, which you can play with and then bake in. For bigger graphs, you'll probably want different settings, so an option is picking based on graph size (len(g._edges) + len(g._nodes)).
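As a rough sketch of picking settings by graph size: `pick_url_params` is a hypothetical helper, and the threshold of 100 elements is an arbitrary assumption to tune per graph, not a Graphistry recommendation. Only the `strongGravity` and `pointSize` keys come from this thread.

```python
def pick_url_params(n_nodes, n_edges, small_graph_limit=100):
    """Choose URL params from graph size (node count + edge count).

    Hypothetical helper; small_graph_limit is an arbitrary cutoff to tune.
    """
    size = n_nodes + n_edges
    # pull elements closer together so the view auto-zooms onto them
    params = {"strongGravity": "true"}
    if size >= small_graph_limit:
        # shrink points on bigger graphs so edges stay visible
        params["pointSize"] = 0.3
    return params


small = pick_url_params(10, 10)
big = pick_url_params(5000, 5000)
```

With a plotter `g`, this would be passed along as `g.settings(url_params=pick_url_params(len(g._nodes), len(g._edges)))`.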
The additional item was about pyarrow.lib.ArrowTypeError: ("Expected bytes, got a 'int' object", 'Conversion failed for column name with type object'). I'm not sure either -- Arrow expects all values under the same field name to be the same type, so when in doubt, I'd start with checking values in ._edges / ._nodes and, worst case, converting to strings:
cleaned_nodes_df = g._nodes.assign(some_col=g._nodes['some_col'].astype(str))
g2 = g.nodes(cleaned_nodes_df)
g2.plot()
But that assumes it's a data typing issue, and I'm not sure :)
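One way to test the data typing assumption before converting anything is to look for columns that hold more than one Python type. This is a minimal sketch over plain row dicts; `mixed_type_columns` is a hypothetical helper, not part of pygraphistry or pyarrow, and the sample rows are made up.

```python
from collections import defaultdict


def mixed_type_columns(records):
    """Return {column: sorted type names} for columns holding more than one type.

    Hypothetical helper; None values are skipped as missing data.
    """
    seen = defaultdict(set)
    for row in records:
        for col, value in row.items():
            if value is not None:
                seen[col].add(type(value).__name__)
    return {col: sorted(types) for col, types in seen.items() if len(types) > 1}


# Made-up rows: "name" mixes str and int, "component" is consistently int.
rows = [
    {"name": "a", "component": 1},
    {"name": 7, "component": 2},
]
mixed = mixed_type_columns(rows)
```

For a plotter `g`, the rows could come from `g._nodes.to_dict("records")`. Note that missing values in a data frame usually show up as float NaN rather than None, so they would be counted as a float type here.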
Well, right now I just added the setting "strongGravity":"true" and it seems that worked, at least I can see edges now. I hope it will also work for thousands of nodes together, since I need to plot some clusters.
Here is the result: https://hub.graphistry.com/graph/graph.html?dataset=508115478cb243eb93af529afce6273e
Is there any channel where I can ask "stupid" questions about graphistry? I have a lot of them and StackOverflow is not my friend in this case... Thanks for the help with this ticket!
EDIT: With 100 nodes still same problem: https://hub.graphistry.com/graph/graph.html?dataset=a4ad840515bc4b6fba7b999b99e611c3
GitHub is great even for simple stuff, others can search this too, so it's a useful help to future folk :)
The Slack channel's #help is great too, and for bigger / work stuff, our ZenDesk
.settings(url_params={
    "strongGravity": "true",
    "pointSize": 0.3
})
Seems to do it for the bigger link, afaict. This kind of stuff gets specific to different graphs, so while we do automate a bunch of it (you'll notice relative size changes as you zoom in/out), some last-mile tweaks do help in practice, esp. for extreme cases.
Closing:
pointSize to be smaller via one of the APIs
Describe the bug I'm using Neo4j and the Python driver with pygraphistry to plot my results. I can't say if this is a bug, but when I run APOC Cypher queries, the graphistry plot is not what I expected. I get what I expect with classical "MATCH" queries, but not with APOC virtual nodes and virtual relationships. I'm trying to manually build clustering algorithms because Neo4j doesn't give back a graph object when I execute them, so I created this query so I can plot something visible.
To Reproduce This is my code for the query:
Expected behavior Something like Neo4J plotting:
Actual behavior What did happen: https://hub.graphistry.com/graph/graph.html?dataset=3e1bb1d25e374bb290c4846783983eb2 Here you can see the plot. The couples are all attached, and I need to move them manually to make the edge appear between them. This is an example with 10 couples, but I'm going to plot something like 750k nodes, and I can't move them manually to see cluster results.
Additional context I'm sorry, but I'm learning Neo4j and pygraphistry for the first time, so I don't know how to print clustering algorithms. Also, if I change {component:component} into {name:component} I get a Python error:
pyarrow.lib.ArrowTypeError: ("Expected bytes, got a 'int' object", 'Conversion failed for column name with type object')
I don't know what that is, so I changed "name" to "component".