bricaud / graphexp

Interactive visualization of the Gremlin graph database with D3.js
Apache License 2.0

Add an option to "approximate" the graph info widget #48

Closed k4rthikr closed 1 year ago

k4rthikr commented 6 years ago

https://github.com/bricaud/graphexp/blob/fe930a9b78d561b0afedc98b4a28cbdfc777e70a/scripts/graphioGremlin.js#L56-L59

These queries are often expensive and require a full scan of the database unless the backend optimizes them internally. In my use case, I had to put a limit on these queries: I know what my data looks like, and 10k entries were enough to get an idea of the various labels in my system. That's most likely true for many others as well. Would it make sense to introduce a config option, "approximateGraphInfo", that can be turned on and changes the queries to something like:

if (approximateGraphInfo) {
    g.V().limit(LIMIT_VERTEX_COUNT_FOR_GRAPH_INFO_WIDGET).groupCount().by(label);
    g.E().limit(LIMIT_EDGE_COUNT_FOR_GRAPH_INFO_WIDGET).groupCount().by(label);
    ..
}
else {
    // retain the current code
    ..
}

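To make the proposal concrete, here is a minimal sketch of how the query strings could be built from such an option. The `config` object and the `buildGraphInfoQueries` helper are hypothetical names for illustration, not graphexp's actual code, and the sketch assumes the queries are sent to the server as Gremlin strings:

```javascript
// Hypothetical config; option names follow the proposal above,
// they are not part of graphexp's real configuration.
const config = {
  approximateGraphInfo: true,
  LIMIT_VERTEX_COUNT_FOR_GRAPH_INFO_WIDGET: 10000,
  LIMIT_EDGE_COUNT_FOR_GRAPH_INFO_WIDGET: 10000,
};

// Build the label-count queries as Gremlin strings.
// When approximateGraphInfo is on, a limit() step caps how many
// vertices/edges are read before groupCount() runs.
function buildGraphInfoQueries(cfg) {
  const vLimit = cfg.approximateGraphInfo
    ? `.limit(${cfg.LIMIT_VERTEX_COUNT_FOR_GRAPH_INFO_WIDGET})`
    : "";
  const eLimit = cfg.approximateGraphInfo
    ? `.limit(${cfg.LIMIT_EDGE_COUNT_FOR_GRAPH_INFO_WIDGET})`
    : "";
  return {
    nodes: `g.V()${vLimit}.groupCount().by(label)`,
    edges: `g.E()${eLimit}.groupCount().by(label)`,
  };
}

const queries = buildGraphInfoQueries(config);
console.log(queries.nodes);
console.log(queries.edges);
```

With the option on, the counts only describe the first N elements the backend happens to return, so the widget would show a sample of the label distribution, not exact totals.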
k4rthikr commented 6 years ago

Another alternative would be to change the query from groupCount() to just dedup() on the labels. I'm not sure the counts are really that helpful in the side widget.
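As a rough sketch of that alternative, the queries below list the distinct labels without counting them. `buildLabelListQueries` is a hypothetical helper, not an existing graphexp function, and the optional `limit` argument is an assumption added so the same cap can be combined with it:

```javascript
// Hypothetical helper: build Gremlin strings that return only the
// distinct labels, with no per-label counts.
// `limit` is optional; when it is a positive integer, a limit() step
// caps how many elements are read before deduplication.
function buildLabelListQueries(limit) {
  const cap = Number.isInteger(limit) && limit > 0 ? `.limit(${limit})` : "";
  return {
    nodeLabels: `g.V()${cap}.label().dedup()`,
    edgeLabels: `g.E()${cap}.label().dedup()`,
  };
}

console.log(buildLabelListQueries().nodeLabels);
console.log(buildLabelListQueries(10000).edgeLabels);
```

Whether the unlimited form avoids a full scan depends on the backend, so this only changes the shape of the result, not necessarily the cost.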

bricaud commented 6 years ago

Yes, replacing it with dedup() is a good idea. I wonder whether it is actually faster, or whether it also scans the full graph.

k4rthikr commented 5 years ago

Yeah, dedup() would mostly do a full scan too, unless the implementation tracks approximate counts internally. I resorted to applying a hard limit, as proposed in my first comment.