Multiple requests to neo4j in some places

ugurdogrusoz / visuall

Visuall: A tool for convenient construction of a web based visual analysis component

Multiple requests to neo4j in some places #301

Closed canbax closed 3 years ago

canbax commented 4 years ago

When we execute a query inside "Query By Rule" we will make 2 HTTP requests to the neo4j database server. One of them is to get the count of elements for the current query. The other one is to bring data as a table.

If the graph option is checked, we even make a third query to bring data as a graph to cytoscape.js. In fact, we can use the previous data of the table to generate graph elements. This is rather trivial.
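To illustrate reusing the table data, here is a minimal sketch that turns table rows into cytoscape.js-style node element objects. The row shape (`id`, `labels`, `properties`) and the function name `rows_to_cy_nodes` are assumptions made up for this example, not Visuall's actual data model, and only nodes are covered.

```python
from typing import Any, Dict, List


def rows_to_cy_nodes(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert table rows into cytoscape.js node element definitions.

    Each row is assumed (hypothetically) to look like
    {"id": 12, "labels": ["Person"], "properties": {...}}.
    """
    elements = []
    for row in rows:
        # Copy the properties so the table data is not mutated.
        data = dict(row.get("properties", {}))
        # Use a prefixed string id so node ids cannot clash with edge ids.
        data["id"] = f"n{row['id']}"
        elements.append({
            "group": "nodes",
            "data": data,
            # cytoscape.js accepts classes as a space-separated string.
            "classes": " ".join(row.get("labels", [])),
        })
    return elements
```

The resulting list could be passed to `cy.add(...)` on the client side, so the table response would be the only data source needed for nodes.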

When we bring some data which satisfies condition(s) of a "query", we only show the data partially because there might be millions of results. For pagination, we make an extra HTTP request to the neo4j database to learn the total number of elements that satisfy the rules of the "query".

Maybe we can use a single HTTP request to the server which returns partial results and also the total number of elements.

canbax commented 4 years ago

Below is an example Cypher query that returns both the total count of results and a slice of the results.

MATCH (x:Person)
WHERE x.birth_year >= 1990 AND x.death_year <= 2020
RETURN collect(ID(x))[10..13], collect(x)[10..13], size(collect(x))
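A rough sketch of how a client could send such a query with the slice bounds as parameters and unpack the single-row answer. The payload and response shapes follow Neo4j's transactional HTTP API as I understand it. The helper names, the `$start`/`$end` parameters, and the column aliases are assumptions for illustration. The endpoint URL is left out because it differs between Neo4j versions.

```python
from typing import Any, Dict, List, Tuple

# Same query as above, with the slice bounds passed as parameters
# (hypothetical names $start and $end) instead of literals.
PAGED_QUERY = (
    "MATCH (x:Person) "
    "WHERE x.birth_year >= 1990 AND x.death_year <= 2020 "
    "RETURN collect(ID(x))[$start..$end] AS ids, "
    "collect(x)[$start..$end] AS page, "
    "size(collect(x)) AS total"
)


def build_payload(page: int, page_size: int) -> Dict[str, Any]:
    """Build the JSON body for one request that fetches a page and the total."""
    start = page * page_size
    return {
        "statements": [
            {
                "statement": PAGED_QUERY,
                "parameters": {"start": start, "end": start + page_size},
            }
        ]
    }


def parse_response(body: Dict[str, Any]) -> Tuple[List[int], List[Dict[str, Any]], int]:
    """Extract (ids, page rows, total count) from the JSON response body."""
    if body.get("errors"):
        raise RuntimeError(f"query failed: {body['errors']}")
    # The query aggregates without grouping keys, so it yields exactly one row.
    ids, page, total = body["results"][0]["data"][0]["row"]
    return ids, page, total
```

With `page_size = 3`, the fourth page (`page = 3`) would use bounds 9 and 12, and `total` can drive the paginator without a second request.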

canbax commented 4 years ago

I executed 7 different queries on a large database (10M edges, 6M nodes).

Execution times can be seen at https://docs.google.com/document/d/167WWr3vhxcfXiyhWj1MLruZpe4ow4z7SYNZ2nOffJiI/edit

Based on average execution times, I can say the single HTTP request method is slightly faster.

Also, I observed that a very simple query such as "get all edges of a particular type" takes 90 seconds with 3 HTTP requests but 40 seconds with the single HTTP request version.

ugurdogrusoz commented 4 years ago

Looking at the results, it looks like we sometimes get significant improvements with a single call. Let's try to do that everywhere (Query by Rule, General Queries and Custom Queries).

canbax commented 4 years ago

There are 3 places where we request data from the database with HTTP requests:

"Map" > "Query By Rule" with "Database" checked. Here, it was making 2 or 3 HTTP requests. Now, we make 1 HTTP request.
"Database" > "General Queries". Here, it was making 2 HTTP requests. Now, we make 1 HTTP request.
"Database" > "Custom Queries". Here, it was making 2 or 3 HTTP requests. Now, we make 1 or 2 HTTP requests. In custom queries, we load a graph and a table, and in this context the table and the graph represent different data. For example, the table shows total amount and total count, but the graph does not need these values. We could try to bring both graph data and table data in one HTTP request, but this kind of Cypher query might be too complex. Also, every custom query is a manually written Cypher query, and each one might have an arbitrarily different number of parameters and results, so every query would have to be changed individually. This could be hard and tiring.

ugurdogrusoz commented 4 years ago

Good job. Thanks for nicely summarizing what's been done as well.