Open kmcginnes opened 4 months ago
Gremlin also seems to have issues when limiting the neighbors to expand in the side bar.
This is the Gremlin generated to find 4 movies (limited to a max of 4) connected to Sean Connery:
g.V(\"nm0000125\").
project(\"vertices\", \"edges\").
by(both().hasLabel(\"movie\").dedup().range(0, 4).fold()).
by(bothE().where(otherV().hasLabel(\"movie\")).dedup().fold())
Which returns the following and includes a lot of extra edges (75 in all) as there is no limit placed on the second by
{
'vertices': [v[tt0094226], v[tt0102034], v[tt0058150], v[tt0081633]],
'edges': [
e[tt0094226-actor-nm0000125][tt0094226-actor->nm0000125],
e[tt0102034-actor-nm0000125][tt0102034-actor->nm0000125],
e[tt0058150-actor-nm0000125][tt0058150-actor->nm0000125],
e[tt0081633-actor-nm0000125][tt0081633-actor->nm0000125],
e[tt0107969-actor-nm0000125][tt0107969-actor->nm0000125],
e[tt0070948-actor-nm0000125][tt0070948-actor->nm0000125],
e[tt0145734-actor-nm0000125][tt0145734-actor->nm0000125],
e[tt0075784-actor-nm0000125][tt0075784-actor->nm0000125],
e[tt0097328-actor-nm0000125][tt0097328-actor->nm0000125],
e[tt0117500-actor-nm0000125][tt0117500-actor->nm0000125],
e[tt0113071-actor-nm0000125][tt0113071-actor->nm0000125],
e[tt0058754-actor-nm0000125][tt0058754-actor->nm0000125],
e[tt0066995-actor-nm0000125][tt0066995-actor->nm0000125],
e[tt0057076-actor-nm0000125][tt0057076-actor->nm0000125],
e[tt0055262-actor-nm0000125][tt0055262-actor->nm0000125],
e[tt0073796-actor-nm0000125][tt0073796-actor->nm0000125],
e[tt0084750-actor-nm0000125][tt0084750-actor->nm0000125],
e[tt0071877-actor-nm0000125][tt0071877-actor->nm0000125],
e[tt0051364-actor-nm0000125][tt0051364-actor->nm0000125],
e[tt0091605-actor-nm0000125][tt0091605-actor->nm0000125],
e[tt0084920-actor-nm0000125][tt0084920-actor->nm0000125],
e[tt0074962-actor-nm0000125][tt0074962-actor->nm0000125],
e[tt0066767-actor-nm0000125][tt0066767-actor->nm0000125],
e[tt0052722-actor-nm0000125][tt0052722-actor->nm0000125],
e[tt0099810-actor-nm0000125][tt0099810-actor->nm0000125],
e[tt0067315-actor-nm0000125][tt0067315-actor->nm0000125],
e[tt0104839-actor-nm0000125][tt0104839-actor->nm0000125],
e[tt0075147-actor-nm0000125][tt0075147-actor->nm0000125],
e[tt0079550-actor-nm0000125][tt0079550-actor->nm0000125],
e[tt0109920-actor-nm0000125][tt0109920-actor->nm0000125],
e[tt0118661-actor-nm0000125][tt0118661-actor->nm0000125],
e[tt0091203-actor-nm0000125][tt0091203-actor->nm0000125],
e[tt0083947-actor-nm0000125][tt0083947-actor->nm0000125],
e[tt0097576-actor-nm0000125][tt0097576-actor->nm0000125],
e[tt0082869-actor-nm0000125][tt0082869-actor->nm0000125],
e[tt0055928-actor-nm0000125][tt0055928-actor->nm0000125],
e[tt0058329-actor-nm0000125][tt0058329-actor->nm0000125],
e[tt0116136-actor-nm0000125][tt0116136-actor->nm0000125],
e[tt0181536-actor-nm0000125][tt0181536-actor->nm0000125],
e[tt0113501-actor-nm0000125][tt0113501-actor->nm0000125],
e[tt0073341-actor-nm0000125][tt0073341-actor->nm0000125],
e[tt0137494-actor-nm0000125][tt0137494-actor->nm0000125],
e[tt0086006-actor-nm0000125][tt0086006-actor->nm0000125],
e[tt0070468-actor-nm0000125][tt0070468-actor->nm0000125],
e[tt0062512-actor-nm0000125][tt0062512-actor->nm0000125],
e[tt0084011-actor-nm0000125][tt0084011-actor->nm0000125],
e[tt0054898-actor-nm0000125][tt0054898-actor->nm0000125],
e[tt0095897-actor-nm0000125][tt0095897-actor->nm0000125],
e[tt0073906-actor-nm0000125][tt0073906-actor->nm0000125],
e[tt0311429-actor-nm0000125][tt0311429-actor->nm0000125],
e[tt0066090-actor-nm0000125][tt0066090-actor->nm0000125],
e[tt0060414-actor-nm0000125][tt0060414-actor->nm0000125],
e[tt0063592-actor-nm0000125][tt0063592-actor->nm0000125],
e[tt0100530-actor-nm0000125][tt0100530-actor->nm0000125],
e[tt0079013-actor-nm0000125][tt0079013-actor->nm0000125],
e[tt0059800-actor-nm0000125][tt0059800-actor->nm0000125],
e[tt0059274-actor-nm0000125][tt0059274-actor->nm0000125],
e[tt0079240-actor-nm0000125][tt0079240-actor->nm0000125],
e[tt1527819-actor-nm0000125][tt1527819-actor->nm0000125]
]
}
here is how it could also be written (simpler and no extra edges)
g.V("nm0000125").
bothE().as('e').
otherV().
hasLabel('movie').as('n').
limit(4).
select('n','e')
which returns
{'n': v[tt0094226], 'e': e[tt0094226-actor-nm0000125][tt0094226-actor->nm0000125]}
{'n': v[tt0102034], 'e': e[tt0102034-actor-nm0000125][tt0102034-actor->nm0000125]}
{'n': v[tt0081633], 'e': e[tt0081633-actor-nm0000125][tt0081633-actor->nm0000125]}
{'n': v[tt0058150], 'e': e[tt0058150-actor-nm0000125][tt0058150-actor->nm0000125]}
Community Note
Description
When performing neighbor expansion, the generated queries can be quite complex and resource intensive. Some create 50 or more network requests and can overwhelm the database server where the requests will begin to fail.
To Reproduce
Gremlin
I selected one node, and expanded 1 neighbor.
This second query will be repeated for each neighbor being expanded. So if you have 50 neighbors, you would have 50 requests.
openCypher
Expanding an author with 6 posts:
One request per neighbor, so 6 in this case:
SPARQL
Since the RDF data was different than PG data in the same Neptune instance, this is a single Airpot node expanding a single Country node. This resulted in 3 queries.
Expected behavior
Each language should be optimized. The query seems to be retrieving data for multiple parts of the app (graph and sidebar). We can split those tasks to further optimize, deferring the side bar work until needed.
Related issues
184
Tasks