apache / age

Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL.
https://age.apache.org
Apache License 2.0

Run MATCH with multiple edges query faster #2129

Open wgmayer0 opened 3 weeks ago

wgmayer0 commented 3 weeks ago

Is there a way to make this query run faster? Maybe there is a technique I am overlooking. I see that I am running MATCH multiple times, and I am not sure whether that has anything to do with it.

SELECT concat(a::text, ' / ', b::text, ' / ', c::text, ' / ', d::text, ' / ', e::text) AS concatenated_string
FROM cypher('hermech', $$ 
  MATCH (a:load_number)-[]-(b:origin) 
  MATCH (a)-[]-(c:pickup_time) 
  MATCH (a)-[]-(d:destination) 
  MATCH (a)-[]-(e:delivery_time) 
  MATCH (a)-[]-(d:destination) 
  MATCH (a)-[]-(e:delivery_time) 
  RETURN a.value, b.value, c.value, d.value, e.value
$$) AS (a agtype, b agtype, c agtype, d agtype, e agtype);

It seems to take about 4 seconds.

bravius commented 2 weeks ago

You have a couple of duplicate MATCH clauses. This would be more concise:

MATCH (a:load_number)-[]-(b:origin),
      (a)-[]-(c:pickup_time),
      (a)-[]-(d:destination),
      (a)-[]-(e:delivery_time)
RETURN a.value, b.value, c.value, d.value, e.value
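
For reference, here is a sketch of the full statement with that pattern placed back inside the cypher() call from the original question. It assumes the same 'hermech' graph and that the AGE extension is already installed. The session setup lines are the usual AGE preamble and may already be in place in your environment.

```sql
-- Usual AGE session setup (assumed; skip if already configured)
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

-- Same query as in the question, with the duplicate MATCH clauses removed
-- and the four patterns combined into a single MATCH
SELECT concat(a::text, ' / ', b::text, ' / ', c::text, ' / ', d::text, ' / ', e::text) AS concatenated_string
FROM cypher('hermech', $$
  MATCH (a:load_number)-[]-(b:origin),
        (a)-[]-(c:pickup_time),
        (a)-[]-(d:destination),
        (a)-[]-(e:delivery_time)
  RETURN a.value, b.value, c.value, d.value, e.value
$$) AS (a agtype, b agtype, c agtype, d agtype, e agtype);
```

Within a single MATCH, Cypher does not reuse the same edge for two parts of the pattern. That should not change the result here, because each part ends on a vertex with a different label.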

wgmayer0 commented 2 weeks ago

Thanks. As far as performance (time until the query completes) goes, is there no improvement possible? If not, maybe I can work on a caching solution where Node.js runs the query in the background so that the result stays up to date.

uhayat commented 2 weeks ago

@MuhammadTahaNaveed, do you think your PR (https://github.com/apache/age/pull/2117) will improve query performance in this specific case as well?

tronper123 commented 1 week ago

If your edges all point in the same direction, you could use directed arrows:

MATCH (a:load_number)-[]->(b:origin),
      (a)-[]->(c:pickup_time),
      (a)-[]->(d:destination),
      (a)-[]->(e:delivery_time)
RETURN a.value, b.value, c.value, d.value, e.value

Also, if you use a different edge label for each connection type and specify that label in the pattern, that will also improve performance:

MATCH (a:load_number)-[:origin_label]-(b:origin),
      (a)-[:pickup_label]-(c:pickup_time),
      (a)-[:destination_label]-(d:destination),
      (a)-[:delivery_label]-(e:delivery_time)
RETURN a.value, b.value, c.value, d.value, e.value
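
The two suggestions can be combined, and the effect can be measured rather than guessed. The sketch below assumes that every edge points away from the load_number vertex, that edge labels like the ones shown exist in the graph (the names are hypothetical placeholders), and that your AGE version accepts EXPLAIN ANALYZE at the start of the Cypher text. Check each of these against your own setup.

```sql
-- Directed, labelled version of the pattern, wrapped in EXPLAIN ANALYZE
-- so PostgreSQL reports the plan and actual execution time.
-- Edge label names are hypothetical; substitute the labels used in your graph.
SELECT *
FROM cypher('hermech', $$
  EXPLAIN ANALYZE
  MATCH (a:load_number)-[:origin_label]->(b:origin),
        (a)-[:pickup_label]->(c:pickup_time),
        (a)-[:destination_label]->(d:destination),
        (a)-[:delivery_label]->(e:delivery_time)
  RETURN a.value, b.value, c.value, d.value, e.value
$$) AS (a agtype, b agtype, c agtype, d agtype, e agtype);
```

Running the same statement with the original undirected, unlabelled pattern gives a plan and a timing to compare against.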

wgmayer0 commented 1 week ago

Excellent feedback!! Thank you!
