tigergraph / gsql-graph-algorithms

GSQL Graph Algorithms
https://docs.tigergraph.com/graph-ml/current/intro/
Apache License 2.0
192 stars 70 forks source link

Wcc Algorithm #150

Closed narayana-dev closed 4 months ago

narayana-dev commented 4 months ago

Hi all, I explored the WCC standard algorithm but the below snippet is bothering me a bit from algorithms/Community/connected_components/weakly_connected_components/standard/tg_wcc.gsql

# Propagate smaller internal IDs until no more ID changes can be Done
WHILE (S.size()>0) DO
    S = SELECT t
        FROM S:s -(e_type_set:e)- v_type_set:t
    ACCUM t.@min_cc_id += s.@min_cc_id // If s has smaller id than t, copy the id to t
    HAVING t.@min_cc_id != t.@min_cc_id';
END;
  1. Why do we need having clause here?
    • As per my understanding, if there is an edge from s to t then we always change t.@min_cc_id right? so checking whether min_cc_id changed or not feels redundant.
  2. What does the comment // If s has smaller id than t, copy the id to t imply?
    • Where are we checking whether s's id is smaller than t's id? is this is bug in the code or comment?

I found this snippet in WCC small world algorithm from algorithms/Community/connected_components/weakly_connected_components/small_world/tg_wcc_small_world.gsql

WHILE Vertices.size() > 0 DO
      Vertices = SELECT t 
                 FROM Vertices:s-(e_type:e)-v_type:t
                 WHERE  s.@min_cc_id < t.@min_cc_id
                 ACCUM  t.@min_cc_id += s.@min_cc_id;
  END;

If that comment was here it makes sensible. I thought this algo was used inside tiger graph since it was open source. Now it feels like this is just an open source copy and actual thing is different.

narayana-dev commented 4 months ago

Sorry about that, I think it varies as part of internal implementation.