RapidsAtHKUST / SubgraphMatching

In-Memory Subgraph Matching: An In-depth Study by Dr. Shixuan Sun and Prof. Qiong Luo
MIT License
133 stars 36 forks source link

Question about the FilterVertices::GQLFilter #4

Open s1ck opened 3 years ago

s1ck commented 3 years ago

In the paper, you explain the GraphQL filtering method as a two step process: local pruning and global refinement. For local pruning, the paper describes the method of using the "profile", i.e. the lexicographical ordering of the neighbor labels. In the implementation, you opt for the NLF filter instead (which degrades to a LDF filter if the OPTIMIZED_LABELED_GRAPH is disabled). Is there any particular reason for that decision? If it's mentioned in the paper, I might have missed it. Thanks!

shixuansun commented 3 years ago

In our experiments, we follow the parameter configuration in the original paper and set the radius of the profile as 1 because a large radius value can dramatically increase the time of building the profile, for example, when setting the radius as 2, the time complexity is $O(\sum_{v \in V}d_v ^ 2)$.

If the radius is 1, the pruning power of NLF is the same as the profile. For example, suppose that the profile of a data vertex v is AABBBCCCC. Then, its NLF filter would be {(A:2), (B:3), (C:4)}. Given a query vertex u, if v passes the NLF filter of u, then for each label l in the profile of u, the frequency of l in the profile of u is no greater than that of l in the profile of v, which means that the profile of u is a subsequence of the profile of v. As such, in our implementation, we directly use NLF filter instead of calculating the profiles of each vertex.

s1ck commented 3 years ago

Thanks for detailed the explanation. That makes perfect sense.