Open s1ck opened 3 years ago
In our experiments, we follow the parameter configuration in the original paper and set the radius of the profile as 1 because a large radius value can dramatically increase the time of building the profile, for example, when setting the radius as 2, the time complexity is $O(\sum_{v \in V}d_v ^ 2)$.
If the radius is 1, the pruning power of NLF is the same as the profile. For example, suppose that the profile of a data vertex v is AABBBCCCC. Then, its NLF filter would be {(A:2), (B:3), (C:4)}. Given a query vertex u, if v passes the NLF filter of u, then for each label l in the profile of u, the frequency of l in the profile of u is no greater than that of l in the profile of v, which means that the profile of u is a subsequence of the profile of v. As such, in our implementation, we directly use NLF filter instead of calculating the profiles of each vertex.
Thanks for detailed the explanation. That makes perfect sense.
In the paper, you explain the GraphQL filtering method as a two step process: local pruning and global refinement. For local pruning, the paper describes the method of using the "profile", i.e. the lexicographical ordering of the neighbor labels. In the implementation, you opt for the NLF filter instead (which degrades to a LDF filter if the
OPTIMIZED_LABELED_GRAPH
is disabled). Is there any particular reason for that decision? If it's mentioned in the paper, I might have missed it. Thanks!