YongjiangL / ML-Partition-Search

A Multi-layered Indexing Approach for Similarity Search in Graph Databases

Code logic about your paper #1

Open tiankonghenlan20113046 opened 4 years ago

tiankonghenlan20113046 commented 4 years ago

Dear code author @YongjiangL, I am a reader of your paper "Similarity Search in Graph Databases: A Multi-layered Indexing Approach". I have read the paper carefully and I am now working with your code. I admire how rigorous its logic is. However, I am new to this area and I have some questions about the code.

1. How do you compute subgraph isomorphism? From reading the code I know that you use DFS codes, but I also found a C++ header named "Ullman.h". I have been studying the console output to understand how the two relate, but I am still confused. I understand that the Ullmann algorithm is the basic algorithm for testing isomorphism, and I would like to use its result to speed up the verification time in A*. How can I do that?

2. About your dataset. I have read through it carefully. In your paper you mention a dataset named "AIDS", but in your dataset the vertex and edge labels contain only integers, not strings. Do you have the AIDS dataset in the same standard format as your dataset? I would like to run a test on it.

Looking forward to hearing from you. Yours sincerely, Woog.

YongjiangL commented 4 years ago

@tiankonghenlan20113046

  1. Ullman.h is the class that checks subgraph isomorphism between two graphs. A DFS code is a unique representation of a graph partition; it ensures that two isomorphic partitions share one index entry, represented by their canonical DFS code. See the paper "gSpan: Graph-Based Substructure Pattern Mining".

  2. We use integers to represent vertex and edge labels. For example:

         v 25 23    // denotes a vertex whose id is 25 and whose label is 23
         e 15 25 0  // denotes an edge (15,25) whose label is 0
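
For readers who want to load data in this format, here is a minimal Python sketch of a parser for the `v <id> <label>` and `e <u> <v> <label>` lines described in the answer above. It is not taken from the repository. The names `Graph` and `parse_graph` are hypothetical, and the handling of `//` comments, the treatment of edges as undirected, and the skipping of any other line type are assumptions. The label `7` on vertex 15 in the sample is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    # vertex id -> integer label
    vertex_labels: dict = field(default_factory=dict)
    # (u, v) with u <= v -> integer label (assumption: edges are undirected)
    edge_labels: dict = field(default_factory=dict)


def parse_graph(text):
    """Parse 'v <id> <label>' and 'e <u> <v> <label>' lines into a Graph."""
    graph = Graph()
    for raw in text.splitlines():
        # Assumption: anything after '//' is a comment and can be dropped.
        line = raw.split("//", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if parts[0] == "v" and len(parts) == 3:
            vid, label = int(parts[1]), int(parts[2])
            graph.vertex_labels[vid] = label
        elif parts[0] == "e" and len(parts) == 4:
            u, v, label = int(parts[1]), int(parts[2]), int(parts[3])
            # Store each edge under a normalized key so (u, v) == (v, u).
            graph.edge_labels[(min(u, v), max(u, v))] = label
        # Any other line type is skipped in this sketch.
    return graph


sample = """
v 15 7
v 25 23    // vertex with id 25 and label 23
e 15 25 0  // edge (15,25) with label 0
"""

g = parse_graph(sample)
print(g.vertex_labels[25])       # label of vertex 25
print(g.edge_labels[(15, 25)])   # label of edge (15,25)
```

A dataset whose labels are strings could be converted to this format by first assigning each distinct string an integer id and then writing the `v` and `e` lines with those ids.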