HKUST-KnowComp / FMG

KDD17_FMG
An issue about code:cal_rar_block() #7

Closed NingMa-AI closed 6 years ago

NingMa-AI commented 6 years ago

Hi, I don't understand what this function does in the code file '200k_commu_mat_computation.py'. Could you give me a tip or explain it? Thank you very much!

hzhaoaf commented 6 years ago

@MaNingChina As stated in the instructions, the file "200k_commu_mat_computation.py" is used to compute all meta-graph based similarity matrices. In our KDD paper, we design 9 meta-graphs on the Yelp dataset, and there is a function named "cal_yelp_all" in which each meta-graph based similarity matrix is computed accordingly.

The "cal_rar_block" function computes the subpath "R --> A <-- R" in blocks and saves the result to disk in advance, because there is not enough memory to compute "W_{RA} \cdot W_{RA}^T" directly. (That is why this function is called at Line 459, before cal_comm_mat_USUB is executed.)

Generally speaking, "cal_rar_block" produces the resulting matrix of "W_{RA} \cdot W_{RA}^T". It consists of the following key steps:

  1. Split the adjacency matrix "W_{RA}" into a number of small blocks (10000 * 10 in our case, controlled by the parameter "step").
  2. Compute the dot products of these small submatrix blocks.
  3. Because "W_{RA}" is quite dense, the resulting matrix of "W_{RA} \cdot W_{RA}^T" is also quite dense, which makes it infeasible to store and to use in subsequent computation. So for each row of the resulting matrix, denoted as "W_{RAR}", we keep only the largest 100 entries (controlled by the parameter "topK").
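For readers who want to see the idea in isolation, the three steps above can be sketched roughly as follows. This is not the code from "200k_commu_mat_computation.py": the function name `blockwise_topk_gram` and its arguments are hypothetical, it splits only along rows, and it keeps the pruned result in memory instead of saving each block to disk.

```python
import numpy as np
import scipy.sparse as sp


def blockwise_topk_gram(w, step=2, top_k=2):
    """Compute W @ W.T one row block at a time, keeping the top_k largest
    entries in each row of the result (hypothetical helper, not from the repo)."""
    w = sp.csr_matrix(w)
    wt = w.T.tocsc()
    n_rows = w.shape[0]
    blocks = []
    for start in range(0, n_rows, step):
        # Only this (<= step) x n_rows slice of the product is dense in memory.
        block = (w[start:start + step] @ wt).toarray()
        k = min(top_k, block.shape[1])
        # Column indices of the k largest entries in each row of the block.
        keep = np.argpartition(block, -k, axis=1)[:, -k:]
        rows = np.arange(block.shape[0])[:, None]
        pruned = np.zeros_like(block)
        pruned[rows, keep] = block[rows, keep]
        # Store the pruned block sparsely; the original code writes to disk here.
        blocks.append(sp.csr_matrix(pruned))
    return sp.vstack(blocks).tocsr()


# Small usage example on a random sparse 6 x 4 matrix.
w_ra = sp.random(6, 4, density=0.5, format="csr", random_state=0)
w_rar = blockwise_topk_gram(w_ra, step=2, top_k=3)
print(w_rar.shape)  # (6, 6)
```

Pruning each row block right after it is computed means the full dense product never has to exist at once, which is the point of doing the computation in blocks.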

If you want to know more details about this function, you can write a test function yourself and set the "DEBUG" variable to True at Line 351. I wrote a test case for a small matrix in this function.

NingMa-AI commented 6 years ago

Thank you! Thanks to your answer, I now know how to test the function!