YuxiangRen / Heterogeneous-Deep-Graph-Infomax

HDGI code

IMDB dataset #4

Open bioannidis opened 4 years ago

bioannidis commented 4 years ago

Hi, can you provide an example of how to run the code on the IMDB dataset?

YuxiangRen commented 4 years ago

I will use the code in DGI-HGCN as an example. The same applies to DGI-HGAT, although the line numbers may differ.

To run the code on the IMDB dataset, modify utils/process.py, starting at line #153:

1. In line #153, change the URL to point to the IMDB dataset.
2. In line #158, change the URL to point to the initial feature matrix of Movie. In my paper, this is a 4275*6344 matrix.
3. In line #171, change the URL. This line refers to three adjacency matrices based on different meta-paths. In the paper, we use MAM, MDM, and MKM to construct the three adjacency matrices.
4. In lines #177 to #181, adjust the sizes of the training, validation, and test sets to match the experimental setting.
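Before training, it may help to check that the files referenced in steps 2 and 3 have compatible shapes. The following is a minimal sketch, not code from the repository: `check_imdb_inputs` is a hypothetical helper, and the toy arrays stand in for the real 4275*6344 feature matrix and the MAM, MDM, and MKM matrices.

```python
import numpy as np
import scipy.sparse as sp


def check_imdb_inputs(features, adjs):
    """Hypothetical pre-flight check (not part of the HDGI repository).

    features: (n_movies, n_features) array or sparse matrix.
    adjs: dict mapping a meta-path name to its adjacency matrix.
    Raises ValueError if any adjacency is not n_movies x n_movies.
    """
    n_movies = features.shape[0]
    for name, adj in adjs.items():
        if adj.shape != (n_movies, n_movies):
            raise ValueError(
                f"{name}: expected shape {(n_movies, n_movies)}, got {adj.shape}"
            )
    return n_movies


# Toy stand-ins for the real data: 4 movies with 6 features each.
toy_features = np.zeros((4, 6))
toy_adjs = {name: sp.identity(4, format="csr") for name in ("MAM", "MDM", "MKM")}
n_checked = check_imdb_inputs(toy_features, toy_adjs)
```

With the real data, the same call would be expected to pass only if every meta-path matrix is 4275 by 4275.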

bioannidis commented 4 years ago

Thanks for the response.

The adjacency matrices are not square, and I get the following error:

    adj = process.normalize_adj(adj + sp.eye(adj.shape[0]))
    ValueError: inconsistent shapes
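For context, `sp.eye(adj.shape[0])` is always square, so the addition can only succeed when `adj` is square as well. Below is a minimal reproduction on a made-up 3-by-5 movie-keyword matrix (toy data, not the real IMDB files). `add_self_loops` is a hypothetical stand-in for the expression passed to `process.normalize_adj`.

```python
import numpy as np
import scipy.sparse as sp

# Toy bipartite movie-keyword matrix: 3 movies x 5 keywords (made-up data).
movie_keyword = sp.csr_matrix(np.array([
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
]))


def add_self_loops(adj):
    """Add an identity matrix sized by the row count, as the failing line does."""
    return adj + sp.eye(adj.shape[0])


try:
    add_self_loops(movie_keyword)   # 3x5 plus 3x3: shapes do not match
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)

# A square matrix of the same row count goes through without error.
square_ok = add_self_loops(sp.identity(3, format="csr"))
```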

bioannidis commented 4 years ago

Could you please provide a working version of your code for the IMDB dataset? I want to add your model to a library we are developing. Thank you

YuxiangRen commented 4 years ago

I have looked into the issue. The adjacency matrix in the data folder is the movie-keyword connection matrix, which is not square. The adjacency matrix should be movie-keyword-movie, which can be computed from the movie-keyword adjacency matrix. I found that this matrix is missing from the data folder. I will create the movie-keyword-movie, movie-director-movie, and movie-actor-movie matrices as soon as possible. You can also create them yourself. Sorry for the inconvenience. I think the other two datasets should work.
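One way to build such a matrix yourself is to multiply the bipartite matrix by its transpose, so that two movies are connected whenever they share at least one keyword (or director, or actor). This is a sketch under assumptions, not necessarily how the repository's matrices were generated: `metapath_adjacency` is a hypothetical helper, the result is binarized, and the diagonal is cleared because the training code adds self-loops via `sp.eye`.

```python
import numpy as np
import scipy.sparse as sp


def metapath_adjacency(bipartite):
    """Hypothetical helper: movie-by-X incidence matrix -> movie-movie adjacency.

    Assumptions: links are binary (shared at least one X) and the diagonal
    is cleared, since self-loops are added later by the training code.
    """
    b = sp.csr_matrix(bipartite, dtype=np.float64)
    counts = b @ b.T                                  # (i, j) = number of shared X
    counts = sp.csr_matrix(counts - sp.diags(counts.diagonal()))
    counts.eliminate_zeros()                          # drop the zeroed diagonal entries
    counts.data[:] = 1.0                              # binarize the remaining links
    return counts


# Toy movie-keyword matrix: movies 0 and 1 share keyword 2; movie 2 shares none.
movie_keyword = np.array([
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
])
mkm = metapath_adjacency(movie_keyword)

# The result is square, so adding self-loops no longer fails.
with_loops = mkm + sp.eye(mkm.shape[0])
```

The same function could be applied to movie-director and movie-actor matrices to obtain MDM and MAM.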

bioannidis commented 4 years ago

Yes, please update the GitHub repo once you have fixed the issue. Thanks for the help.

YuxiangRen commented 4 years ago

I have updated the code. Please check the code in DGI-HGCN and the updated data folder in data/IMDB/3-class.