THUDM / GATNE

Source code and dataset for KDD 2019 paper "Representation Learning for Attributed Multiplex Heterogeneous Network"
MIT License
525 stars 141 forks

About comparative experiments #43

lalw closed 4 years ago

lalw commented 4 years ago

Hello, I want to run comparative experiments on my own dataset, but I couldn't find an official Python implementation of metapath2vec. Which metapath2vec code did you use for the comparison experiments, and could you share it? Thank you very much!

cenyk1230 commented 4 years ago

Hi @lalw,

Sorry for the late reply. We used an internal Alibaba Group implementation of metapath2vec. You could use an unofficial implementation of metapath2vec or write your own. The most important part of metapath2vec is the heterogeneous random walk, which is easy to implement.
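For illustration, a heterogeneous (metapath-guided) random walk could look roughly like the sketch below. This is not the internal implementation mentioned above. The function name `metapath_walk`, the dictionary-based graph format, and the Author/Paper toy graph are all assumptions made for this example, and neighbors are sampled uniformly among those with the required type.

```python
import random


def metapath_walk(adj, node_type, start, metapath, walk_length, rng=random):
    """Return one random walk from `start` whose node types follow `metapath`.

    adj:       dict mapping node -> list of neighbor nodes (assumed format)
    node_type: dict mapping node -> type label
    metapath:  list of type labels whose first and last entries match, so the
               pattern can repeat, e.g. ["Author", "Paper", "Author"]
    """
    if len(metapath) < 2 or metapath[0] != metapath[-1]:
        raise ValueError("metapath must start and end with the same type")
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first metapath type")

    period = len(metapath) - 1  # the type pattern repeats after this many steps
    walk = [start]
    while len(walk) < walk_length:
        # Type that the next node must have according to the metapath.
        wanted = metapath[len(walk) % period]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk


# Hypothetical toy graph: two authors and two papers.
node_type = {"a1": "Author", "a2": "Author", "p1": "Paper", "p2": "Paper"}
adj = {
    "a1": ["p1", "p2"],
    "a2": ["p1"],
    "p1": ["a1", "a2"],
    "p2": ["a1"],
}

walk = metapath_walk(
    adj, node_type, "a1", ["Author", "Paper", "Author"],
    walk_length=7, rng=random.Random(0),
)
print(walk)
```

The walks produced this way would then be fed to a skip-gram model as sentences. The original metapath2vec paper also describes a heterogeneous negative sampling variant, which this sketch does not cover.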

lalw commented 4 years ago

Thanks for your answer! I still have a question. Suppose I perform random walks on the metapath [1] "Academy-Discipline-Project-Discipline-Academy" to get the Academy embeddings. In addition to the Academy embeddings, I also need the Discipline embeddings. A metapath with Discipline as the starting point would be [2] "Discipline-Project-Discipline". If I only run random walks on [1], are the resulting Discipline embeddings correct? (My concern is that Discipline is not the starting point of the metapath.) If not, can the two metapaths [1] and [2], which have different starting points, be used simultaneously in one model, or should the model be run twice? Looking forward to your answer, thank you very much!

cenyk1230 commented 4 years ago

The skip-gram method trains embeddings for all nodes that appear in the random walks, not only for the starting nodes of the metapaths. The Discipline embeddings trained from random walks on metapath [1] can be used directly for downstream tasks.
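To make this concrete, here is a small sketch of how skip-gram training pairs are typically generated from a single walk. The helper `skipgram_pairs`, the node names, and the window size are assumptions for this example only. The point is that every position in the walk serves as a center node, so nodes in the middle of the metapath receive training signal just like the starting node.

```python
def skipgram_pairs(walk, window):
    """List (center, context) pairs for every position in the walk."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Hypothetical walk following an Academy-Project-Discipline-Project-Academy pattern.
walk = ["academy_1", "project_1", "discipline_1", "project_2", "academy_2"]
pairs = skipgram_pairs(walk, window=2)

# Every node in the walk appears as a center, including the Discipline node.
centers = {center for center, _ in pairs}
print(sorted(centers))
print([ctx for center, ctx in pairs if center == "discipline_1"])
```

With a window of 2, the Discipline node in the middle of this walk is paired with all four other nodes, so its embedding is updated even though the walk starts from an Academy node.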

lalw commented 4 years ago

Sorry, path [1] was wrong. It should be [1] "Academy-Project-Discipline-Project-Academy".

1. My concern is that this metapath is not designed specifically for Discipline. Can it still fully capture the semantic and structural information of Discipline nodes?
2. Suppose there is another node type, Researcher, which cannot be combined into path [1] or [2]. Can I set the paths to [1] "Academy-Project-Discipline-Project-Academy", [2] "Discipline-Project-Discipline", and [3] "Researcher-Project-Discipline-Project-Researcher" in one run to get the embeddings for all node types?

I am a beginner and have many questions. Thank you for your patience!