xiangyue9607 / BioNEV

Graph Embedding Evaluation / Code and Datasets for "Graph Embedding on Biomedical Networks: Methods, Applications, and Evaluations" (Bioinformatics 2020)
MIT License

Hyper-parameters for word2vec #18

Closed LucaCappelletti94 closed 3 years ago

LucaCappelletti94 commented 3 years ago

Hello,

Thank you for this amazing work! In the context of my PhD thesis, I need to compare your work against some experimental models, and I could not find some of the hyper-parameters in either the paper or the supplementary materials.

I wanted to ask you which hyper-parameters were used for the different graphs in the skip-gram-based models, specifically:

  1. What window size is used for the context? That is, how near must a node be to the central node to be considered part of its context during the embedding process. I see that your code uses 10 as the default, but in small connected graphs such as the ones you considered, if the small-world hypothesis holds, a window of 10 would make every node contextual to every other node in the same connected component.
  2. What loss function was used? NCE loss? A full softmax? If NCE loss was used, how many negative samples were drawn?
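To make question 1 concrete, here is a minimal, hypothetical sketch (not BioNEV's actual code) of how a symmetric context window slides over a random walk, illustrating why a large window on a short walk makes nearly every node context for every other:

```python
def contexts(walk, window):
    """Yield (center, context) node pairs as skip-gram would see them,
    using a symmetric window of the given size on each side."""
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, walk[j]

walk = ["n1", "n2", "n3", "n4"]  # toy random walk over node IDs

# With window=10, every node in this short walk is context for every other:
# 4 centers x 3 neighbors = 12 pairs.
wide = list(contexts(walk, 10))

# With window=1, only adjacent nodes in the walk are paired.
narrow = list(contexts(walk, 1))
```

With window=10 the walk above yields 12 pairs, versus 6 with window=1, which is the effect described in the question.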

Thank you and have a nice day, Luca

xiangyue9607 commented 3 years ago

Hi Luca,

Thanks for your interest.

  1. If we do not explicitly state a hyperparameter in the paper/supplementary materials, it is set to its default value.

  2. We use the gensim word2vec library to train the skip-gram-based models. You can check the gensim library documentation for more details. All parameters are set to their defaults.
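For reference, gensim's Word2Vec defaults (hs=0, negative=5, as of the gensim 3.x releases contemporary with this thread) correspond to negative sampling with 5 noise samples per positive pair, not a full softmax. A minimal sketch of the skip-gram negative-sampling loss on toy vectors, assuming that default (the vectors and helper names here are hypothetical, not from BioNEV or gensim):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sgns_loss(center, context, negatives):
    """Skip-gram negative-sampling loss for one (center, context) pair:
    L = -log sigma(u_ctx . v_c) - sum_k log sigma(-u_neg_k . v_c)."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    loss = -math.log(sigmoid(dot(center, context)))
    for neg in negatives:  # gensim default draws 5 noise samples per pair
        loss -= math.log(sigmoid(-dot(center, neg)))
    return loss

# Toy 2-d vectors: one positive context, one negative sample.
loss = sgns_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
```

This is the objective gensim optimizes when hs=0 and negative>0; setting hs=1 would switch to hierarchical softmax instead.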

Thanks!

Xiang