xiangyue9607 / BioNEV

Graph Embedding Evaluation / Code and Datasets for "Graph Embedding on Biomedical Networks: Methods, Applications, and Evaluations" (Bioinformatics 2020)
MIT License

Hyper-parameters for word2vec #18

Closed LucaCappelletti94 closed 3 years ago

LucaCappelletti94 commented 3 years ago

Hello,

Thank you for this amazing work! In the context of my PhD thesis, I need to compare your work against some experimental models, and I could not find some of the hyper-parameters in either the paper or the supplementary materials.

I wanted to ask you which hyper-parameters were used for the different graphs in the skip-gram-based models, specifically:

  1. What window size is used for the context? That is, how near must a node be to the central node to be considered part of its context during the embedding process. I see that your code uses 10 as the default, but in small connected graphs such as the ones you considered, if the small-world hypothesis holds, a window of 10 would make every node contextual to every other node in the same connected component.
  2. What loss function was used? NCE loss? A full softmax? If NCE loss was used, how many negative samples were drawn?
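To make question 1 concrete, here is a minimal, hypothetical sketch (not BioNEV's actual code) of how a symmetric context window slides over a random walk, illustrating why a large window on a short walk makes nearly every node context for every other:

```python
def contexts(walk, window):
    """Yield (center, context) node pairs as skip-gram would see them,
    using a symmetric window of the given size on each side."""
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, walk[j]

walk = ["n1", "n2", "n3", "n4"]  # toy random walk over node IDs

# With window=10, every node in this short walk is context for every other:
# 4 centers x 3 neighbors = 12 pairs.
wide = list(contexts(walk, 10))

# With window=1, only adjacent nodes in the walk are paired.
narrow = list(contexts(walk, 1))
```

With window=10 the walk above yields 12 pairs, versus 6 with window=1, which is the effect described in the question.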

Thank you and have a nice day, Luca

xiangyue9607 commented 3 years ago

Hi Luca,

Thanks for your interest.

  1. If we do not explicitly state a hyperparameter in the paper/supplementary materials, it is set to its default value.

  2. We use the gensim word2vec library to train the skip-gram-based models. You can check the gensim library documentation for more details. All parameters are set to their defaults.
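For reference, gensim's Word2Vec defaults (hs=0, negative=5, as of the gensim 3.x releases contemporary with this thread) correspond to negative sampling with 5 noise samples per positive pair, not a full softmax. A minimal sketch of the skip-gram negative-sampling loss on toy vectors, assuming that default (the vectors and helper names here are hypothetical, not from BioNEV or gensim):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sgns_loss(center, context, negatives):
    """Skip-gram negative-sampling loss for one (center, context) pair:
    L = -log sigma(u_ctx . v_c) - sum_k log sigma(-u_neg_k . v_c)."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    loss = -math.log(sigmoid(dot(center, context)))
    for neg in negatives:  # gensim default draws 5 noise samples per pair
        loss -= math.log(sigmoid(-dot(center, neg)))
    return loss

# Toy 2-d vectors: one positive context, one negative sample.
loss = sgns_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
```

This is the objective gensim optimizes when hs=0 and negative>0; setting hs=1 would switch to hierarchical softmax instead.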

Thanks!

Xiang