aditya-grover / node2vec

http://snap.stanford.edu/node2vec/
MIT License
2.61k stars, 912 forks

How to do negative sampling for skip-gram in data processing of spark version? #64

Closed. formath closed this issue 5 years ago

DozenCoder commented 5 years ago

Spark's Word2Vec does not support negative sampling; it implements only hierarchical softmax. See https://spark.apache.org/docs/2.3.2/mllib-feature-extraction.html#word2vec

formath commented 5 years ago

Thanks. After the random walks, we can use skip-gram to generate <src_node, positive_node> pairs. How can we sample negative nodes for each pair during the data processing phase rather than during training?
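One possible way to do this offline is sketched below. It is not taken from the node2vec repository or from Spark. It assumes the common word2vec convention of drawing negatives from the node frequency distribution raised to the 0.75 power, and it rejects only candidates equal to the source or positive node. The function names (`skipgram_pairs`, `build_sampler`, `add_negatives`) and the parameters `window`, `power`, `k`, and `max_tries` are hypothetical and chosen for illustration.

```python
import random
from bisect import bisect_right
from collections import Counter
from itertools import accumulate


def skipgram_pairs(walks, window):
    """Yield (src, positive) pairs from each walk within a symmetric window."""
    for walk in walks:
        for i, src in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    yield src, walk[j]


def build_sampler(walks, power=0.75, seed=0):
    """Return a function that draws one node with probability ~ count**power."""
    counts = Counter(node for walk in walks for node in walk)
    nodes = sorted(counts)
    # Cumulative weights allow sampling by binary search.
    cum = list(accumulate(counts[n] ** power for n in nodes))
    rng = random.Random(seed)

    def sample():
        r = rng.random() * cum[-1]
        # Clamp guards against a floating-point edge case at the upper bound.
        return nodes[min(bisect_right(cum, r), len(nodes) - 1)]

    return sample


def add_negatives(pairs, sample, k, max_tries=100):
    """Attach up to k negatives to each pair, skipping the src and positive node."""
    for src, pos in pairs:
        negs, tries = [], 0
        while len(negs) < k and tries < max_tries:
            cand = sample()
            tries += 1
            if cand != src and cand != pos:
                negs.append(cand)
        yield src, pos, negs


if __name__ == "__main__":
    # Toy walks standing in for node2vec random-walk output (assumed data).
    walks = [[1, 2, 3, 4], [2, 3, 5, 1]]
    sampler = build_sampler(walks)
    for row in add_negatives(skipgram_pairs(walks, window=1), sampler, k=2):
        print(row)
```

In a Spark job, the same idea could presumably be applied by computing the node counts once, broadcasting the cumulative table, and running the per-pair sampling inside `mapPartitions`. Note that a sampled negative may still be a true neighbor of the source node elsewhere in the graph; this sketch does not filter those out.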