benedekrozemberczki / karateclub

Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)
https://karateclub.readthedocs.io
GNU General Public License v3.0

Node2Vec implementation not the same as authors' implementation #124

Closed wendywangwwt closed 1 year ago

wendywangwwt commented 1 year ago

Hi -

I noticed some differences between your implementation and the authors' implementation (https://github.com/aditya-grover/node2vec). Mainly the differences are the hyper-parameters used in Word2Vec:

| param | authors' | karateclub |
| --- | --- | --- |
| hs | - (gensim default is 0) | 1 |
| sg | 1 | - (gensim default is 0) |
| learning rate | - (gensim default is 0.025) | 0.05, configurable |
| window size | 10, configurable | 5, configurable |
| workers | 8, configurable | 4, configurable (this one doesn't matter much) |

It is possible that gensim recently changed its default values (https://radimrehurek.com/gensim/models/word2vec.html#gensim.models.word2vec.Word2Vec), though; I didn't trace back through the historical docs.
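To make the comparison concrete, here is a minimal sketch that merges each implementation's explicit Word2Vec settings with the gensim defaults quoted above and lists where the effective values differ. The keyword names (`hs`, `sg`, `alpha`, `window`, `workers`) are gensim's `Word2Vec` arguments; the default values are the ones reported in this issue and are an assumption that may not hold for every gensim version. The helper names are hypothetical.

```python
# gensim defaults as quoted in this issue (assumption: not re-checked per gensim version).
GENSIM_DEFAULTS = {"hs": 0, "sg": 0, "alpha": 0.025}

# Values each implementation sets explicitly, taken from the comparison above.
AUTHORS = {"sg": 1, "window": 10, "workers": 8}
KARATECLUB = {"hs": 1, "alpha": 0.05, "window": 5, "workers": 4}


def effective(overrides):
    """Return the defaults with an implementation's explicit settings applied."""
    params = dict(GENSIM_DEFAULTS)
    params.update(overrides)
    return params


def diff(a, b):
    """Map each parameter whose value differs to its (a, b) pair."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


authors_params = effective(AUTHORS)
karateclub_params = effective(KARATECLUB)
differences = diff(authors_params, karateclub_params)

for name, (authors_value, karateclub_value) in differences.items():
    print(f"{name}: authors'={authors_value}, karateclub={karateclub_value}")
```

Either dictionary could then be unpacked into gensim, e.g. `Word2Vec(walks, **authors_params)`, where `walks` is a hypothetical list of random walks (each a list of string node ids), to train on the same walks under both settings and compare the embeddings.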

Another part is the support for weighted graphs. From karateclub's source code, my understanding is that the implemented node2vec method doesn't support edge weights, while the authors' implementation does.
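For reference, this is roughly what edge-weight support means for the walk itself. It is a minimal sketch of the biased transition rule described in the node2vec paper, not code from either repository: each candidate edge's weight is multiplied by a bias that depends on the return parameter `p` and the in-out parameter `q`. The dict-of-dicts graph layout and the function name are assumptions made for illustration.

```python
def transition_probs(graph, prev, curr, p=1.0, q=1.0):
    """Normalized probabilities of stepping from `curr`, having arrived from `prev`.

    `graph` is a hypothetical adjacency mapping: graph[u][v] is the weight of edge (u, v).
    """
    scores = {}
    for nxt, weight in graph[curr].items():
        if nxt == prev:
            bias = 1.0 / p  # stepping back to the previous node
        elif nxt in graph[prev]:
            bias = 1.0      # staying at distance 1 from the previous node
        else:
            bias = 1.0 / q  # moving further away from the previous node
        scores[nxt] = weight * bias
    total = sum(scores.values())
    return {node: score / total for node, score in scores.items()}


# Toy undirected weighted graph, stored symmetrically.
toy_graph = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 3.0, "d": 2.0},
    "c": {"a": 1.0, "b": 3.0},
    "d": {"b": 2.0},
}

probs = transition_probs(toy_graph, prev="a", curr="b", p=2.0, q=0.5)
print(probs)
```

Setting every weight to 1.0 reduces this to the unweighted walk, so on graphs whose weights carry real information the two variants sample different walks and therefore feed different corpora to Word2Vec.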

Can you please shed some light on why these were changed? The different settings of hs and sg are the most confusing to me, as they likely have a large impact on the resulting embeddings.

Thank you!

benedekrozemberczki commented 1 year ago

The default hyperparameters are different. The model is the same. Have a nice day!