eXascaleInfolab / JUST


reproducing the scores #4

Closed amblee0306 closed 4 years ago

amblee0306 commented 4 years ago

Hi there,

The paper mentions that it is possible to achieve ~89% F1 score on the DBLP dataset, but that doesn't seem to be achievable. I tested with alpha 0.3 and 0.5, m=3, and window sizes 6 and 10. All of the combinations gave approximately 82% F1 score. What else can I tune?

Thanks!

ranahussein commented 4 years ago

Hello, here are the parameters using 80% training data:

- embedding dimensions: 128
- random walk length: 100
- number of walks: 10
- window size: 10
- alpha: 0.5
- m: 3

ranahussein commented 3 years ago

Hello, a quick comment because Gensim and other libraries have been updated and some functions are deprecated: when calling Word2Vec, please make sure that the training algorithm is skip-gram, as the default in some library versions is CBOW.