Closed hitercs closed 4 years ago
Hi,
Many tools can be used to extract a text corpus from a Wikipedia dump. I suggest: https://github.com/attardi/wikiextractor
For training a word2vec model with gensim, please see https://radimrehurek.com/gensim/models/word2vec.html
Hope the above information is helpful.
Thanks.
Hi,
Thanks for your reply. Yes, I can train it myself using the above tools, but it would be great if you could share the embedding file used in your experiments, so as to eliminate any differences caused by separate training runs.
Thanks a lot.
Hi,
Maybe you can try this one, which I trained myself: https://drive.google.com/open?id=1d_xrUPRLQjpcZrlm_cpKJO3jWBFKYhcl
Thanks.
Hi,
Thanks for your work. I saw that the code in AAAI19/exp_Limaye/train_cnn.py uses a word2vec model called ~/w2v_model/enwiki_model/word2vec_gensim, but I can't find it in this repo. According to the paper, it was trained on the latest Wikipedia dump. Could you please share this file so we can run the code?
Thanks a lot.