Hi Ruidan,
Thanks for your great work. I have a question about the execution order of `word2vec.py` and `preprocess.py` when using other data. The README says `word2vec` should be run first, but after reading the code I found that `preprocess` reads raw text from the `dataset` folder and writes its output to the `preprocessed_data` folder, and `word2vec` then reads the preprocessed data and generates word embeddings. Is it correct to run `preprocess` first to clean the data and then run `word2vec` to generate the embeddings? Looking forward to your reply. Thanks!
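
P.S. To make the order I have in mind concrete, here is a minimal sketch. It assumes both scripts are launched from the directory that contains them. The helper names (`PIPELINE`, `build_commands`, `run_pipeline`) are hypothetical and only for illustration, and I am not assuming any particular command-line arguments, so anything the scripts need would have to be passed through `extra_args`.

```python
import subprocess
import sys

# Order inferred from reading the code, not from the README:
# preprocess.py writes preprocessed_data, which word2vec.py then reads.
PIPELINE = ["preprocess.py", "word2vec.py"]


def build_commands(extra_args=()):
    """Return one command per script, in the order they would be run."""
    return [[sys.executable, script, *extra_args] for script in PIPELINE]


def run_pipeline(extra_args=()):
    """Run each step in order, stopping at the first failure."""
    for cmd in build_commands(extra_args):
        # check=True raises if a step fails, so word2vec.py is never
        # started when preprocess.py did not finish successfully.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` would run `preprocess.py` first and `word2vec.py` second, which is the order I am asking about.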