cemoody / lda2vec

MIT License
3.15k stars 629 forks

File Missing: 'GoogleNews-vectors-negative300.bin' #90

Closed ghost closed 5 years ago

ghost commented 5 years ago

In twenty_newsgroups/data, I can't run preprocess.py because a file is missing. The script stops at the following code block:

# Fill in the pretrained word vectors
n_dim = 300
fn_wordvc = 'GoogleNews-vectors-negative300.bin'
vectors, s, f = corpus.compact_word_vectors(vocab, filename=fn_wordvc)

whcjimmy commented 5 years ago

This is the official website for that file: https://code.google.com/archive/p/word2vec/. You can download it from the "Pre-trained word and phrase vectors" section.
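
For anyone hitting the same error, here is a minimal sketch of a check that could run before the compact_word_vectors call, so a missing file fails early with a pointer to the download page. The helper names require_vectors_file and gunzip_vectors are hypothetical and not part of lda2vec. The sketch also assumes the download arrives gzip-compressed (a .bin.gz file) and that the decompressed .bin should sit next to preprocess.py; adjust the paths if your setup differs.

```python
import gzip
import os
import shutil

# File name taken from the preprocess.py snippet quoted in this issue.
FN_WORDVC = 'GoogleNews-vectors-negative300.bin'
# Download page given in the reply above.
DOWNLOAD_PAGE = 'https://code.google.com/archive/p/word2vec/'


def gunzip_vectors(src_gz, dst=FN_WORDVC):
    """Decompress a .bin.gz download to dst (hypothetical helper).

    Assumes the archive is gzip-compressed; streams the copy so the
    large file is never held in memory all at once.
    """
    with gzip.open(src_gz, 'rb') as f_in, open(dst, 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst


def require_vectors_file(path=FN_WORDVC):
    """Return the absolute path if the file exists, else raise with a hint."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            "{} not found. Download it from {} and place it at {}".format(
                os.path.basename(path), DOWNLOAD_PAGE, os.path.abspath(path)))
    return os.path.abspath(path)
```

In preprocess.py this could be used as fn_wordvc = require_vectors_file(), leaving the existing compact_word_vectors call unchanged.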