aatkinson / deep-named-entity-recognition

Use RNNs to identify entities in news queries

word2vec file #3

Open deepanshuagarwal150 opened 6 years ago

deepanshuagarwal150 commented 6 years ago

How did you create the word2vec file?

v0idwalker commented 6 years ago

(I am not the author.) My guess is that he took the representations from the pretrained 300-dimensional word2vec vectors trained on Google News. You can download them here (or search for them): https://github.com/mmihaltz/word2vec-GoogleNews-vectors/blob/master/GoogleNews-vectors-negative300.bin.gz

After decompressing the archive, I load it like this (in Python):

import gensim
w2v = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)

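Afterwards, you can look up a pretrained word vector with:

wv_man = w2v['man']

which returns the pretrained 300-dimensional vector for 'man'. (Index the KeyedVectors object directly; the extra .wv attribute is deprecated in gensim 3.x and removed in gensim 4.)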

It can do much more (see the gensim documentation). Be advised that loading this binary blob required (at least in my case) 7.8 GB of RAM just for this one task.
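
To get from the loaded vectors to a smaller word2vec file, something like the following might work. This is only a sketch and a guess, not the author's actual method: the function names `load_pretrained` and `export_vectors`, the output file name, and the one-word-per-line text layout are all my own assumptions. The `limit` argument of `load_word2vec_format` reads only the first N vectors from the file, which should lower the memory needed.

```python
def load_pretrained(path, limit=500000):
    """Load only the first `limit` vectors from the binary file to save RAM.

    Hypothetical helper; requires the third-party gensim package.
    """
    import gensim

    return gensim.models.KeyedVectors.load_word2vec_format(
        path, binary=True, limit=limit
    )


def export_vectors(words, vectors, out_path):
    """Write one line per known word: the word, then its vector values.

    `vectors` is anything that supports `word in vectors` and
    `vectors[word]`, e.g. a gensim KeyedVectors object or a plain dict.
    The text layout is an assumption, not the repository's confirmed format.
    Returns the words that had no vector and were skipped.
    """
    missing = []
    with open(out_path, "w", encoding="utf-8") as out:
        for word in words:
            if word not in vectors:
                missing.append(word)
                continue
            values = " ".join(f"{float(x):.6f}" for x in vectors[word])
            out.write(f"{word} {values}\n")
    return missing
```

Usage would be roughly `w2v = load_pretrained('GoogleNews-vectors-negative300.bin')` followed by `export_vectors(my_vocabulary, w2v, 'word2vec_subset.txt')`, where `my_vocabulary` is the list of words in your own dataset. Words outside the loaded vectors are returned so you can decide how to handle them.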