stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0
6.86k stars 1.51k forks

Where to get the vocabulary for the pre-trained vectors? #126

Open githubg0 opened 6 years ago

githubg0 commented 6 years ago

Hello, I'm new to GloVe. I downloaded "Common Crawl (840B tokens, 2.2M vocab, cased, 300d vectors, 2.03 GB download): glove.840B.300d.zip" and unzipped it, but I could not find a vocabulary in the file. Can anyone tell me how to get the vocabulary that corresponds to the file I downloaded?

akanshajainn commented 6 years ago

Using glove-python, you can load the GloVe txt file and get the vocab and vectors, look up the most similar keywords, and do other things.

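from glove import Glove

glovefilepath = "glove.840B.300d.txt"  # path to the unzipped vectors file
obj = Glove.load_stanford(glovefilepath)
# glove-python keeps the word -> index mapping in `dictionary`
print(list(obj.dictionary)[:10])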
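
If installing glove-python is not an option, the vocabulary can probably also be read straight from the unzipped text file. The sketch below assumes each line holds one token followed by its vector values, separated by single spaces; the function name `read_glove_vocab` and the `dim` parameter are made up for illustration and are not part of GloVe or glove-python.

```python
def read_glove_vocab(path, dim=300):
    """Return the tokens of a GloVe-style text file, in file order.

    Assumes each line is: token, then `dim` space-separated numbers.
    """
    words = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            # Split from the right so a token that itself contains
            # spaces is kept in one piece.
            parts = line.rsplit(" ", dim)
            words.append(parts[0])
    return words
```

Calling `read_glove_vocab("glove.840B.300d.txt", dim=300)` should then give the list of words in the same order as the vectors. Splitting from the right is a deliberate choice here, since a plain `split()` would misread any token that includes a space.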