stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0

Ability to read pre-trained embeddings into R #143

Closed Drew2019 closed 4 years ago

Drew2019 commented 5 years ago

Has anyone read the pre-trained embeddings (e.g., "glove.6B.50d.txt") into R? I've had zero luck reading this text file into R as a word embedding matrix (one row per word, one column per vector dimension), and I'm unfamiliar with the formatting of the .txt file. If you've done this successfully, either from a saved .txt file or directly from the site, how did you convert the text to a matrix in R?

AngledLuffa commented 4 years ago

The text format is just one word per line followed by a space-separated list of numbers, in this case 50. I don't know any R, but I have to imagine that isn't hard to read.
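For anyone landing here later, the format described above (a word, then space-separated numbers on each line) can be parsed with base R alone. What follows is a minimal sketch, not an official loader: `read_glove` is a hypothetical helper name, and the three sample lines written to a temporary file are made-up values standing in for a real file such as "glove.6B.50d.txt".

```r
# Hypothetical helper: parse "word v1 v2 ... vN" lines into a numeric matrix
# with one row per word and one column per vector dimension.
read_glove <- function(path) {
  lines <- readLines(path, encoding = "UTF-8")
  # Split on single spaces only, so tokens such as " or # are kept literally.
  parts <- strsplit(lines, " ", fixed = TRUE)
  words <- vapply(parts, function(p) p[1], character(1))
  vecs  <- lapply(parts, function(p) as.numeric(p[-1]))
  # Every line is assumed to carry the same number of values.
  stopifnot(length(unique(lengths(vecs))) == 1)
  m <- matrix(unlist(vecs), nrow = length(vecs), byrow = TRUE)
  rownames(m) <- words
  m
}

# Tiny made-up sample in the same layout (3 words, 3 dimensions).
tmp <- tempfile(fileext = ".txt")
writeLines(c("the 0.1 -0.2 0.3", "\" 0.4 0.5 -0.6", "# 0.7 0.8 0.9"), tmp)

emb <- read_glove(tmp)
dim(emb)        # rows = words, columns = dimensions
emb["the", ]    # look up a single word vector by row name
```

Splitting each line by hand sidesteps a likely stumbling block: with its default `quote` and `comment.char` settings, `read.table` treats `"` and `#` specially, and both can appear as tokens in the vocabulary. For a real file, replace `tmp` with the path to the downloaded .txt file.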