dselivanov / text2vec

Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
http://text2vec.org
851 stars · 136 forks

embedding evaluation #283

Closed jwijffels closed 6 years ago

jwijffels commented 6 years ago

Hi @dselivanov. I've been working recently on the https://github.com/bnosac/ruimtehol package, which does, amongst other things, word/sentence/article embeddings (and a bit more) based on the Starspace C++ library. I would like to compare the embeddings that come out of that package to GloVe embeddings. The Starspace paper already does this, but I would like to do it myself. As the text2vec package already implements GloVe embeddings, do you by any chance have a script lying around that somehow evaluates embeddings generated in different runs/toolsets?

dselivanov commented 6 years ago

Hi. Unfortunately not.
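(For later readers: although no ready-made comparison script exists in this thread, text2vec's GloVe vignette demonstrates analogy-based evaluation, and the package exposes similarity utilities. The sketch below is an assumption-laden outline, not a tested benchmark: it presumes you have a Google-analogies file such as `questions-words.txt` and embedding matrices with one row per term and rownames set to the vocabulary, and that your text2vec version exports `prepare_analogy_questions()`, `check_analogy_accuracy()`, and `sim2()`.)

```r
library(text2vec)

# word_vectors: numeric matrix, one row per term, rownames = vocabulary terms.
# questions_file: path to the classic word-analogy test set (questions-words.txt).
evaluate_embedding <- function(word_vectors, questions_file) {
  # keep only analogy questions whose terms are in this embedding's vocabulary
  questions <- prepare_analogy_questions(questions_file, rownames(word_vectors))
  # per-category accuracy on the analogy task ("king - man + woman = queen")
  check_analogy_accuracy(questions_list = questions,
                         m_word_vectors = word_vectors)
}

# the same call works for embeddings from any toolset (GloVe, Starspace, ...)
# as long as they are shaped into a rownamed matrix:
# res_glove     <- evaluate_embedding(word_vectors_glove, "questions-words.txt")
# res_starspace <- evaluate_embedding(word_vectors_starspace, "questions-words.txt")

# a simpler, toolset-agnostic sanity check: cosine neighbours of a probe word
# sim <- sim2(word_vectors_glove,
#             word_vectors_glove["paris", , drop = FALSE],
#             method = "cosine", norm = "l2")
# head(sort(sim[, 1], decreasing = TRUE), 5)
```

Comparing the per-category accuracies (and the nearest-neighbour lists for a handful of probe words) across the two toolsets gives a rough but reproducible intrinsic evaluation.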


KafeelBasha commented 5 years ago

With reference to the text2vec documentation at https://cran.r-project.org/web/packages/text2vec/vignettes/glove.html#word_embeddings:

I have created a word-vector matrix of dimension (10000, 100), but in order to work with Keras, the embedding layer also requires a sequence length. I have tried using pre-trained word vectors such as glove.6B.100d.txt, but loading them takes a long time and RStudio terminates abruptly; I am working on an 8 GB RAM machine.

Is there a way to use the word vectors created with text2vec inside a Keras embedding layer?
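(A possible approach, sketched under assumptions rather than tested: following the pattern in the Deep Learning with R examples, you can build a weight matrix whose rows line up with the Keras tokenizer's word index, then load it into the embedding layer with `set_weights()`. Here `word_vectors` is assumed to be your (10000, 100) text2vec matrix with rownames set to the terms; `texts`, `max_words`, and `max_len` are placeholders you would supply, and exact keras R API details may vary by version.)

```r
library(keras)

embedding_dim <- 100L   # must match ncol(word_vectors)
max_words     <- 10000L # vocabulary size kept by the tokenizer
max_len       <- 50L    # sequence length: pad/truncate each document to this

# tokenize and pad the raw texts so every input has length max_len
tokenizer <- text_tokenizer(num_words = max_words) %>% fit_text_tokenizer(texts)
x <- pad_sequences(texts_to_sequences(tokenizer, texts), maxlen = max_len)

# build the weight matrix: row (index + 1) holds the vector for that token
# (the +1 offset is because Keras reserves index 0 and R matrices are 1-based)
word_index <- tokenizer$word_index
embedding_matrix <- matrix(0, nrow = max_words, ncol = embedding_dim)
for (word in names(word_index)) {
  index <- word_index[[word]]
  if (index < max_words && word %in% rownames(word_vectors))
    embedding_matrix[index + 1, ] <- word_vectors[word, ]
}

model <- keras_model_sequential() %>%
  layer_embedding(input_dim = max_words, output_dim = embedding_dim,
                  input_length = max_len) %>%
  layer_flatten() %>%
  layer_dense(units = 1, activation = "sigmoid")

# load the text2vec vectors into the embedding layer and freeze them
get_layer(model, index = 1) %>%
  set_weights(list(embedding_matrix)) %>%
  freeze_weights()
```

Words that the text2vec vocabulary does not cover keep zero vectors; the heavy glove.6B.100d.txt download is not needed at all, since the matrix comes straight from your own fitted model.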