Closed luciebaudoin closed 3 years ago
There are examples at http://www.bnosac.be/index.php/blog/100-word2vec-in-r
Thanks for your quick reply! I've been through this code before, and it doesn't solve my problem. I understand how to load pretrained vectors, but when I ask for predictions, I only get the predictions from that pretrained model, regardless of my local corpus.
For example, I run:

model <- read.word2vec(file = "/model.bin", normalize = TRUE)
l1 <- predict(model, newdata = c("word"), type = "nearest", top_n = 10)
This doesn't make the pretrained model work on my local corpus of text. It only returns the nearest words already stored in that model.
I'm looking to do something in R similar to the chrono_train function used with word2vec in Python, and I can't figure out how.
Again, thanks for your invaluable help.
What do you mean by chrono_train? Where is that defined?
Sorry if I am not being clear.
I want to model a corpus by initializing with the vectors of a previous model.
This will allow me to build chronologically trained vectors, as in the following article by Emma Rodman, but in R: https://static1.squarespace.com/static/5ca7d04ea09a7e68ba44e707/t/5cda219af4e1fc94236bc0cf/1557799325771/Diachronic_Word_Vectors___Political_Analysis_Final_Version.pdf
In Python, this "chrono_train" function looks like this:

def chrono_train(n_iterations, current_corpus, previous_model, output_model):
    for k in range(n_iterations):
        sentence_samples = resample(current_corpus)
        model = Word2Vec.load(previous_model)
        run = k+1
        model.save(output_model)
I hope this is a bit clearer... I'm still new to this, so there's a lot of trial and error.
There is no option in this R package to train on your own corpus starting from an initial set of word vectors. This package lets you either train a model from scratch on your own corpus or load a pretrained model and query it.
Both of these use cases are shown at http://www.bnosac.be/index.php/blog/100-word2vec-in-r. If you want to do transfer learning (continue training from an existing set of word vectors), the R package ruimtehol implements this: see section '5. Transfer learning' at https://cran.r-project.org/web/packages/ruimtehol/vignettes/ground-control-to-ruimtehol.pdf
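As a rough illustration of what the word2vec package does support, here is a minimal sketch that trains a model from scratch on a local corpus, writes it to disk, and reads it back for querying. The toy sentences, the temporary file name, and the hyperparameter values are assumptions made up for this example. Note that predict() only looks up vectors already stored in the model; it never updates them from new text.

```r
library(word2vec)

# Hypothetical toy corpus standing in for a local corpus of text;
# repeated so that the tiny vocabulary has enough training examples.
sentences <- c(
  "the senate passed the tax bill",
  "the house debated the tax bill",
  "the court reviewed the state law",
  "the state passed a new law"
)
corpus <- rep(sentences, times = 200)

# Use case 1: train a new model on the local corpus.
# Training always starts from randomly initialised vectors.
set.seed(123)
model_local <- word2vec(x = corpus, type = "skip-gram", dim = 15,
                        iter = 5, min_count = 1, threads = 1)

# Use case 2: load a stored model and query it.
# Here the "pretrained" model is simply the one saved a moment ago.
path <- file.path(tempdir(), "local_model.bin")
write.word2vec(model_local, file = path)
model_loaded <- read.word2vec(file = path, normalize = TRUE)

# Nearest neighbours come only from the vectors stored in the loaded model.
nearest <- predict(model_loaded, newdata = "bill", type = "nearest", top_n = 3)

# The embedding matrix has one row per vocabulary term.
emb <- as.matrix(model_local)
```

With a real pretrained file, only the read.word2vec() and predict() steps would apply, and the results would reflect the corpus the file was trained on.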
Thank you very much!
BTW, if you plan to do Procrustes matrix alignment, feel free to share your code. I'd be interested in it as well.
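For reference, orthogonal Procrustes alignment needs only base R. The sketch below is one possible approach, not code from this package: the function name procrustes_align and the toy matrices are hypothetical. It assumes both embedding matrices have words as row names and the same number of columns, which is the shape as.matrix() returns for a word2vec model.

```r
# Orthogonal Procrustes: find the rotation R that minimises
# || A %*% R - B ||_F over the words the two matrices share.
procrustes_align <- function(A, B) {
  stopifnot(ncol(A) == ncol(B))
  shared <- intersect(rownames(A), rownames(B))
  stopifnot(length(shared) > 0)
  A_s <- A[shared, , drop = FALSE]
  B_s <- B[shared, , drop = FALSE]
  # SVD of t(A_s) %*% B_s gives U D V'; the optimal rotation is U V'.
  s <- svd(crossprod(A_s, B_s))
  R <- s$u %*% t(s$v)
  # Apply the rotation to every row of A, not only the shared words.
  list(rotation = R, aligned = A %*% R)
}

# Toy check: B is a random "embedding", A is B rotated by 30 degrees.
set.seed(42)
words <- c("tax", "vote", "bill", "law", "state")
B <- matrix(rnorm(5 * 3), nrow = 5, dimnames = list(words, NULL))
theta <- pi / 6
Q <- rbind(c(cos(theta), -sin(theta), 0),
           c(sin(theta),  cos(theta), 0),
           c(0,           0,          1))
A <- B %*% t(Q)

fit <- procrustes_align(A, B)
# Largest absolute difference after alignment; should be numerically zero.
max_error <- max(abs(fit$aligned - B))
```

With two models trained on different periods, A and B would be as.matrix(model_period1) and as.matrix(model_period2), and fit$aligned could then be compared with B word by word.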
Hi,
Although you mention it's a possibility, I can't find clear code showing how to use a downloaded pretrained model on a local corpus of text with the R word2vec package. Can you help me with that? Thank you!