hutaohutc closed this issue 6 years ago
Hi. Results are reproducible only if you use 1 thread. With more threads they will not be reproducible, because fitting is done via asynchronous SGD without locks (so there are race conditions).
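To see why the order of updates matters, here is a toy sketch in base R. It is not text2vec's actual fitting code, and `sgd_step` is a hypothetical helper made up for illustration. Each SGD step depends on the current weight, so applying the same two updates in a different order (as can happen when threads interleave) ends at a different value.

```r
# One SGD step for squared error (w * x - y)^2 on a single observation.
# Hypothetical helper for illustration only.
sgd_step <- function(w, x, y, lr = 0.1) {
  grad <- 2 * (w * x - y) * x
  w - lr * grad
}

# The same two observations, applied in two different orders.
w_ab <- sgd_step(sgd_step(0, x = 1, y = 2), x = 2, y = 3)
w_ba <- sgd_step(sgd_step(0, x = 2, y = 3), x = 1, y = 2)

# The final weights differ because the second step sees a different w.
isTRUE(all.equal(w_ab, w_ba))
```

With a single thread the update order is fixed, which is why that setting can be made repeatable.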
Thank you for your answer. But when I use n_threads = 1 in the fit_transform function, I still cannot reproduce the result. Here is my code:
set.seed(42)
wv_main = glove$fit_transform(tcm, n_iter = 50, convergence_tol = 0.01, n_threads = 1)
Actually, it seems n_threads has no effect (I forgot to set the number of threads from the n_threads argument). You can call RcppParallel::setThreadOptions(1) before initializing the model:
library(text2vec)
# movie_review ships with text2vec, so attach the package before loading it
data("movie_review")
it = itoken(movie_review$review, tolower, word_tokenizer)
v = create_vocabulary(it)
v = prune_vocabulary(v, term_count_min = 20)
tcm = create_tcm(it, vocab_vectorizer(v))
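# Limit RcppParallel to a single thread before creating the model
RcppParallel::setThreadOptions(1)
# Reset the seed before each model initialization
set.seed(42)
gl = GloVe$new(word_vectors_size = 50, x_max = 10, vocabulary = v, shuffle = FALSE)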
temp1 = gl$fit_transform(tcm, n_iter = 2, n_threads = 1)
set.seed(42)
gl = GloVe$new(word_vectors_size = 50, x_max = 10, vocabulary = v, shuffle = FALSE)
temp2 = gl$fit_transform(tcm, n_iter = 2, n_threads = 1)
identical(temp1, temp2)
# TRUE
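The seed-then-fit-twice pattern can be wrapped in a small helper. This is only a sketch in base R: `check_reproducible` and `fit_fn` are hypothetical names, not part of text2vec, and the helper assumes all randomness in `fit_fn` comes from R's RNG and that the fit runs on a single thread.

```r
# Run a fitting function twice from the same seed and compare the results.
# fit_fn: a zero-argument function that builds and fits a fresh model.
check_reproducible <- function(fit_fn, seed = 42) {
  set.seed(seed)
  first <- fit_fn()
  set.seed(seed)   # reset before the second run, not just once
  second <- fit_fn()
  identical(first, second)
}

# Stand-in for a model fit: any function that draws from R's RNG.
check_reproducible(function() matrix(rnorm(6), nrow = 2))
```

For the GloVe case, `fit_fn` would be a function that runs the GloVe$new(...) and fit_transform(...) calls from the example, with RcppParallel::setThreadOptions(1) called beforehand.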
Hi, I still could not reproduce the vectors, even after setting the number of threads with RcppParallel::setThreadOptions(1) and setting the seed as recommended. identical(temp1, temp2) still returns FALSE.
I updated the example above: call set.seed(42) before each model initialization.
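The reason a single set.seed() call is not enough can be shown in base R alone. The sketch below assumes the model takes its random initial values from R's RNG (which the seed having an effect suggests): every draw advances the RNG state, so a second initialization without a reset starts from a different state.

```r
set.seed(42)
a <- runif(3)   # first draw after seeding
b <- runif(3)   # continues the same stream, so it differs from a

set.seed(42)
c <- runif(3)   # seed reset, so this repeats the first draw

identical(a, c)  # TRUE
identical(a, b)  # FALSE
```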
I tried set.seed() in R, but it did not work: I cannot reproduce the result. Could you please tell me how to reproduce it?