RGLab / Rtsne.multicore

R wrapper for Multicore t-SNE

Results not reproducible #4

Open mattmotoki opened 7 years ago

mattmotoki commented 7 years ago

Hi, the documentation suggests that reproducible results can be achieved by setting the seed in R. This works for the original Rtsne function, but it doesn't seem to work for Rtsne.multicore.

library(Rtsne.multicore)

iris_unique <- unique(iris)
mat <- as.matrix(iris_unique[,1:4])

# repeat calculation
set.seed(42)
tsne_out1 <- Rtsne.multicore(mat)

set.seed(42)
tsne_out2 <- Rtsne.multicore(mat)

# plot results
plot(tsne_out1$Y, col=iris_unique$Species, main="first run")
plot(tsne_out2$Y, col=iris_unique$Species, main="second run")

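[image: first run] https://user-images.githubusercontent.com/13989564/27210395-28554000-5207-11e7-9bc8-b4a5203212b8.png
[image: second run] https://user-images.githubusercontent.com/13989564/27210396-28554f32-5207-11e7-98f7-5e99fab4ae67.png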

gfinak commented 7 years ago

I suggest posting this issue to the original author of the multicore implementation.

mattmotoki commented 7 years ago

Thanks for the suggestion.

gfinak commented 7 years ago

Perhaps I spoke too soon. I believe we can fix the seed in this implementation.

mattmotoki commented 7 years ago

Okay great. It's not a big issue for me, but thank you for the useful package.

gfinak commented 7 years ago

We need to initialize the random number generator in each thread with a different but deterministic seed via srand() inside the #pragma omp parallel block of tsne.cpp. Ideally, we would either pass the seed in via the R function wrapper or use one of R's random number generators, so that set.seed() calls in R have the expected effect.
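
For illustration, here is a minimal sketch of the per-thread seeding idea. It is not the actual tsne.cpp code: init_embedding and base_seed are hypothetical names, and the sketch swaps srand() for a thread-local std::mt19937, since rand() is not guaranteed to keep separate state per thread. It assumes each thread derives its seed as base_seed plus its OpenMP thread number and owns a fixed set of rows:

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#else
// Fallbacks so the sketch also builds without OpenMP (single thread).
static int omp_get_thread_num() { return 0; }
static int omp_get_num_threads() { return 1; }
#endif

// Hypothetical helper, not part of Rtsne.multicore: fills an n x dims
// row-major matrix with small Gaussian values, reproducibly for a given seed.
std::vector<double> init_embedding(int n, int dims, unsigned base_seed) {
    assert(n >= 0 && dims > 0);
    std::vector<double> Y(static_cast<std::size_t>(n) * dims);

    #pragma omp parallel
    {
        const int tid = omp_get_thread_num();
        const int nthreads = omp_get_num_threads();

        // Different but deterministic seed per thread (assumed scheme).
        std::mt19937 gen(base_seed + static_cast<unsigned>(tid));
        std::normal_distribution<double> dist(0.0, 1e-4);

        // Fixed row ownership: thread tid handles rows tid, tid + nthreads, ...
        for (int i = tid; i < n; i += nthreads) {
            for (int d = 0; d < dims; ++d) {
                Y[static_cast<std::size_t>(i) * dims + d] = dist(gen);
            }
        }
    }
    return Y;
}
```

With this scheme, two calls such as init_embedding(150, 2, 42) should return identical values as long as the thread count is the same. A different thread count changes which rows each generator fills, so the output would change too. Passing base_seed down from the R wrapper is the part that would let set.seed-style control reach the C++ side.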