CWTSLeiden / networkanalysis

Java package that provides data structures and algorithms for network analysis.
MIT License

Seed doesn't affect the results #24

Closed PetrTsurinov closed 1 year ago

PetrTsurinov commented 1 year ago

I tried to use the --seed option, but the results are slightly different on every run. Could you please help me?

# expression_scaled, k and n_threads are defined earlier in the session.
# Approximate k-nearest-neighbour search (L2 distance).
all_knn <- RcppHNSW::hnsw_knn(expression_scaled, k = k, distance = 'l2',
                               n_threads = n_threads)
ind <- all_knn$idx

# Parallel Jaccard metric
links <- FastPG::rcpp_parallel_jce(ind)

# Remove duplicate links, then shift node ids from R's 1-based to 0-based numbering
links <- FastPG::dedup_links(links)
links[,1] <- links[,1] - 1
links[,2] <- links[,2] - 1

jar_path <- "networkanalysis-1.1.0.jar"
network_path <- file.path(tempdir(), "network.txt")
clusters_path <- file.path(tempdir(), "clusters.txt")
# Write the edge list as tab-separated text; scipen = 10 avoids scientific notation
withr::with_options(c(scipen = 10), write.table(links, network_path, row.names = FALSE, col.names = FALSE, sep = "\t"))
# Run the clustering with a fixed seed
system(paste("java -Xmx30g -cp", jar_path, "nl.cwts.networkanalysis.run.RunNetworkClustering -q Modularity --seed 5024 --weighted-edges -o", clusters_path, network_path))

vtraag commented 1 year ago

It seems you are running Java from within an R environment. Could you please try to reproduce the problem directly from the terminal/command line? If you can reproduce it there, could you then please share self-contained code that allows us to reproduce the exact same problem?
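
A terminal-only check along these lines could look like the sketch below. It reuses the jar name and flags from the command in the report; the network.txt path, the output file names, and the helper functions run_clustering and compare_outputs are assumptions made for illustration. The -Xmx setting from the report is left out here and can be added back as needed.

```shell
#!/bin/sh
# Sketch: run the clustering twice on the SAME network file with the SAME seed,
# then check whether the two cluster files are byte-identical.

JAR="networkanalysis-1.1.0.jar"       # jar named in the report
NETWORK="${NETWORK:-network.txt}"     # assumed path to a fixed copy of the network file

# Hypothetical helper: $1 is the output file for the clusters.
# The flags are copied from the command in the original report.
run_clustering() {
    java -cp "$JAR" nl.cwts.networkanalysis.run.RunNetworkClustering \
        -q Modularity --seed 5024 --weighted-edges -o "$1" "$NETWORK"
}

# Hypothetical helper: prints "identical" or "different" for two files.
compare_outputs() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}

# Usage (requires Java and the jar in the current directory):
#   run_clustering clusters_run1.txt
#   run_clustering clusters_run2.txt
#   compare_outputs clusters_run1.txt clusters_run2.txt
```

If the two runs report "identical" for a fixed input file, the variation most likely comes from a step before the Java call.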

PetrTsurinov commented 1 year ago

Thanks for the recommendation! Indeed, the problem seems to occur earlier, in the FastPG steps: I can see slight differences in the network file between runs. Calling "set.seed(seed_number)" doesn't help, so I need to look deeper into the functions used.
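
To narrow down how two network files from separate runs differ, a rough sketch like the following may help. It assumes two saved copies of the network file (the file names in the usage comment are hypothetical), and the helper functions are illustrative only. It separates two cases: the same lines written in a different order, and lines whose content (node pairs or weights) actually differs. Which case applies here is not established in this thread.

```shell
#!/bin/sh
# Sketch: classify the difference between two edge-list files.

# Checksum of the file exactly as written.
fingerprint() {
    cksum < "$1"
}

# Checksum after sorting the lines, so line order no longer matters.
sorted_fingerprint() {
    LC_ALL=C sort "$1" | cksum
}

# Prints one of: "identical", "same lines, different order", "different content".
diagnose() {
    if [ "$(fingerprint "$1")" = "$(fingerprint "$2")" ]; then
        echo "identical"
    elif [ "$(sorted_fingerprint "$1")" = "$(sorted_fingerprint "$2")" ]; then
        echo "same lines, different order"
    else
        echo "different content"
    fi
}

# Usage (hypothetical file names for copies saved after two runs):
#   diagnose network_run1.txt network_run2.txt
```

If only the order differs, sorting the links before writing the file might be enough to make the input stable; if the content differs, the nondeterminism sits in the nearest-neighbour or Jaccard step itself.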