When I run pagoda.pathway.wPCA() multiple times on the same input, I get different results, even when I call set.seed() before each run. How can I make this function produce the same results for the same dataset?
R code to test on the tutorial data:
library("scde")
library(org.Hs.eg.db)
data(pollen)
cd <- clean.counts(pollen)
x <- gsub("^Hi_(.*)_.*", "\\1", colnames(cd))
l2cols <- c("coral4", "olivedrab3", "skyblue2", "slateblue3")[as.integer(factor(x, levels = c("NPC", "GW16", "GW21", "GW21+3")))]
data(knn)
varinfo <- pagoda.varnorm(knn, counts = cd, trim = 3/ncol(cd), max.adj.var = 5, n.cores = 1, plot = TRUE)
varinfo <- pagoda.subtract.aspect(varinfo, colSums(cd[, rownames(knn)]>0))
# translate gene names to ids
ids <- unlist(lapply(mget(rownames(cd), org.Hs.egALIAS2EG, ifnotfound = NA), function(x) x[1]))
rids <- names(ids); names(rids) <- ids
# convert GO lists from ids to gene names
gos.interest <- unique(c(ls(org.Hs.egGO2ALLEGS)[1:100],"GO:0022008","GO:0048699", "GO:0000280", "GO:0007067"))
go.env <- lapply(mget(gos.interest, org.Hs.egGO2ALLEGS), function(x) as.character(na.omit(rids[x])))
go.env <- clean.gos(go.env) # remove GOs with too few or too many genes
go.env <- list2env(go.env) # convert to an environment
# test without seed
pwpca1 <- pagoda.pathway.wPCA(varinfo, go.env, n.components = 1, n.cores = 35)
pwpca2 <- pagoda.pathway.wPCA(varinfo, go.env, n.components = 1, n.cores = 35)
ae_noseed <- all.equal(pwpca1, pwpca2)
if (isTRUE(ae_noseed)) {
  print("No seed: pwpca1 pwpca2 are equal")
} else {
  print("No seed: pwpca1 pwpca2 are not equal")
}
# test with seed
set.seed(0)
pwpca3 <- pagoda.pathway.wPCA(varinfo, go.env, n.components = 1, n.cores = 35)
set.seed(0)
pwpca4 <- pagoda.pathway.wPCA(varinfo, go.env, n.components = 1, n.cores = 35)
ae_seed <- all.equal(pwpca3, pwpca4)
if (isTRUE(ae_seed)) {
  print("With seed: pwpca3 pwpca4 are equal")
} else {
  print("With seed: pwpca3 pwpca4 are not equal")
}
Output:
> source("test.R")
[1] "No seed: pwpca1 pwpca2 are not equal"
[1] "With seed: pwpca3 pwpca4 are not equal"
Example of the differences for a single gene set (GO:0048699):
> summary(pwpca1$"GO:0048699"$z)
V1
Min. :6.684
1st Qu.:6.769
Median :6.945
Mean :6.979
3rd Qu.:7.174
Max. :7.390
> summary(pwpca2$"GO:0048699"$z)
V1
Min. :6.804
1st Qu.:6.992
Median :7.123
Mean :7.134
3rd Qu.:7.218
Max. :7.499
> summary(pwpca3$"GO:0048699"$z)
V1
Min. :6.864
1st Qu.:6.954
Median :7.038
Mean :7.137
3rd Qu.:7.258
Max. :7.764
> summary(pwpca4$"GO:0048699"$z)
V1
Min. :6.797
1st Qu.:6.965
Median :7.056
Mean :7.077
3rd Qu.:7.113
Max. :7.516
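One possible explanation, which nothing above confirms, is that the random draws happen in forked worker processes (n.cores = 35) whose RNG state is not controlled by set.seed() in the parent session. The sketch below illustrates that general mechanism with base R's parallel package only. It does not call scde, and draw() and run_parallel() are hypothetical stand-ins, not part of pagoda.pathway.wPCA(). It assumes a Unix-like system where mclapply() can fork, and it assumes nothing about how scde parallelizes internally.

```r
library(parallel)

# Hypothetical stand-in for a randomized step executed on a worker.
draw <- function(i) rnorm(1)

# Seed the parent session, then run the randomized step on forked workers.
run_parallel <- function(seed, cores = 2) {
  set.seed(seed)
  unlist(mclapply(1:4, draw, mc.cores = cores))
}

# Switch to the L'Ecuyer-CMRG generator so that each forked worker gets a
# reproducible RNG stream derived from the parent's seed.
# Note: this changes the RNG kind for the rest of the session.
RNGkind("L'Ecuyer-CMRG")

a <- run_parallel(0)
b <- run_parallel(0)
print(identical(a, b))
```

Whether the same RNGkind() call, or simply rerunning the test with n.cores = 1, also makes pagoda.pathway.wPCA() reproducible is an open question here and would need to be checked against the package itself.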
sessionInfo()