acorg / Racmacs

Racmacs R package for performing antigenic cartography
https://acorg.github.io/Racmacs/
GNU Affero General Public License v3.0

allMapStresses is slow with large map objects #120

Open drserajames opened 2 years ago

drserajames commented 2 years ago

I have a map with about 2000 antigens, 50 sera and 1000 optimisations. On it, allMapStresses(map) is much slower than sapply(map$optimisations, "[[", "stress"): about 26 seconds versus effectively zero in the timings below. I think this is an example where the use of C++ slows calculations down. Is there any downside to using the quicker method?

The code is below. I tested it on a big single map and on a merge, and there was no big difference. The ace file (for the second map; I forgot to save the first): merge_50_2000.ace.zip

> library(Racmacs)
> packageVersion("Racmacs")
[1] ‘1.1.35’
> set.seed(850909)
> dat <- acmap(matrix(10*2^round(10*runif(50*2000)), ncol=50, nrow=2000), 
+              ag_names=paste0("A", 1:2000), 
+              sr_names=paste0("S", 1:50))
> map <- optimizeMap(dat, number_of_dimensions = 2, number_of_optimizations = 1000)
Performing 1000 optimizations
============================================================================================================
Optimization runs complete
Took 6.97 mins

Warning messages:
1: In doTryCatch(return(expr), name, parentenv, handler) :
  restarting interrupted promise evaluation
2: In optimizeMap(dat, number_of_dimensions = 2, number_of_optimizations = 1000) :
  There is some variation (12.15 AU for one point) in the top runs, this may be an indication that more optimization runs could help achieve a better optimum. If this still fails to help see ?unstableMaps for further possible causes.
> system.time(sapply(map$optimisations,"[[", "stress"))
   user  system elapsed 
      0       0       0 
> system.time(allMapStresses(map))
   user  system elapsed 
 26.013   0.435  26.445 
> dats <- NULL
> for (i in 1:100){
+ dats[[i]] <- acmap(matrix(10*2^round(10*runif(50*200)), ncol=10, nrow=20), 
+                    ag_names=sample(paste0("A", 1:2000),20), 
+                    sr_names=sample(paste0("S", 1:50),10))
+ }
> merge_dats <- mergeMaps(dats, merge_options=list(sd_limit = 4))
> map <- optimizeMap(dat, number_of_dimensions = 2, number_of_optimizations = 1000)
Performing 1000 optimizations
============================================================================================================
Optimization runs complete
Took 6.98 mins

Warning message:
In optimizeMap(dat, number_of_dimensions = 2, number_of_optimizations = 1000) :
  There is some variation (11.52 AU for one point) in the top runs, this may be an indication that more optimization runs could help achieve a better optimum. If this still fails to help see ?unstableMaps for further possible causes.
> system.time(allMapStresses(map))
   user  system elapsed 
 26.331   0.449  26.785 
> system.time(sapply(map$optimisations,"[[", "stress"))
   user  system elapsed 
  0.000   0.000   0.001 
> set.seed(850909)
> dat <- acmap(matrix(10*2^round(10*runif(5*20)), ncol=5, nrow=20), 
+              ag_names=paste0("A", 1:20), 
+              sr_names=paste0("S", 1:5))
> map <- optimizeMap(dat, number_of_dimensions = 2, number_of_optimizations = 1000)
Performing 1000 optimizations
============================================================================================================
Optimization runs complete
Took 0.56 secs

> system.time(allMapStresses(map))
   user  system elapsed 
  0.353   0.002   0.354 
> system.time(sapply(map$optimisations,"[[", "stress"))
   user  system elapsed 
  0.001   0.001   0.002
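
One possible downside of the quicker method can be checked without Racmacs at all. The sketch below uses a hypothetical stand-in list (mock_map) that only mimics the shape used above; it is not the real Racmacs object, and it says nothing about whether the cached stress always matches what allMapStresses() computes. It shows that sapply() over a missing or misspelled field returns an empty list silently and instantly, so it may be worth checking the length of the result before trusting a near-zero timing.

```r
# Hypothetical stand-in for a map: a plain list whose `optimisations`
# element holds one list per run, each with a cached `stress` value.
# This only mimics the shape used in the report, not Racmacs internals.
mock_map <- list(
  optimisations = lapply(1:1000, function(i) list(stress = 100 + i))
)

# Quick method from the report: pull the cached value out of each run.
quick <- sapply(mock_map$optimisations, "[[", "stress")

# Stricter variant: vapply() errors unless every run yields one numeric.
strict <- vapply(mock_map$optimisations,
                 function(o) o[["stress"]],
                 numeric(1))

# Pitfall: a field name that does not exist evaluates to NULL, and
# sapply(NULL, ...) returns an empty list without any error or warning.
empty <- sapply(mock_map$optimizations, "[[", "stress")

# Cheap guards before using the result.
stopifnot(length(quick) == length(mock_map$optimisations))
stopifnot(identical(quick, strict))
print(length(empty))
```

On a real map, the same length check, plus all.equal() against allMapStresses(map) on a small example, would be a reasonable way to confirm that the two methods agree before switching.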