renkun-ken / rlist

A Toolbox for Non-Tabular Data Manipulation
Other
204 stars 28 forks source link

new append benchmark for reference #98

Open timelyportfolio opened 9 years ago

timelyportfolio commented 9 years ago

I'm sure my ignorance will show here, but I thought this was a useful reference benchmark for list.append based on this StackOverflow discussion. Feel free to disregard.

library(rlist)

AddItemEnvir <- function(item)
{

  .GlobalEnv$Counter <- .GlobalEnv$Counter + 1

  .GlobalEnv$Result[[as.character(.GlobalEnv$Counter)]] <- item
}

AddItemDoubling <- function(item)
{
  if( .GlobalEnv$Counter == .GlobalEnv$Size )
  {
    length(.GlobalEnv$Result) <- .GlobalEnv$Size <- .GlobalEnv$Size * 2
  }

  Counter <<- Counter + 1

  .GlobalEnv$Result[[.GlobalEnv$Counter]] <- item
}

# now benchmark
library(microbenchmark)
n = c(1e2,1e3,1e4,1e5)#,1e4,1e5)
benches = lapply(
  n,
  function(ntest){
    microbenchmark(
      env_method = {
        Counter <<- 0
        Result <<- new.env()
        for(i in seq_len(ntest)) {
          AddItemEnvir(i) 
        }
      },
      item_double_method = {
        Counter <<- 0
        Result <<- list(NULL)
        Size <<- 1
        for(i in seq_len(ntest)) {
          AddItemDoubling(i)
        }
      },
      rlist_method1 = {
        Result <<- list()
        for(i in seq_len(ntest)) {
          Result <<- list.append(Result,i)
        }
      },
      rlist_method2 = {
        Result <<- list()
        Result <<- Reduce( function(x,y){ list.append(x, y) }, seq_len(ntest), list(), right=T )
      }
      , times = 1L
    )    
  }
)

with result on my machine

[[1]]
Unit: microseconds
               expr     min      lq    mean  median      uq     max neval
         env_method 955.226 955.226 955.226 955.226 955.226 955.226     1
 item_double_method 448.155 448.155 448.155 448.155 448.155 448.155     1
      rlist_method1 198.039 198.039 198.039 198.039 198.039 198.039     1
      rlist_method2 404.441 404.441 404.441 404.441 404.441 404.441     1

[[2]]
Unit: milliseconds
               expr      min       lq     mean   median       uq      max neval
         env_method 9.912228 9.912228 9.912228 9.912228 9.912228 9.912228     1
 item_double_method 4.188474 4.188474 4.188474 4.188474 4.188474 4.188474     1
      rlist_method1 7.414880 7.414880 7.414880 7.414880 7.414880 7.414880     1
      rlist_method2 8.275458 8.275458 8.275458 8.275458 8.275458 8.275458     1

[[3]]
Unit: milliseconds
               expr       min        lq      mean    median        uq       max neval
         env_method 123.18576 123.18576 123.18576 123.18576 123.18576 123.18576     1
 item_double_method  57.09918  57.09918  57.09918  57.09918  57.09918  57.09918     1
      rlist_method1 795.22280 795.22280 795.22280 795.22280 795.22280 795.22280     1
      rlist_method2 706.21650 706.21650 706.21650 706.21650 706.21650 706.21650     1

[[4]]
Unit: milliseconds
               expr         min          lq        mean      median          uq         max
         env_method   1963.2419   1963.2419   1963.2419   1963.2419   1963.2419   1963.2419
 item_double_method    505.2499    505.2499    505.2499    505.2499    505.2499    505.2499
      rlist_method1 151466.2442 151466.2442 151466.2442 151466.2442 151466.2442 151466.2442
      rlist_method2 149780.5124 149780.5124 149780.5124 149780.5124 149780.5124 149780.5124
 neval
     1
     1
     1
     1
timelyportfolio commented 9 years ago

Actually, just tested these with about the same results as the env_method.

benches = lapply(
  n,
  function(ntest){
    microbenchmark(
      rlist_method3 = {
        Result <<- List()
        for(i in seq_len(ntest)) {
          Result <<- Result$append(i)
        }
      },      
      rlist_method4 = {
        Result <<- List()
        for(i in seq_len(ntest)) {
          .GlobalEnv$Result[[as.character(i)]] <- i
        }
      }
      , times = 1L
    )    
  }
)

These results don't match up with my experience. Might need to use line profiler.


[[1]]
Unit: milliseconds
          expr      min       lq     mean   median       uq      max neval
 rlist_method3 8.592473 8.592473 8.592473 8.592473 8.592473 8.592473     1
 rlist_method4 9.610418 9.610418 9.610418 9.610418 9.610418 9.610418     1

[[2]]
Unit: milliseconds
          expr       min        lq      mean    median        uq       max neval
 rlist_method3  90.75483  90.75483  90.75483  90.75483  90.75483  90.75483     1
 rlist_method4 127.60269 127.60269 127.60269 127.60269 127.60269 127.60269     1

[[3]]
Unit: seconds
          expr      min       lq     mean   median       uq      max neval
 rlist_method3 1.569715 1.569715 1.569715 1.569715 1.569715 1.569715     1
 rlist_method4 4.020317 4.020317 4.020317 4.020317 4.020317 4.020317     1

[[4]]
Unit: seconds
          expr      min       lq     mean   median       uq      max neval
 rlist_method3 202.8104 202.8104 202.8104 202.8104 202.8104 202.8104     1
 rlist_method4 858.1140 858.1140 858.1140 858.1140 858.1140 858.1140     1
renkun-ken commented 9 years ago

Thanks for reporting this! I'll take a look and see if I can improve the existing code.