randy3k / collections

High-performance container datatypes for R
https://randy3k.github.io/collections
Other
103 stars 3 forks source link

[feature request] List class #4

Open dselivanov opened 5 years ago

dselivanov commented 5 years ago

I think it may make sense to create a wrapper around R's list with clear append semantics without copy of the object. Recently R lists (and actually arrays) seems improved a lot in a sense that they don't make copies with each append. But this is not widely known.

N = 1e6
dynamic = function(N) {
  lst = vector('list', 0)
  for(i in 1:N)  lst[[i]] = TRUE
}
preallocated = function(N){
  lst_preallocated = vector('list', N)
  for(i in 1:N)  lst_preallocated[[i]] = TRUE
}

microbenchmark::microbenchmark( dynamic(N), preallocated(N), times = 10)
Unit: milliseconds
            expr       min        lq      mean   median        uq       max neval
      dynamic(N) 164.18238 168.08486 178.70834 172.7364 177.34840 232.65446    10
 preallocated(N)  42.41742  42.73463  44.25383  44.2013  44.35719  47.54679    10

As can be seen it is still better to pre-allocate result, but dynamic resizing is not that bad now (few re-allocations).

randy3k commented 5 years ago

In PriorityQueue, the memory is first pre-allocated and the memory is reallocated and doubled if we hit the memory cap. While it's a bit memory inefficient, but we trade memory for speed. We may also implement this expanding memory idea to List.

https://github.com/randy3k/collections/blob/master/src/priorityqueue.c#L18