Open nemosmithasf opened 2 years ago
I did some benchmarking and it seems changing the default array size significantly deteriorates the performance.
That's really interesting. I couldn't figure out why from looking at the source. For my own edification, could you explain why this is? Or perhaps point to some part of the code that I can look at in closer detail to figure it out?
Can you share how / what you benchmarked to come to that conclusion? I am also currently looking at possibly improving the performance of this library in certain situations. Especially the actual growth operation is currently pretty inefficient and I have seen quite significant improvements in situations where it needs to grow a lot
I just copied the code into my codebase and changed `4` to a value closer to what I anticipated my average queue size to be. I then ran a benchmark comparing `4` and the new value as the default, and found that the new value was slower.
That said, my benchmark wasn't especially scientific, so don't treat it as authoritative; I'd definitely do independent testing if I were you.
Only changing the `new Array(4)` will not work, since the wrap-around is not calculated by `index % this._list.length` but by `index & this._capacityMask`.
This means that the capacity must always be a power of 2 for the wrapping to work, and `_capacityMask` must also be set to the matching value. The `_capacityMask` is always `capacity - 1`, which creates a binary mask that has a `1` at all bit positions below the capacity (for example `cap = 16 = 0b10000` and `mask = 15 = 0b01111`).
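As a quick standalone illustration (my own sketch, not code from the library), the mask trick only agrees with `%` when the array length is a power of 2:

```javascript
// The bitwise wrap-around `index & (capacity - 1)` matches
// `index % capacity` only when capacity is a power of 2.
const capacity = 16;        // 0b10000
const mask = capacity - 1;  // 0b01111

for (let index = 0; index < 100; index++) {
  console.assert((index & mask) === index % capacity);
}

// With a non-power-of-two length the mask has "holes" and the trick breaks:
const badMask = 10 - 1;     // 9 = 0b1001
console.log(12 & badMask);  // 8, but 12 % 10 is 2
```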
For testing you could try setting the capacity to the next larger power of 2, e.g. `new Array(1024)`, and the mask to one less: `this._capacityMask = 1023`.
EDIT: In this text, when I say "capacity" I am referring to the array size (`this._list.length` / `new Array(N)`) and not the `this._capacity` variable. `this._capacity` is something different and not directly related to this conversation.
I did a small test implementation of this with a matching benchmark, and it seems to work. Tested at 1_000_000, 1_000, and 100 element insertions, it is consistently faster when you know the capacity beforehand.
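For reference, here is a minimal sketch of the kind of preallocation described above. The names `nextPowerOfTwo` and `PreallocatedDeque` are my own illustration, not denque's actual API:

```javascript
// Hypothetical sketch: preallocate the backing array at the next power
// of 2 >= the requested capacity so `index & mask` wrapping stays valid.
function nextPowerOfTwo(n) {
  let p = 4;              // denque's default array size
  while (p < n) p *= 2;
  return p;
}

class PreallocatedDeque {
  constructor(initialCapacity = 4) {
    const cap = nextPowerOfTwo(initialCapacity);
    this._list = new Array(cap);    // allocated once, up front
    this._capacityMask = cap - 1;   // e.g. cap 1024 -> mask 1023
    this._head = 0;
    this._tail = 0;
  }
}

const q = new PreallocatedDeque(1000);
console.log(q._list.length, q._capacityMask);  // 1024 1023
```

Allocating once at construction is where the speedup comes from: every doubling step the queue would otherwise go through (4, 8, 16, ...) is skipped.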
Platform info:

```
Windows_NT 10.0.19044 x64
Node.JS 18.5.0
V8 10.2.154.4-node.8
AMD Ryzen 7 2700X Eight-Core Processor × 16
```

```
Denque (1000000) default capacity x 87.24 ops/sec ±1.67% (73 runs sampled)
Denque (1000000) preallocated capacity x 138 ops/sec ±0.58% (77 runs sampled)
Denque (1000) default capacity x 144,139 ops/sec ±2.03% (87 runs sampled)
Denque (1000) preallocated capacity x 225,468 ops/sec ±0.48% (92 runs sampled)
Denque (100) default capacity x 700,648 ops/sec ±0.36% (95 runs sampled)
Denque (100) preallocated capacity x 1,785,077 ops/sec ±0.61% (95 runs sampled)
DoubleEndedQueue (1000000) default capacity x 32.08 ops/sec ±2.08% (56 runs sampled)
DoubleEndedQueue (1000000) preallocated capacity x 30.67 ops/sec ±2.00% (55 runs sampled)
```
Here is the code. To run the benchmark, do `node benchmark/initialCapacity.js`.
One question about the possible implementation is how the initial capacity should be specified. In the example I made the first constructor arg be either an array or a number.
> Only changing the `new Array(4)` will not work since the wrap-around is not calculated by `index % this._list.length`, but by `index & this._capacityMask`. This means that the capacity must always be a power of 2 for the wrapping to work and the `_capacityMask` must also be set to the matching value. The `_capacityMask` is always `capacity - 1`, which creates a binary mask that has `1` at all bit positions up to the capacity (for example `cap = 16 = 0b10000` and `mask = 15 = 0b01111`).
Thanks for this explanation, though I'd be lying if I said I understood it all. Still, it seems like a fascinating technique, definitely something I'll look into.
> One question about the possible implementation is how the initial capacity should be specified. In the example I made the first constructor arg be either an array or a number.
Wouldn't it make sense to just add a property to the `options` object at arg_1?
> Wouldn't it make sense to just add a property to the `options` object at arg_1?
Yeah, that would probably be the other option. The problem I could see with that is that you need to specify both args in order to use the `options` object. That means you would have to give an empty array as arg1, which makes the interface kinda clunky. Also, the `options` argument is currently not even documented; not sure what is up with that.
> you need to specify both args in order to use the `options` object.
You can solve it by overloading, since arg_0 can never be an object.
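Sketching that overload (hypothetical argument parsing, not the library's actual signature): a number, an array, and a trailing options object are all distinguishable at runtime, so the capacity could come from either position.

```javascript
// Hypothetical overload resolution for a constructor whose first arg
// may be an array of items or an initial capacity, with options last.
function parseArgs(first, options) {
  let items = [];
  let capacity;
  if (typeof first === 'number') {
    capacity = first;                 // e.g. new Denque(1024)
  } else if (Array.isArray(first)) {
    items = first;                    // e.g. new Denque([1, 2], opts)
  }
  if (options && typeof options.capacity === 'number') {
    capacity = options.capacity;      // options form also supported
  }
  return { items, capacity };
}

console.log(parseArgs(1024));                      // { items: [], capacity: 1024 }
console.log(parseArgs([1, 2], { capacity: 64 }));  // { items: [ 1, 2 ], capacity: 64 }
```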
Please correct me if I'm wrong, but wouldn't it be more performant if we could specify an initial array capacity rather than relying on the default size of `4`? Especially if we are using it as a circular buffer... I assume this would save us a few "grow array" operations?