Open FallingSnow opened 1 year ago
We can in some capacity. The problem is that we need to copy part of a TypedArray
to part of a TypedArray
, since it's a circular buffer.
set
takes two parameters, the input array, and an offset to write the data at in the destination array. In particular, it doesn't take an input offset, or an input size, or that kind of thing.
The way to do this in JavaScript is to take a DataView
on the TypedArray
. Unfortunately, this isn't acceptable for this data structure, because it means that objects will be created, and this causes garbage collections. The objects will be extremely short-lived, and modern JS engines have generational GCs, but it's important to have exactly no object created.
This is not enough to express all what's needed for this ring buffer:
We can make it so set(...)
is used when possible.
When running the test suite (which is admittedly a bit contrived, with workloads that aren't that realistic), this fast-path can be taken in about 28% of the _copy(...)
calls.
We can also add two new methods that always use set
, and create DataView
, and that can be used on the side of the ring buffer that doesn't need to be real-time, this might be useful.
If anybody can try this PR on a "real" workload and can report results, I'd be happy to have a look at the data.
Ah that makes sense. While .subarray()
is copy free it is not GC free, and I forgot about the GC requirements of this library.
I believe one way around this is to not make wrapping pushes that would require a 2 separate pushes onto the ringbuffer. Instead an artificial end pointer would be set and we would write to the beginning of the ringbuffer (given there is enough room). Then the artificial end pointer could be cleared after the read pointer passes it. This would create some "wasted" space at the end of the ringbuffer but given the 4x - 15x copy time improvement it may be worth if for some fantasy work load I can't think of right now lol.
However this is most likely out of the scope of this library and I'm not sure if the complexity is worth it, but is interesting to think about.
Thank you for your "fast path" additions.
Set is much faster, even on small arrays.
Code in question: https://github.com/padenot/ringbuf.js/blob/016d963826a63aa715ae0e2d6ccd34f44c093c11/js/ringbuf.js#L346-L350
Benchmark:
input
is 10,000 F32, andsubInput
is 100 F32.