Closed mflatt closed 6 months ago
One potential issue here is that string-copy!
and bytevector-copy!
accept a start position and count, instead of a start position and end position. Those functions also take separate source and destination starting positions, though, and a length makes more sense in that context. This vector-copy
accepts arguments more in line with substring
, but taking a count like string-copy!
would also make sense.
By avoiding write barriers and redundant initialization, this vector-copy
makes the following example run about twice as fast on my machine, or about 3 times as fast in unsafe mode:
(let ([vec '#(apple banana coconut)])
(let loop ([i 1000000] [r #f])
(if (= i 0)
r
(loop (fx- i 1) (vector-copy vec)))))
There are several problems with the current implementation. I'll try again.
The newest commit implements vector-copy
in a better way, and it adds vector-append
.
Overall, the intent is to provide better primitives for implementing a persistent data structure like an RRB tree. Those kinds of data structures need to frequently copy and update a vector, sometimes adding or removing slots. The vector-append
operation here is specialized for up to 3 vectors with those kinds of operations in mind. Stencil vectors serve a similar role, but plain vectors can be easier to use when mask bits are not needed.
After trying this out more, I think it's worth adding vector-update
. The performance benefit of vector-update
compared to vector-copy
plus vector-set!
is very small, but it rounds out the available functional operations for the things that I have tried.
One potential issue here is that
string-copy!
andbytevector-copy!
accept a start position and count, instead of a start position and end position. Those functions also take separate source and destination starting positions, though, and a length makes more sense in that context. Thisvector-copy
accepts arguments more in line withsubstring
, but taking a count likestring-copy!
would also make sense.
I'd lean towards making it like string-copy!
due to the similarity in the names. (That also avoids any confusion about whether "end" is inclusive or exclusive.)
I can change vector-copy
to take a length as the third argument.
Any opinion on the name vector-update
? As brought up in racket/racket#4879, it's not quite "update" as used for a function like hashtable-update!
, where it means a function converts the current value at a given key to another value. Alternate suggestions so far are vector-copy-set
and vector-set/copy
, and let's add vector-set-copy
also on the grounds that the function is more like vector-set!
than vector-copy
.
(I picked vector-update
based on stencil-vector-update
, but the connection is weak. I don't think stencil-vector-update
could change in Racket, since its been in several releases, but we could pick different name for it in Chez Scheme for stencil-vector-update
the terminology collision seems bad.)
no, not really. Part of me is tempted to say it's just vector-set
without the "bang", but I'm certain that would cause confusion with new users or make for hard to debug typos. Something like vector-set/copy
or vector-copy-and-set
is a bit unwieldy but does convey what's going on. The /copy
name seems more scheme-like to me, but that might just be that I've spent too long in languages with impoverished identifier syntax that don't allow /
in names.
Thanks! The latest version uses the name vector-set/copy
.
Latest version has specializations for (vector-copy (vector x) y)
and (vector-copy x (vector y))
in unsafe mode, because adding one element to the front or end of a vector is a common operation. There's no end to special cases that could be recognized, but this one gives a 15% performance boost on microbenchmaks for our RRB tree implementation.
Make
vector-copy
likesubstring
when start and end positions are supplied. Internally, allocate the vector without initializing memory, since the content is immediately replaced by copying, and avoid using write barriers, since the vector copy is freshly allocated.