avantgardnerio opened this issue 2 years ago (status: Open)
Edit: cleaner API with .extend(other: RecordBatch): https://gist.github.com/avantgardnerio/48d977ea6bd28c790cfb6df09250336d
Like the general idea, just a couple of comments/questions:
Could this be achieved with a Vec<RecordBatch>?
One potentially simpler way to implement something similar would be to add a non-consuming finish method to the builders. This would entail copying the buffers, but in my experience implementing the write path for IOx, this copy is insignificant in the grand scheme of query execution - even ignoring heavy hitters like sorts and groups, all kernels involve copying values to an output array. This is especially true if, after a certain number of rows, you rotate the builders and keep only the immutable RecordBatch, thereby bounding the copy. What do you think?
Edit: I created https://github.com/apache/arrow-rs/issues/3154 for this
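For illustration only, a minimal sketch (not from the issue) of what "rotating the builders after a certain number of rows" could look like using the existing consuming finish; the RotatingColumn struct and the single "col" Int64 column are made up for this example:

use std::sync::Arc;

use arrow::array::{ArrayBuilder, ArrayRef, Int64Builder};
use arrow::error::Result;
use arrow::record_batch::RecordBatch;

// Sketch: one Int64 column that rotates its builder every `max_rows`
// appends, keeping finished data as immutable RecordBatches so the
// copy made when sealing stays bounded by `max_rows`.
struct RotatingColumn {
    builder: Int64Builder,
    sealed: Vec<RecordBatch>,
    max_rows: usize,
}

impl RotatingColumn {
    fn append(&mut self, v: i64) -> Result<()> {
        self.builder.append_value(v);
        if self.builder.len() >= self.max_rows {
            // finish() hands back the accumulated values as an immutable
            // array and resets the builder for further appends
            let array: ArrayRef = Arc::new(self.builder.finish());
            self.sealed.push(RecordBatch::try_from_iter(vec![("col", array)])?);
        }
        Ok(())
    }
}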
Fortunately, append-only and immutable are not necessarily at odds, as long as one takes immutable point-in-time RecordBatch snapshots of the append-only version.
I think the general pattern of (keeping an append-only representation and periodically snapshotting it into an immutable RecordBatch) makes sense with arrow. In fact, this is what we do in IOx on the write path.
This pattern can be done with the various *Builders -- such as Int64Builder: https://docs.rs/arrow/27.0.0/arrow/array/type.Int64Builder.html
So something like:
let mut builder = Int64Builder::new();
builder.append_value(1);
// get an array to use in datafusion, etc
let array = builder.finish();
// make a new builder and start building the second batch
builder = Int64Builder::new();
builder.append_value(2);
builder.append_value(3);
I think what @tustvold is suggesting with a "non-consuming finish method to the builders" means you could do something like:
let mut builder = Int64Builder::new();
builder.append_value(1);
// get an array to use in datafusion, etc (by copying the underlying values)
// (`snapshot()` here stands for the proposed non-consuming finish; it does not exist yet)
let array = builder.snapshot();
// existing builder can be used to append
builder.append_value(2);
builder.append_value(3);
I don't fully follow the proposal in https://gist.github.com/avantgardnerio/48d977ea6bd28c790cfb6df09250336d
As soon as you have this function:
pub fn as_slice(&self) -> RecordBatch {
    self.record_batch.slice(0, self.len())
}
This means that multiple locations may now be able to read that data, so trying to update it as well will likely be an exercise in frustration: you end up fighting the Rust compiler to convince it that the memory remains valid and that concurrent reads and modification are safe.
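To make the aliasing problem concrete, here is a toy analogy using std::sync::Arc rather than arrow internals (arrow buffers are reference counted in a similar spirit): once a snapshot holds a second reference, unique in-place mutation is no longer possible without a copy.

use std::sync::Arc;

// Toy analogy, not arrow code: a second reference blocks in-place mutation.
let mut data: Arc<Vec<i64>> = Arc::new(vec![1, 2, 3]);
let snapshot = Arc::clone(&data);           // like handing out a slice/snapshot
assert!(Arc::get_mut(&mut data).is_none()); // shared: in-place mutation denied
drop(snapshot);
assert!(Arc::get_mut(&mut data).is_some()); // unique again: mutation possible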
Maybe we could start with the "make a snapshot-based copy" approach and, if the copying turns out to be too expensive, figure out a different approach.
Others might still find it useful, but I've convinced myself that with how we are planning to handle MVCC we'll need to make copies anyway, so the builder approach is probably fine (given a non-consuming .finish()).
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
At Space and Time, we would like to make an OLTP columnar DB based on Arrow (see noisepage for precedent). Unfortunately, Arrow RecordBatches are immutable by design.
Describe the solution you'd like
Fortunately, append-only and immutable are not necessarily at odds, as long as one takes immutable point-in-time RecordBatch snapshots of the append-only version. See this gist for a working PoC implementation: https://gist.github.com/avantgardnerio/48d977ea6bd28c790cfb6df09250336d
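As a rough sketch of the idea only (not the gist's actual code; the AppendableBatch name and the copy-based extend are placeholders):

use arrow::compute::concat_batches;
use arrow::error::Result;
use arrow::record_batch::RecordBatch;

// Hypothetical append-only wrapper: callers append via `extend` and take
// cheap immutable snapshots at any point in time.
struct AppendableBatch {
    batch: RecordBatch,
}

impl AppendableBatch {
    fn new(batch: RecordBatch) -> Self {
        Self { batch }
    }

    // Append the rows of `other`; this sketch simply copies into a new
    // immutable RecordBatch, which is exactly the reallocation a real
    // implementation with pre-allocated mutable buffers would avoid.
    fn extend(&mut self, other: &RecordBatch) -> Result<()> {
        let schema = self.batch.schema();
        self.batch = concat_batches(&schema, &[self.batch.clone(), other.clone()])?;
        Ok(())
    }

    // Point-in-time snapshot: RecordBatch columns are Arc'd, so cloning
    // hands readers an immutable view without copying the data.
    fn snapshot(&self) -> RecordBatch {
        self.batch.clone()
    }
}

A PoC along the lines of the gist would presumably pre-allocate mutable buffers and hand out zero-copy slices (as in the as_slice snippet above) instead of re-concatenating on every extend.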
Describe alternatives you've considered
Making this work with MutableBuffer, vs preallocating buffers from the immutable package.