Open bjchambers opened 11 months ago
Arrow recently added `Scalar`, which simplifies many of the APIs handling scalar values. We could potentially use this to replace our `ScalarValue` enum.
See https://github.com/apache/arrow-rs/pull/4793/files for some notes:
- `Int64Array::new_scalar(...)` to simplify creating a scalar of a specific type.
- `Scalar<ArrayRef>` for using an array as a scalar.

This should let us create something like:
struct ScalarValue(Scalar<ArrayRef>);
// 1. Serde the underlying ArrayRef
// 2. Support use as a scalar anywhere
@jordanrfrazier FYI -- I think we should be able to completely replace our `ScalarValue` enum with `Scalar<ArrayRef>`. I think it's probably worth doing that for how we serialize scalars in the physical plan.
Thoughts on:
A) Trying to change the `ScalarValue` to use that representation (potentially messy since we have that protobuf in many places).
B) Introducing a parallel `Scalar???` to represent that?
C) Just using `arrow_array::Scalar<arrow_array::ArrayRef>` in those places?
I think with `C` we'd need to do the custom serialization like we did for structs with an `ArrayRef`, while with `A` or `B` we could encapsulate that in our serialization of the wrapper type we create.
_Originally posted by @jordanrfrazier in https://github.com/kaskada-ai/kaskada/pull/784#discussion_r1342918969_