Open gatesn opened 1 week ago
Generally I wouldn't expect a selection kernel to alter the schema, so I think in this case it should raise an error
Yes that's also reasonable.
It's a bit annoying that Arrow DataTypes don't themselves have a nullable flag, since the selection kernels over non-nested arrays can also introduce nulls to previously non-null arrays.
take
It's a bit annoying that Arrow DataTypes don't themselves have a nullable flag
One way to get this is to use StructArray in place of RecordBatch. This is actually what a lot of the IO logic in arrow-rs does, converting to RecordBatch only at the edges.
IMO RecordBatch is confusing and arrow would be better off without it, but it's too late for that now 😅
can i fix
Describe the bug
When calling arrow::compute::take on a StructArray with non-nullable fields and passing take indices that contain null values, the resulting StructArray still has non-nullable fields. This is an invalid state.
Expected behavior
The take function should convert all fields to nullable iff the take indices contain any nulls.
https://docs.rs/arrow-select/53.2.0/src/arrow_select/take.rs.html#238-239
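To make the two options discussed in this thread concrete, here is a minimal std-only sketch. It does not use the arrow crate and is not how arrow-rs implements take: `Field`, `ToyStruct`, `take_widening`, and `take_strict` are hypothetical names invented for illustration, and the model simplifies a null take index to "every child value in that row becomes null". `take_widening` follows the expected behavior in the report (fields become nullable iff the indices contain a null), while `take_strict` follows the suggestion to raise an error rather than alter the schema.

```rust
/// Toy stand-in for an Arrow field: a name plus a nullability flag.
/// (Hypothetical type, not the arrow-rs `Field`.)
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

/// Toy stand-in for a struct array: one i32 child column per field.
/// `None` models a null slot.
#[derive(Debug, Clone, PartialEq)]
struct ToyStruct {
    fields: Vec<Field>,
    columns: Vec<Vec<Option<i32>>>,
}

/// Gather rows by index. A `None` index produces a null in every child column.
/// Panics if an index is out of bounds (kept simple for the sketch).
fn gather(columns: &[Vec<Option<i32>>], indices: &[Option<usize>]) -> Vec<Vec<Option<i32>>> {
    columns
        .iter()
        .map(|col| indices.iter().map(|i| i.and_then(|i| col[i])).collect())
        .collect()
}

/// Option A (the report's expected behavior): widen every field to nullable
/// if and only if the take indices contain at least one null.
fn take_widening(input: &ToyStruct, indices: &[Option<usize>]) -> ToyStruct {
    let has_null_index = indices.iter().any(|i| i.is_none());
    let fields = input
        .fields
        .iter()
        .map(|f| Field {
            name: f.name.clone(),
            nullable: f.nullable || has_null_index,
        })
        .collect();
    ToyStruct {
        fields,
        columns: gather(&input.columns, indices),
    }
}

/// Option B (suggested in the comments): never alter the schema; instead
/// reject null indices when any field is declared non-nullable.
fn take_strict(input: &ToyStruct, indices: &[Option<usize>]) -> Result<ToyStruct, String> {
    let has_null_index = indices.iter().any(|i| i.is_none());
    if has_null_index {
        if let Some(f) = input.fields.iter().find(|f| !f.nullable) {
            return Err(format!(
                "null take index would introduce nulls into non-nullable field '{}'",
                f.name
            ));
        }
    }
    Ok(ToyStruct {
        fields: input.fields.clone(),
        columns: gather(&input.columns, indices),
    })
}
```

With a single non-nullable field and indices `[Some(1), None]`, `take_widening` returns a field marked nullable and `take_strict` returns an error. Either way, the output never pairs a non-nullable field with a null value, which is the invalid state described above.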