Open ZhangHuiGui opened 1 week ago
cc @pitrou @felipecrv
So, the concern is that the validity bitmap may be NULL if the executor determines that the output is all-valid, right?
So, the concern is that the validity bitmap may be NULL if the executor determines that the output is all-valid, right?
Yes.
Even if we do not pre-allocate in INTERSECTION
mode, the validity-buffer will also be pre-allocated in ScalarExecutor::ExecuteNonSpans
or VectorExecutor::Exec
through PropagateNulls
.
ExecuteNonSpans
:
https://github.com/apache/arrow/blob/237987ad6ab8ac95f15d6d9aa825e7ccfa2a1d99/cpp/src/arrow/compute/exec.cc#L914-L915
PropagateNulls
: will do the same null check with NullGeneration
and do allocate validity-buffer again if not allocated.
https://github.com/apache/arrow/blob/237987ad6ab8ac95f15d6d9aa825e7ccfa2a1d99/cpp/src/arrow/compute/exec.cc#L546-L553
Therefore, the pre-allocation logic of ScalarExecutor should be kept consistent with that of VectorExecutor, and the logic of pre-allocating validity-buffer should be retained in INTERSECTION
mode.
So, the concern is that the validity bitmap may be NULL if the executor determines that the output is all-valid, right?
Yes.
But having a NULL validity bitmap when the output is determined to be all-valid is a feature, not a bug. There's nothing that needs to be changed here.
Therefore, the pre-allocation logic of ScalarExecutor should be kept consistent with that of VectorExecutor
I don't think consistency is a goal here, since there are important differences between what is possible in a scalar and vector kernel.
PropagateNulls
: will do the same null check withNullGeneration
and do allocate validity-buffer again if not allocated.
Well... if PropagateNulls
allocates a validity bitmap even when we know there will be 0 nulls, it's the bug that needs to be fixed :-)
But having a NULL validity bitmap when the output is determined to be all-valid is a feature, not a bug. There's nothing that needs to be changed here.
Yes, wes mentioned this:
// Used to account for the case where we do not preallocate a
// validity bitmap because the inputs are all non-null and we're
// using NullHandling::INTERSECTION to compute the validity bitmap
bool elide_validity_bitmap_ = false;
ScalarExecutor support this feature by check NullGeneralization::Get
in SetupPreallocation
, but VectorExecutor
not support this for now and do preallocate for NULL
validity bitmap in PropagateNulls
, i think we should support/fix this for VectorExecutor
.
Well... if PropagateNulls allocates a validity bitmap even when we know there will be 0 nulls, it's the bug that needs to be fixed :-)
I raised the concern as I worried about the consequence to existing compute kernels. I still think a more sane approach is to make pre-allocation always on with INTERSECTION
and ask kernels that can handle every case correctly to simply adopt NO_PREALLOCATE
.
If we want to ensure that pre-allocation for validity-bitmap is easy to use for kernels and prevent latent bugs, then for NullHandling
, only the COMPUTED_PREALLOCATE
and COMPUTED_NO_PREALLOCATE
strategies are retained, and the remaining two strategies seem unnecessary?
For INTERSECTION
, I understand that kernels developers are not clear whether the input data is ALL_VALID
or PERHAPS_NULL
, so the validity-bitmap allocation is left to the Executor itself -:)
No. That's not a fair inference on what I said in the PR.
I think INTERSECTION
handling should check the types of the input arrays and reason about the intersection with Type::NA
and other types that don't have validity bitmaps like unions.
Generic kernel dispatching should as much as possible be type dependent and not value dependent to (1) make keep dispatching decidable (non-Turing Complete) and (2) more amenable to static analysis as a consequence of (1).
Another argument (more practical) in favor of making pre-allocation fully determined by the types: during kernel dispatching we're looking at the lazily computed null_count
which might be kUnknownNullCount
non-deterministically. Calling GetNullCount()
on all the inputs would be wasteful because it's probably better to let the kernel use the pre-allocated validity bitmap to compute the intersection together with its own loop.
To summarize my opinion: it's OK for a kernel that takes Type::NA
to get an empty validity bitmap in the pre-allocated out
when null_handling
is INTERSECTION
, but it's not OK for kernels that take nullable types to non-deterministically get empty validity buffers.
@pitrou might feel like this is a risk worth taking and we should demand that kernels handle non-determinism correctly.
Describe the bug, including details regarding any error messages, version, and platform.
For now, the validity-buffer for output-data may not be preallocated with
NullHandling::INTERSECTION
according the input-data'sNullGeneration
. https://github.com/apache/arrow/blob/fecd207e5dd553d226046f9d7bc5cd1a05529781/cpp/src/arrow/compute/exec.cc#L943-L951We've talked about this in https://github.com/apache/arrow/pull/41975#discussion_r1638523247.
If some future implementations of kernel function believe that
INTERSECTION
mode must pre-allocate a validity-buffer, and rely on this to build their own logic, then the currentScalarExecutor
pre-allocation strategy for validity-buffer inINTERSECTION
mode may cause potentialSIGSEGV
problems.Component(s)
C++