Open chenbaggio opened 2 days ago
Would you mind provide the crash stack? Seems that TransposeInts
itself should not handle validity logic, perhaps the caller of TransposeInts
wrongly handles validity bitmap
the core stack has been removed, and here fix the issue using constant index value 0 instead of random value,
the core stack is in TransposeInts
, the code snippet:
···
dest[1] = static_cast
··· here the src[1] should be null, but here not check, the src[1] is random value, a massive number, so crash
here question is :
should check the null bitmap in here or handle it in application, make sure the index would not overflow ?
another fix version is implement the version which check the bitmap:
template <typename InputInt, typename OutputInt>
void TransposeIntsWithNullBitmap(const InputInt* src, OutputInt* dest, int64_t length,
const int32_t* transpose_map, const uint8_t * null_bitsmap, int64_t null_offset) {
if (null_bitsmap != nullptr) {
int64_t idx = 0;
while (length > 0) {
if (bit_util::GetBit(null_bitsmap, (null_offset + idx))) {
*dest = static_cast<OutputInt>(transpose_map[*src]);
}
--length, ++src, ++idx, ++dest;
}
} else {
TransposeInts(src, dest, length, transpose_map);
}
}
template <typename SrcType>
struct TransposeIntsDestWithNullBitmap {
const SrcType* src;
uint8_t* dest;
int64_t dest_offset;
int64_t length;
const int32_t* transpose_map;
const uint8_t* null_bitmap;
int64_t null_offset;
template <typename T>
enable_if_integer<T, Status> Visit(const T&) {
using DestType = typename T::c_type;
TransposeIntsWithNullBitmap(src, reinterpret_cast<DestType*>(dest) + dest_offset, length,
transpose_map, null_bitmap, null_offset);
return Status::OK();
}
Status Visit(const DataType& type) {
return Status::TypeError("TransposeInts received non-integer dest_type");
}
Status operator()(const DataType& type) { return VisitTypeInline(type, this); }
};
struct TransposeIntsSrcWithNullBitmap {
const uint8_t* src;
uint8_t* dest;
int64_t src_offset;
int64_t dest_offset;
int64_t length;
const int32_t* transpose_map;
const DataType& dest_type;
const uint8_t *null_bitmap;
template <typename T>
enable_if_integer<T, Status> Visit(const T&) {
using SrcType = typename T::c_type;
return TransposeIntsDestWithNullBitmap<SrcType>{reinterpret_cast<const SrcType*>(src) + src_offset,
dest, dest_offset, length,
transpose_map, null_bitmap, src_offset}(dest_type);
}
Status Visit(const DataType& type) {
return Status::TypeError("TransposeInts received non-integer dest_type");
}
Status operator()(const DataType& type) { return VisitTypeInline(type, this); }
};
Status TransposeIntsWithNullBitmap(const DataType& src_type, const DataType& dest_type,
const uint8_t* src, uint8_t* dest, int64_t src_offset,
int64_t dest_offset, int64_t length, const int32_t* transpose_map,
const uint8_t *null_bitmap) {
TransposeIntsSrcWithNullBitmap transposer{src, dest, src_offset, dest_offset,
length, transpose_map, dest_type, null_bitmap};
return transposer(src_type);
}
I mean, the internal::TransposeInts
might not designed for null-inputs, so, the I think TransposeIntsWithNullBitmap is a good way, but perhaps we need to find which caller triggers this problem.
There're TransposeDictIndices
and concatenate
which calls TransposeInts
, so i mean which caller might trigger this problem?
TransposeDictIndices
is the trigger, you can consturct a dictionary array with null value, and its index set to a massive value, the issue may reproduced
I've check the code of CompactTransposeMapVisitor
and RecursiveUnifier::Unify
, in tuition they didn't generate invalid ranges... I don't know which caller would make this happens 🤔, would you mind show the upper caller of this case, or add a test in this case?
here the test case:
read one group of data from database storage and construct a dictionary array dictionary ['ab', 'cd'] index [1, 1, 0, 0] => here index is int32 data type, in memory[0x00000001, 0x00000001, 0x00000000, 0x00000000] nullbitmap nullptr
read another group of data from database storage and construct a dictionary array dictionary ['ab', 'ef'] index [1, 1, 0, null] => here index is int32 data type, in memory[0x00000001, 0x00000001, 0x00000000, random_value] nullbitmap [0001]
NOTE: the group of data would be unified the dictionary with the ahead dictionary ["ab", "cd"]
here according to the code snippet:
in function `TransposeInts`
while (length >= 4) {
dest[0] = static_cast<OutputInt>(transpose_map[src[0]]);
dest[1] = static_cast<OutputInt>(transpose_map[src[1]]);
dest[2] = static_cast<OutputInt>(transpose_map[src[2]]);
dest[3] = static_cast<OutputInt>(transpose_map[src[3]]);
length -= 4;
src += 4;
dest += 4;
NOTE : src[3] is random value may be a massive number, if so, index overflow, then crash
@chenbaggio I may understand this. DictionaryArray::Transpose
seems to be regarded as a internal function, the input of it is expected to be handled by DictionaryUnifier
, which handles null
well...
I think the problem is const int32_t* transpose_map
given "non-null" values, which is unexpected. DictionaryUnifier
might fixes this?
Would this ( https://github.com/apache/arrow/pull/44867 ) helps?
Describe the enhancement requested
Due to some unify dictionary applications not handling the index value corresponding to null, a very large value may randomly occur. As a result, in the final
TransposeInts
function, src[index] may become a very large value, even exceeding the range of transpose_map, leading to a crash.”If this is a description of a bug, it suggests that the merge dictionary application should properly handle null indices to avoid generating invalid or out-of-bounds index values, which can cause crashes in later stages of the program
Component(s)
C++