Open yibin87 opened 1 year ago
One candidate for insertFrom is Join prob case:
static bool addFound(const typename Map::SegmentType::HashTable::ConstLookupResult & it, size_t num_columns_to_add, MutableColumns & added_columns, size_t i, IColumn::Filter * filter, IColumn::Offset & /*current_offset*/, IColumn::Offsets * /*offsets*/, const std::vector<size_t> & right_indexes, ProbeProcessInfo & /*probe_process_info*/)
{
(*filter)[i] = 1;
for (size_t j = 0; j < num_columns_to_add; ++j)
added_columns[j]->insertFrom(*it->getMapped().block->getByPosition(right_indexes[j]).column.get(), it->getMapped().row_num);
return false;
}
Probing one row, while invokes n columns' virtual insertFrom function.
Enhancement
Currently, the scatter/scatterTo default implementation is not locality friendly:
Current implementation sacrifies the dst column access locality to achieve sequential element access for src column. For the worst case, the dst column would change for every insertion. When dst columns' size becomes larger, the problem might get worse. It can be optimized like this:
Currently, IColumn provide insertFrom and insertRangeFrom interface, first for scalar insertion and second for range insertion.
For non-continuous mutilple insertions, for example, insert [0,2,4,6,8,10,.....,2n]th element, currently, we have to write like this:
However, insertFrom is a virtual function, that means one virtual call for one element insertion which is not efficient. It can be improved by providing a insertMultipleFrom(const IColumn & src, std::vector & index_vec).