This was reported by a user in a community Slack thread.
Describe the unexpected behaviour
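A query that aggregates with COUNT(CASE WHEN ... THEN 1 ELSE NULL END) runs successfully on the RMT table but fails with THERE_IS_NO_COLUMN on the Distributed table when the analyzer is enabled.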
How to reproduce
We are using version v24.8.4.13-lts.
The following query runs successfully on the RMT table but fails on the Distributed table:
SELECT
    COUNT(CASE WHEN platform = 'web' THEN 1 ELSE NULL END) AS base_web_cnt
FROM events_dist AS base
PREWHERE base.id = XXX AND (base.eventTimeMs BETWEEN '2024-01-01 00:00:00' AND '2024-01-02 00:00:00')
LIMIT 1
It fails with:
Cannot find column `countIf(_CAST(1_UInt8, 'Nullable(UInt8)'_String), equals(__table1.platform, 'web'_String))` in source stream, there are only columns: [countIf(_CAST(1_Nullable(UInt8), 'Nullable(UInt8)'_String), equals(__table1.platform, 'web'_String))]. (THERE_IS_NO_COLUMN) (version 24.8.4.13 (official build))
The column comparison showed a difference between the column name from the header and the column name from the stream:
ActionsDAG ActionsDAG::makeConvertingActions(
    const ColumnsWithTypeAndName & source, // countIf(_CAST(1_Nullable(UInt8), 'Nullable(UInt8)'_String), equals(__table1.platform, 'web'_String))
    const ColumnsWithTypeAndName & result, // countIf(_CAST(1_UInt8, 'Nullable(UInt8)'_String), equals(__table1.platform, 'web'_String))
    MatchColumnsMode mode,
    bool ignore_constant_values,
    bool add_casted_columns,
    NameToNameMap * new_names)
{
    size_t num_input_columns = source.size();
    size_t num_result_columns = result.size();
    if (mode == MatchColumnsMode::Position && num_input_columns != num_result_columns)
        throw Exception(ErrorCodes::NUMBER_OF_COLUMNS_DOESNT_MATCH, "Number of columns doesn't match (source: {} and result: {})", num_input_columns, num_result_columns);
    if (add_casted_columns && mode != MatchColumnsMode::Name)
        throw Exception(ErrorCodes::LOGICAL_ERROR, "Converting with add_casted_columns supported only for MatchColumnsMode::Name");
    ActionsDAG actions_dag(source);
    NodeRawConstPtrs projection(num_result_columns);
    FunctionOverloadResolverPtr func_builder_materialize = std::make_unique<FunctionToOverloadResolverAdaptor>(std::make_shared<FunctionMaterialize>());
    std::unordered_map<std::string_view, std::list<size_t>> inputs;
    if (mode == MatchColumnsMode::Name)
    {
        size_t input_nodes_size = actions_dag.inputs.size();
        for (size_t pos = 0; pos < input_nodes_size; ++pos)
            inputs[actions_dag.inputs[pos]->result_name].push_back(pos);
    }
    for (size_t result_col_num = 0; result_col_num < num_result_columns; ++result_col_num)
    {
        const auto & res_elem = result[result_col_num];
        const Node * src_node = nullptr;
        const Node * dst_node = nullptr;
        switch (mode)
        {
            case MatchColumnsMode::Position:
            {
                src_node = dst_node = actions_dag.inputs[result_col_num];
                break;
            }
            case MatchColumnsMode::Name:
            {
                auto & input = inputs[res_elem.name];
                if (input.empty())
                {
                    const auto * res_const = typeid_cast<const ColumnConst *>(res_elem.column.get());
                    if (ignore_constant_values && res_const)
                        src_node = dst_node = &actions_dag.addColumn(res_elem);
                    else
                        throw Exception(ErrorCodes::THERE_IS_NO_COLUMN,
                            // res_elem.name is: countIf(_CAST(1_UInt8, 'Nullable(UInt8)'_String), equals(__table1.platform, 'web'_String))
                            // Block(source) is: countIf(_CAST(1_Nullable(UInt8), 'Nullable(UInt8)'_String), equals(__table1.platform, 'web'_String))
                            "Cannot find column `{}` in source stream, there are only columns: [{}]",
                            res_elem.name, Block(source).dumpNames());
                }
makeConvertingActions is called from addConvertingActions, which matches columns by name unless has_missing_objects is set:
void addConvertingActions(QueryPlan & plan, const Block & header, bool has_missing_objects)
{
    if (blocksHaveEqualStructure(plan.getCurrentDataStream().header, header))
        return;
    auto mode = has_missing_objects ? ActionsDAG::MatchColumnsMode::Position : ActionsDAG::MatchColumnsMode::Name;
    auto get_converting_dag = [mode](const Block & block_, const Block & header_)
    {
        /// Convert header structure to expected.
        /// Also we ignore constants from result and replace it with constants from header.
        /// It is needed for functions like `now64()` or `randConstant()` because their values may be different.
        /**
         *
         MatchColumnsMode mode,
         */
        return ActionsDAG::makeConvertingActions(
            block_.getColumnsWithTypeAndName(), // const ColumnsWithTypeAndName & source, // countIf(_CAST(1_Nullable(UInt8), 'Nullable(UInt8)'_String), equals(__table1.platform, 'web'_String))
            header_.getColumnsWithTypeAndName(), // const ColumnsWithTypeAndName & result, // countIf(_CAST(1_UInt8, 'Nullable(UInt8)'_String), equals(__table1.platform, 'web'_String))
            mode,
            true);
    };
    auto convert_actions_dag = get_converting_dag(plan.getCurrentDataStream().header, header);
    auto converting = std::make_unique<ExpressionStep>(plan.getCurrentDataStream(), std::move(convert_actions_dag));
    plan.addStep(std::move(converting));
}
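The expression CASE WHEN platform = 'web' THEN 1 ELSE NULL END should trigger a conversion of the constant 1 to a Nullable type, Nullable(UInt8). In the stream this conversion was applied, so the constant inside the column name is typed Nullable(UInt8). In the header the conversion was not applied, so the constant there is a plain UInt8. Because the constant's type is part of the generated column name, the two names differ and the name-based lookup fails.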
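The lookup failure can be illustrated with a minimal, self-contained sketch. This is not ClickHouse code: findByName is a hypothetical helper that only mimics the name-to-position map built in MatchColumnsMode::Name, and the two strings are copied from the error message above. It assumes nothing beyond the C++17 standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for name-based matching: returns the position of
// `wanted` among `source_names`, or std::nullopt if no column has that name.
std::optional<std::size_t> findByName(const std::vector<std::string> & source_names, const std::string & wanted)
{
    // Same shape as the `inputs` map in the excerpt: name -> positions.
    std::unordered_map<std::string, std::list<std::size_t>> inputs;
    for (std::size_t pos = 0; pos < source_names.size(); ++pos)
        inputs[source_names[pos]].push_back(pos);

    auto it = inputs.find(wanted);
    if (it == inputs.end() || it->second.empty())
        return std::nullopt; // the real code throws THERE_IS_NO_COLUMN here
    return it->second.front();
}

// Column names copied from the error message in this report.
const std::string stream_name
    = "countIf(_CAST(1_Nullable(UInt8), 'Nullable(UInt8)'_String), equals(__table1.platform, 'web'_String))";
const std::string header_name
    = "countIf(_CAST(1_UInt8, 'Nullable(UInt8)'_String), equals(__table1.platform, 'web'_String))";
```

Calling findByName({stream_name}, header_name) finds nothing, because the names differ only in the type of the constant 1, while findByName({stream_name}, stream_name) resolves to position 0.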
The failure is not specific to PREWHERE; it also happens with WHERE. Can we prioritize this? For now we have to disable the analyzer to avoid the error. Thanks!
Link: https://github.com/ClickHouse/ClickHouse/blob/v24.8.4.13-lts/src/Storages/MergeTree/MergeTreeSplitPrewhereIntoReadSteps.cpp#L279
Expected behavior
The query should run successfully on the Distributed table.