Open shenyu0127 opened 1 year ago
Transform functions work on columnar data, and we should always have a scalar function counterpart for every transform function so that we can do row-based transforms. Which transform function did you find that doesn't have a scalar function counterpart?
We didn't find any transform function missing a scalar function counterpart. We simply did not assume that every transform function has one, so we treated this as a feature gap.
If we can guarantee that every transform function has a scalar function counterpart, then:
- Why don't we make the transform functions depend on the scalar functions? The performance overhead should be minimal. The benefit is that we ensure the two functions won't diverge.
Think of TransformFunction as a specialized implementation for better performance. There is a ScalarTransformFunctionWrapper which leverages the scalar functions, so that if there is no specialized implementation for a scalar function, we fall back to using the scalar function. The performance overhead is not trivial because it always involves boxing the value, casting the type, and invoking the method on a per-record basis.
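To make the per-record cost concrete, here is a rough sketch of what a generic wrapper has to do for each value. It assumes the scalar function is invoked reflectively; `ReflectiveWrapperSketch`, `plusOne`, `lookup`, and `applyPerRecord` are hypothetical names for illustration, not the actual Pinot classes.

```java
import java.lang.reflect.Method;

public final class ReflectiveWrapperSketch {
    // Hypothetical scalar function: one value in, one value out.
    public static long plusOne(long v) {
        return v + 1;
    }

    // Hypothetical lookup of a scalar function taking a single long argument.
    public static Method lookup(String name) {
        try {
            return ReflectiveWrapperSketch.class.getMethod(name, long.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Generic fallback: applies the scalar function to every record in the block.
    public static long[] applyPerRecord(Method scalar, long[] block) {
        long[] out = new long[block.length];
        try {
            for (int i = 0; i < block.length; i++) {
                Object boxed = block[i];                    // boxing: long -> Long
                Object result = scalar.invoke(null, boxed); // reflective invocation
                out[i] = ((Number) result).longValue();     // cast and unbox
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }
}
```

Each loop iteration pays for the three steps mentioned above, which is what a specialized transform function avoids.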
I am not proposing to use the combination of ScalarTransformFunctionWrapper and scalar functions to replace transform functions. I am proposing to make the transform functions call their corresponding scalar functions, e.g. make DateTruncTransformFunction call DateTimeFunctions::datetrunc.
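A minimal sketch of that proposal, with hypothetical names (`DelegatingTransformSketch`, `truncToHour`, `truncToHourBlock`) standing in for the real Pinot classes: the block-level function keeps its primitive loop but calls the scalar function directly, so there is a single definition of the logic and no boxing or reflection.

```java
public final class DelegatingTransformSketch {
    private static final long MILLIS_PER_HOUR = 3_600_000L;

    // Hypothetical scalar function: truncates one epoch-millis value to the hour.
    public static long truncToHour(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_HOUR) * MILLIS_PER_HOUR;
    }

    // Hypothetical transform function: processes a whole block (column) of values
    // by delegating each element to the scalar function via a direct static call.
    public static long[] truncToHourBlock(long[] epochMillis) {
        long[] out = new long[epochMillis.length];
        for (int i = 0; i < epochMillis.length; i++) {
            out[i] = truncToHour(epochMillis[i]);
        }
        return out;
    }
}
```

Whether the overhead is really minimal in this form would depend on the JIT inlining the direct call, so it would need to be benchmarked.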
@npawar and I found this feature gap.
The Broker preprocesses the query by invoking functions if all the function arguments are literals, and then passes the preprocessed query to the Server for query execution.
When we preprocess the query, we only support scalar functions. We should also support transform functions.
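For readers unfamiliar with this step, the preprocessing can be pictured roughly as below. This is only an illustrative sketch under assumptions: the expression classes, the `SCALARS` registry, and the `add` function are hypothetical and do not reflect the Broker's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public final class LiteralFoldingSketch {
    public interface Expr {}

    public static final class Literal implements Expr {
        public final Object value;
        public Literal(Object value) { this.value = value; }
    }

    public static final class Column implements Expr {
        public final String name;
        public Column(String name) { this.name = name; }
    }

    public static final class Call implements Expr {
        public final String function;
        public final List<Expr> args;
        public Call(String function, List<Expr> args) {
            this.function = function;
            this.args = args;
        }
    }

    // Hypothetical registry of scalar functions known to the preprocessing step.
    public static final Map<String, Function<List<Object>, Object>> SCALARS = Map.of(
        "add", a -> ((Number) a.get(0)).longValue() + ((Number) a.get(1)).longValue());

    // Replaces a call with a literal when every argument is a literal and the
    // function is registered; otherwise keeps the call, with folded arguments.
    public static Expr fold(Expr e) {
        if (!(e instanceof Call)) {
            return e;
        }
        Call call = (Call) e;
        List<Expr> folded = new ArrayList<>();
        List<Object> values = new ArrayList<>();
        for (Expr arg : call.args) {
            Expr f = fold(arg);
            folded.add(f);
            if (f instanceof Literal) {
                values.add(((Literal) f).value);
            }
        }
        Function<List<Object>, Object> fn = SCALARS.get(call.function);
        if (fn != null && values.size() == folded.size()) {
            return new Literal(fn.apply(values));
        }
        return new Call(call.function, folded);
    }
}
```

In this picture, a function that exists only as a transform function has no entry in the registry, so a call to it is left unfolded even when all of its arguments are literals.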