heavyai / heavydb

HeavyDB (formerly OmniSciDB)
https://heavy.ai
Apache License 2.0

Guidance on writing UDF window functions #514

Open rhyswhitley opened 4 years ago

rhyswhitley commented 4 years ago

Hello, I was wondering if there is any intention to provide more advanced examples of using UDFs to provide some of the missing aggregation functions that we typically get in the other flavours of SQL?

In particular, it would be nice to have access to a greater range of statistical functions, e.g. quantiles, weighted averaging, etc. While I can implement a UDF to do a simple sum across columns (x + y), I am unsure what I would need to write to perform an aggregation. For example, to get the average, would it be something like this?

EXTENSION_NOINLINE
double my_average(const Array<double> arr) {
    // Mean of the elements of one array value (row-wise, not an aggregate).
    const int64_t size = arr.getSize();
    double sum = 0.0;
    for (int64_t i = 0; i < size; i++) {
        sum += arr(i);
    }
    // Divide by the element count itself; an empty array yields NaN.
    return sum / static_cast<double>(size);
}

I suppose this is more of a feature/support request than an issue.
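For the two statistics mentioned above (weighted averaging and quantiles), the row-wise arithmetic might look like the sketch below. It is written against plain pointers so that it compiles without the HeavyDB headers; the function names `weighted_mean` and `quantile` are hypothetical, and wrapping them as UDFs (taking `Array<double>` arguments and marked `EXTENSION_NOINLINE`) is assumed rather than shown. The quantile version uses `std::vector` and `std::sort`, so treat it as a host-only illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: weighted mean of n values.
// Returns NaN when the input is empty or the weights sum to zero.
double weighted_mean(const double* values, const double* weights, int64_t n) {
  double weighted_sum = 0.0;
  double weight_total = 0.0;
  for (int64_t i = 0; i < n; ++i) {
    weighted_sum += values[i] * weights[i];
    weight_total += weights[i];
  }
  if (n <= 0 || weight_total == 0.0) {
    return std::numeric_limits<double>::quiet_NaN();
  }
  return weighted_sum / weight_total;
}

// Hypothetical helper: quantile q in [0, 1], linearly interpolated between
// the two nearest order statistics. Copies and sorts the input.
double quantile(const double* values, int64_t n, double q) {
  if (n <= 0 || q < 0.0 || q > 1.0) {
    return std::numeric_limits<double>::quiet_NaN();
  }
  std::vector<double> sorted(values, values + n);
  std::sort(sorted.begin(), sorted.end());
  const double pos = q * static_cast<double>(n - 1);
  const int64_t lo = static_cast<int64_t>(std::floor(pos));
  const int64_t hi = static_cast<int64_t>(std::ceil(pos));
  const double frac = pos - static_cast<double>(lo);
  return sorted[lo] + frac * (sorted[hi] - sorted[lo]);
}
```

Both helpers operate on the elements of a single array value, so like `my_average` they are per-row functions rather than true aggregates over many rows.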

alexbaden commented 4 years ago

Hi @rhyswhitley,

We do not yet support user-defined aggregate functions. However, with our recently landed result set reduction JIT, this is mostly an exercise in plumbing. It is on our roadmap, but I don’t have a firm ETA at the moment.

pavitrakumar78 commented 4 years ago

@alexbaden Hi! Have user-defined aggregate functions been implemented yet? I'm specifically interested in averaging functions, e.g. simple and exponential averages.