Open shenjianrong opened 6 days ago
I think that the way you want the udf_topN
function to work may be impractical in any PL, as you propose a function that is simply called in the select-list of a query, but ends up somehow influencing the ORDER BY
and LIMIT
of the whole query that contains it. Also, because this function is applied to a non-grouped column in a query with a GROUP BY
, it will need to be an aggregate function (I assume you intend the semantics of sum
).
It is easy enough to write an aggregate function in PL/Java, the difficulty still remains in having a function that is simply called within a query somehow affect the ORDER BY
and LIMIT
of the containing query. Even if such a function could somehow be written, such magical behavior would tend to astonish anyone trying to understand a query that uses it.
If you wanted instead to write an orders_topN
function that did the whole query, that would be simple. so simple PL/Java would hardly be necessary, and plain SQL would suffice:
CREATE OR REPLACE FUNCTION orders_topN(n integer)
RETURNS TABLE(product text, amount integer)
STABLE
BEGIN ATOMIC
SELECT product, total_amount
FROM (
SELECT product, sum(amount) AS total_amount
FROM orders
GROUP BY product
) AS q
ORDER BY total_amount*sign(n)
LIMIT abs(n);
END
$ select (orders_topN(1)).*;
product | amount
---------+--------
tokamak | 2
$ select (orders_topN(-1)).*;
product | amount
---------+--------
widget | 31
(The reason for the nested SELECT
is so that the outer ORDER BY
can use total_amount
in an expression (like total_amount*sign(n)
). When ORDER BY
just gives a name, that's allowed to be a column name in the result the query will create, but in ORDER BY
any kind of expression, the names in the expression must be things already known.
If what you want is a function that takes n and the name of a column, that will aggregate just the named column, grouping by all the others, and applying an order and limit, that would be trickier. The caller would have to supply the name of the column (when you just use the column in the function call as your example does, only a value is passed, and the function doesn't know where it came from). The function would then have to generate appropriate SQL and execute it.
If you wanted a function like that but polymorphic, so you could give it any table to operate on and have it return rows of that table's type (but grouped and with the named one aggregated), there might be a way to accomplish that, using PostgreSQL's support for polymorphic functions. It probably wouldn't be pretty, and might need a dummy argument of the table's row type. I'm not sure it would be worth the effort, except for the pure challenge of it.
The table orders has some fields as following: orderkey, product, amount, order_month, ...... I want to implement a UDF to use like: select product, udf_topN(amount, -5) from orders group by product this query means selecting the top 5 records by amount descending sort in each product group. Can I implement this function using PL/Java and how to do? Please tell me where I can find such documents or information.