Open philrz opened 1 week ago
For comparison, here's a SQL approach to get the same result. I confess I had ChatGPT come up with this, so maybe there's a better way.
$ duckdb --version
v1.1.2 f680b7d08f
$ seq 6 | duckdb -c "
CREATE TEMP TABLE NumberedRows AS
SELECT
column0 AS value,
ROW_NUMBER() OVER () AS rn
FROM
read_csv('/dev/stdin');
SELECT
(n1.value + n2.value) AS sum_value
FROM
NumberedRows n1
JOIN
NumberedRows n2 ON n1.rn = n2.rn - 1
WHERE
n1.rn % 2 = 1;"
┌───────────┐
│ sum_value │
│ int64 │
├───────────┤
│ 3 │
│ 7 │
│ 11 │
└───────────┘
tl;dr
A proposed new
batch
operator:Details
Repro is with
super
commit 0af34e0.A community user recently asked about this in the same Slack thread that spawned this comment. In their original question they were attempting to operate on just two back-to-back input values, but then widened their inquiry to the more general case of needing to handle an indefinite number of back-to-back input vales of a similar nature.
Let's use the example of 6 incrementing numbers.
Using what's in the language currently, much like the response to the user's original question, it seems this would require some initial aggregation of the input pairs into a complex value and then picking apart the complex values, potentially using
over
and a lateral subquery. For instance if we wanted to add each pairing of numbers together:However, @mccanne pointed out that for a use case like this a purpose-built operator like
batch
proposed above would provide a more direct and intuitive way to get the values into the lateral subquery.