Seddryck / Tseesecake

Tseesecake is a lightweight proxy for time-series query engines, supporting multiple database engines and storage providers, with a SQL dialect dedicated to time series
https://seddryck.github.io/Tseesecake
Apache License 2.0
2 stars 1 forks source link

Implicit window aggregation for cumulated sum #68

Open Seddryck opened 11 months ago

Seddryck commented 11 months ago

https://github.com/Seddryck/Tseesecake/issues/70 is a prerequisite.

This set of aggregation is a shortcut preventing the definition of Window. These aggregations automatically define a window where "PARTITION BY" is equal to "GROUP BY" and "ORDER BY" is equal to "BUCKET BY/Timestamp ASC".

If the SQL

SELECT
    Instant,
    Producer,
    SUM(Forecasted)
FROM
    WindEnergy
BUCKET BY 
    Year
GROUP BY
    Producer

is returning the following table

Instant Producer Forecasted
2022-01-01 Green Power Inc. 2737.176288
2023-01-01 Green Power Inc. 17782.837022
2022-01-01 Future Energy 1245.309402
2023-01-01 Future Energy 6739.813180

The following SQL

SELECT
    Instant,
    Producer,
    CUMULATED_SUM(Forecasted)
FROM
    WindEnergy
BUCKET BY 
    Year
GROUP BY
    Producer
Instant Producer Forecasted
2022-01-01 Green Power Inc. 2737.176288
2023-01-01 Green Power Inc. 20520.013310
2022-01-01 Future Energy 1245.309402
2023-01-01 Future Energy 7985.122582

The query translated to the dialect supported by DuckDB should be

SELECT
    DateTrunc('year', Instant) as Instant,
    Producer,
    SUM(SUM(Forecasted)) OVER (PARTITION BY Producer ORDER BY DateTrunc('year', Instant) ASC) as Forecasted  
FROM
    WindEnergy
GROUP BY 
    DateTrunc('year', Instant), producer