Open snth opened 3 months ago
I wanted to try to bisect this, but I can't reproduce a successful compile of your PRQL query going back to version 0.9.0.
SQL output from 0.9.0:
WITH RECURSIVE table_1 AS (
SELECT
1 AS ts,
1 AS value
UNION
ALL
SELECT
2 AS ts,
12 AS value
UNION
ALL
SELECT
3 AS ts,
13 AS value
UNION
ALL
SELECT
4 AS ts,
5 AS value
UNION
ALL
SELECT
5 AS ts,
1 AS value
),
data AS (
SELECT
ts,
value
FROM
table_1
),
table_3 AS (
SELECT
ts AS t,
value AS x
FROM
data
),
table_0 AS (
SELECT
_expr_0,
_expr_1,
_expr_1 AS _expr_2
FROM
table_3
WHERE
_expr_0 = 1
UNION
ALL
SELECT
table_4.t AS _expr_3,
table_4.x AS _expr_4,
(2 / (3 + 1)) * table_4.x + (1 - (2 / (3 + 1))) * table_0._expr_2
FROM
table_0
JOIN (
SELECT
ts AS t,
value AS x
FROM
data
) AS table_4 ON table_4.t = table_4.t + 1
)
SELECT
t AS ts,
x AS value,
_expr_2 AS value_ewma3
FROM
table_0
-- Generated by PRQL compiler version:0.9.0 (https://prql-lang.org)
I didn't go back further than that, as the syntax changes introduced in 0.9.0 would require rewriting your query.
Can you confirm a version of PRQL that does successfully compile?
Thanks @kgutwin and sorry about the broken repro. I tried to create a self-contained repro from the stock price ewma example from this Google Colab notebook I use in presentations: https://colab.research.google.com/drive/146ohCI_WlnOKpQQ6sPMWu4uKGqZvwaIZ?usp=sharing
Looks like the column aliasing in select {t=ts, x=value}
breaks it even in 0.9.5. It works if you remove that part, in 0.9.5 and 0.12.2 🤦♂️. So it looks like that's not the regression bug part but something else is because my actual example did stop working at some point which is why I've got it pinned to 0.9.5. Let me see if I can isolate that.
Here's the working example in the meantime (same calc_ewma
function as above but different data):
let data = [
{t=1, x=1},
{t=2, x=12},
{t=3, x=13},
{t=4, x=5},
{t=5, x=1},
]
from data
calc_ewma 3
The following also works (in both versions):
let data = [
{t=1, x=1},
{t=2, x=12},
{t=3, x=13},
{t=4, x=5},
{t=5, x=1},
]
from data
calc_ewma 3
derive x_ewma3=ewma
select !{ewma}
but this doesn't (in both):
let data = [
{ts=1, y=1},
{ts=2, y=12},
{ts=3, y=13},
{ts=4, y=5},
{ts=5, y=1},
]
from data
derive {t=ts, x=y}
calc_ewma 3
derive x_ewma3=ewma
select !{t, x, ewma}
Btw, this last version is really the example we want (using derive {t=...} ... select !{...}
) because that does the necessary aliasing to use the function and then leaves the original relation with the original columns. (Actually, that function drops other columns in the source relation so perhaps the result should be joined onto the original but that's a separate issue.)
Still trying to track down the original regression bug.
Sorry, looks like it's actually one of the other examples that breaks
let list_subaccounts = root_key accounts -> (
from parent=accounts
filter parent_account_key==root_key
select {parent.account_key, parent.account_name, account_path=parent.account_name, level=0}
loop (
join child=accounts (child.parent_account_key==parent.account_key)
select {child.account_key, child.account_name, account_path=f"{account_path} / {child.account_name}", level=level+1}
)
select {level, account_name, account_path}
)
from balance_sheet
list_subaccounts 0 # Balance Sheet
# list_subaccounts 1272 # Assets
In 0.9.5 this produces:
WITH RECURSIVE table_0 AS (
SELECT
account_key,
account_name,
account_name AS account_path,
0 AS level
FROM
balance_sheet AS parent
WHERE
parent_account_key = 0
UNION
ALL
SELECT
child.account_key AS _expr_0,
child.account_name AS _expr_1,
CONCAT(table_0.account_path, ' / ', child.account_name),
table_0.level + 1
FROM
table_0
JOIN balance_sheet AS child ON child.parent_account_key = table_0.account_key
)
SELECT
level,
account_name,
account_path
FROM
table_0
-- Generated by PRQL compiler version:0.9.5 (https://prql-lang.org/)
From 0.10 onwards it produces the following error:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/IPython/core/interactiveshell.py", line 3553, in run_code
async_ : Bool (Experimental)
File ["<ipython-input-2-acfd2b139886>"](https://localhost:8080/#), line 2, in <cell line: 2>
print(prql.compile("""
File "<string>", line unknown
SyntaxError: Error:
╭─[:8:13]
│
8 │ select {child.account_key, child.account_name, account_path=f"{account_path} / {child.account_name}", level=level+1}
│ ────────┬────────
│ ╰────────── Unknown name `child.account_key`
───╯
It took me a long time to track this down because when you actually include some sample data (as you would for a repro) then it works and the error doesn't show up, e.g.
let balance_sheet = from_text format:csv """
account_key,parent_account_key,account_name
1221,0000,Balance Sheet
1272,1221,Assets
1300,1272,Current Assets
1301,1300,Cash & Cash Equivalent
1302,1301,Bank Accounts
1307,1300,Inventories
"""
from balance_sheet
list_subaccounts 0 # Balance Sheet
works.
Interestingly, without the function definition it compiles on 0.12.2:
from parent=accounts
filter parent_account_key==root_key
select {parent.account_key, parent.account_name, account_path=parent.account_name, level=0}
loop (
join child=accounts (child.parent_account_key==parent.account_key)
select {child.account_key, child.account_name, account_path=f"{account_path} / {child.account_name}", level=level+1}
)
select {level, account_name, account_path}
produces
WITH RECURSIVE table_0 AS (
SELECT
account_key,
account_name,
account_name AS account_path,
0 AS level
FROM
accounts AS parent
WHERE
parent_account_key = root_key
UNION
ALL
SELECT
child.account_key AS _expr_0,
child.account_name AS _expr_1,
CONCAT(table_0.account_path, ' / ', child.account_name),
table_0.level + 1
FROM
table_0
JOIN accounts AS child ON child.parent_account_key = table_0.account_key
)
SELECT
level,
account_name,
account_path
FROM
table_0 AS table_1
-- Generated by PRQL compiler version:0.12.2 (https://prql-lang.org)
I'd argue that the column aliases in the recursive part of the query are wrong but DuckDB doesn't seem to mind and it produces the correct result set if you run the whole example:
let accounts = from_text format:csv """
account_key,parent_account_key,account_name
1221,0000,Balance Sheet
1272,1221,Assets
1300,1272,Current Assets
1301,1300,Cash & Cash Equivalent
1302,1301,Bank Accounts
1307,1300,Inventories
"""
let root_key=0
from parent=accounts
filter parent_account_key==root_key
select {parent.account_key, parent.account_name, account_path=parent.account_name, level=0}
loop (
join child=accounts (child.parent_account_key==parent.account_key)
select {child.account_key, child.account_name, account_path=f"{account_path} / {child.account_name}", level=level+1}
)
select {level, account_name, account_path}
For the record, the correct SQL for the recursive section should probably be something like the following:
...
UNION
ALL
SELECT
child.account_key,
child.account_name,
CONCAT(table_0.account_path, ' / ', child.account_name) AS account_path,
table_0.level + 1 AS level
FROM
table_0
JOIN accounts AS child ON child.parent_account_key = table_0.account_key
...
I've changed the name of the issue.
Should I delete the first few EWMA examples since they actually work fine and are a lot to read which doesn't add to the actual problem description?
What happened?
There's a regression in PRQL version:0.12.2 in that some
loop
examples no longer compile correctly.The problems are:
_expr_0
,_expr_1
,_expr_2
,_expr_3
, and_expr_4
instead oft
,x
, andewma
,table_2.t
in) AS table_5 ON table_5.t = table_5.t + 1
.The expected output runs correctly in shell.duckdb.org.
PRQL input
SQL output
Expected SQL output
MVCE confirmation
Anything else?
This might have been introduced in version 0.10 or 0.11 already as I noticed my examples failing to compile then. I've just been pinning to 0.9.5 since then.