databendlabs / databend

𝗗𝗮𝘁𝗮, 𝗔𝗻𝗮𝗹𝘆𝘁𝗶𝗰𝘀 & 𝗔𝗜. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
https://docs.databend.com
Other
7.87k stars 751 forks source link

bug: Unnest still does not work for some cases #16873

Open rad-pat opened 2 days ago

rad-pat commented 2 days ago

Search before asking

Version

v1.2.659-nightly

What's Wrong?

Following on from https://github.com/databendlabs/databend/issues/16797 The SQL below still does not work

create or replace table t1 (col1 string, col2 string);
insert into t1 values ('test', 'a1,a2');

-- does not work
select t.col1 as col1, unnest(split(t.col2, ',')) as col2
from t1 as t
group by t.col1 as col1, unnest(split(t.col2, ',')) as col2;

And also this SQL additionally does not work

select distinct unnest(split(coalesce(NULL, 'N/A'), ',')) as c1

How to Reproduce?

No response

Are you willing to submit PR?

b41sh commented 1 day ago

Hi @rad-pat thanks for report this bug. AS can't used in group by clause. you can write the SQL like this

select t.col1 as col1, unnest(split(t.col2, ',')) as col2
from t1 as t
group by col1, col2;
rad-pat commented 1 day ago

Thanks @b41sh. The AS clause in GROUP BY is processed fine in Postgres/Greenplum. It is what is created from SQLAlchemy output. I will have to see if we can adjust this somehow.

rad-pat commented 1 day ago

Sorry, it seems that SQLAlchemy does not actually output the alias, but it does output the expression and it seems like that fails:

Also, the docs suggest that expression can be used in GROUP BY clause

select t.col1 as col1, unnest(split(t.col2, ',')) as col2
from t1 as t
group by t.col1, unnest(split(t.col2, ','));

SQL Error: Error executing query: SQL: select t.col1 as col1, unnest(split(t.col2, ',')) as col2
from t1 as t
group by t.col1, unnest(split(t.col2, ',')) Query failed: QueryErrors{code=1065, message=error: 
  --> SQL:3:18
  |
1 | select t.col1 as col1, unnest(split(t.col2, ',')) as col2
2 | from t1 as t
3 | group by t.col1, unnest(split(t.col2, ','))
  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^ set-returning functions can only be used in SELECT

} cause: null
b41sh commented 1 day ago

Sorry, it seems that SQLAlchemy does not actually output the alias, but it does output the expression and it seems like that fails:

Also, the docs suggest that expression can be used in GROUP BY clause

select t.col1 as col1, unnest(split(t.col2, ',')) as col2
from t1 as t
group by t.col1, unnest(split(t.col2, ','));

SQL Error: Error executing query: SQL: select t.col1 as col1, unnest(split(t.col2, ',')) as col2
from t1 as t
group by t.col1, unnest(split(t.col2, ',')) Query failed: QueryErrors{code=1065, message=error: 
  --> SQL:3:18
  |
1 | select t.col1 as col1, unnest(split(t.col2, ',')) as col2
2 | from t1 as t
3 | group by t.col1, unnest(split(t.col2, ','))
  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^ set-returning functions can only be used in SELECT

} cause: null

This is indeed a problem, I will fix this to allow set returning function in the group by clause.