mithrandie / csvq

SQL-like query language for csv
https://mithrandie.github.io/csvq
MIT License
1.5k stars 65 forks source link

group by issue #35

Closed alitrack closed 2 years ago

alitrack commented 4 years ago

the following 3 sql do not work,

select year(SalesDate) as yr,  count(*) from sales group by 1
select year(SalesDate) as yr,  count(*) from sales group by yr
select year(SalesDate) as yr,  count(*) from sales group by year(SalesDate) 

this one works,

select yr, count(*) as cnt from (select year(SalesDate) as yr from sales) group by yr

btw, It's csvq is Postgres/MySQL/SQLite compatible, had better Postgres(most big data support it).

mithrandie commented 4 years ago

First, GROUP BY clause is executed before SELECT clause, so the alias defined in the SELECT clause cannot be used in the GROUP BY clause.

Second, when you use GROUP BY clause, the SELECT clause allows you to specify only the columns specified as the grouping key. Other columns may have multiple values, so it is necessary to explicitly specify unique values using aggregate functions and so on. e.g.

select year(MAX(SalesDate)) as yr,  count(*) from sales group by 1;

The third query(select year(SalesDate) as yr, count(*) from sales group by year(SalesDate)) might be something to consider implementation.

alitrack commented 4 years ago

group by 1 is useless in this SQL,

select year(MAX(SalesDate)) as yr,  count(*) from sales group by 1;

this one works too, and the result is same too.

select year(MAX(SalesDate)) as yr,  count(*) from sales;

what I want is count the order by year( how many rows in each year).

mithrandie commented 4 years ago

The expression inside the aggregate function computes for all the values of the grouped rows. So in that case, you can use the following query.

select MAX(year(dateid)) as yr, count(*) from winsales group by year(dateid)