br1ghtyang / asterixdb

Automatically exported from code.google.com/p/asterixdb
0 stars 0 forks source link

Group-by sum/avg/min/max: the query is not intuitive? #533

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Currently if we want to do a group-by for sum/avg/min/max over a field, we have 
to have a nested query to flat the list:

for $t in dataset UserVisit
group by $ip := $t.ip with $t
return {
    "ip": $ip,
    "total": sum(for $i in $t return $i.revenue)
}

Probably I miss something here, but my understanding is that the group-by 
returns a list for each group, then the sum function cannot directly access the 
"revenue" field using "sum($i.revenue)". 

The reason that I think it is not intuitive is because:
- in our document we always use "count", which does not need to access any field
- it may be a little bit hard for a fresh user to get the trick to add that 
"for ... return " clause inside of the aggregate function.

Probably we should first address this in our document, so at least users can 
get some hint on how to do this. Then we may want to think a more intuitive 
way? 

Original issue reported on code.google.com by jarod...@gmail.com on 14 Jun 2013 at 6:00

GoogleCodeExporter commented 8 years ago
Actually, all we need is to make the '.' operator a map when applied on a 
sequence (if it isn't already). 
Then the query looks like this:

for $t in dataset UserVisit
group by $ip := $t.ip with $t
return { "ip": $ip, "total": sum($t.revenue) }

Original comment by westm...@gmail.com on 15 Jun 2013 at 5:58

GoogleCodeExporter commented 8 years ago
I think this should already work, but doesn't, and that Yingi is on this
issue?  (I think this is a dup?)

Original comment by dtab...@gmail.com on 15 Jun 2013 at 6:01