namhnguyen / asterixdb

Automatically exported from code.google.com/p/asterixdb
0 stars 0 forks source link

Incorrect nested aggregation result #786

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
Run query
use dataverse TPCH;

for $nation in dataset nation
for $sn in dataset selectedNations
where $nation.n_nationkey = $sn.sn_nationkey  /*+ indexnl */
return {
  "nation_key": $nation.n_nationkey,
  "name": $nation.n_name,
  "aggregates": for $order in dataset orders
                for $customer in dataset customers
                where $order.o_custkey = $customer.c_custkey
                and  $customer.c_nationkey = $nation.n_nationkey
                group by $orderdate := $order.o_orderdate with $order
                let $sum := sum(for $o in $order return $o.o_totalprice)
                order by $sum desc
                limit 3
                return {
                  "order_date": $orderdate,
                  "sum_price": $sum
                }
}

Over dataset
drop dataverse TPCH if exists;
create dataverse  TPCH;
use dataverse TPCH;

create type nationType as open {
  n_nationkey: int32,
  n_name: string,
  n_regionkey: int32,
  n_comment: string
}

create dataset nation(nationType)
primary key n_nationkey;

create type customersType as open {
  c_custkey: int32,
  c_name: string,
  c_address: string,
  c_nationkey: int32,
  c_phone: string,
  c_acctbal: double,
  c_mktsegment: string,
  c_comment: string
}

create dataset customers(customersType)
primary key c_custkey;

create type ordersType as open {
  o_orderkey: int32,
  o_custkey: int32,
  o_orderstatus: string,
  o_totalprice: double,
  o_orderdate: string,
  o_orderpriority: string,
  o_clerk: string,
  o_shippriority: int32,
  o_comment: string
}

create dataset orders(ordersType)
primary key o_orderkey;

load dataset nation using localfs
(("path"="localhost:///Users/yupengf/asterix-mgmt/data/nation.adm"),("format"="a
dm"));

load dataset customers using localfs
(("path"="localhost:///Users/yupengf/asterix-mgmt/data/customers.adm"),("format"
="adm"));

load dataset orders using localfs
(("path"="localhost:///Users/yupengf/asterix-mgmt/data/orders.adm"),("format"="a
dm"));

create type selectedNationType as open {
  sn_nationkey: int32
}

create dataset selectedNations(selectedNationType)
primary key sn_nationkey;

insert into dataset selectedNations (
 {"sn_nationkey":1}
)
insert into dataset selectedNations (
 {"sn_nationkey":2}
)
insert into dataset selectedNations (
 {"sn_nationkey":3}
)insert into dataset selectedNations (
 {"sn_nationkey":4}
)insert into dataset selectedNations (
 {"sn_nationkey":5}
)
insert into dataset selectedNations (
 {"sn_nationkey":6}
)
insert into dataset selectedNations (
 {"sn_nationkey":7}
)insert into dataset selectedNations (
 {"sn_nationkey":8}
)insert into dataset selectedNations (
 {"sn_nationkey":9}
)insert into dataset selectedNations (
 {"sn_nationkey":10}
)

What is the expected output? What do you see instead?
The result shows with aggregation results for each tuple. However, it only has 
result for the first tuple. All the rest are nested empty collection:

{ "nation_key": 1, "name": "ARGENTINA                ", "aggregates": [ { 
"order_date": "1996-05-02", "sum_price": 1.6282215789999997E7d }, { 
"order_date": "1993-08-09", "sum_price": 1.6174823010000004E7d }, { 
"order_date": "1998-07-03", "sum_price": 1.6106452180000002E7d } ] }
{ "nation_key": 2, "name": "BRAZIL                   ", "aggregates": [  ] }
{ "nation_key": 3, "name": "CANADA                   ", "aggregates": [  ] }
{ "nation_key": 4, "name": "EGYPT                    ", "aggregates": [  ] }
{ "nation_key": 5, "name": "ETHIOPIA                 ", "aggregates": [  ] }
{ "nation_key": 6, "name": "FRANCE                   ", "aggregates": [  ] }
{ "nation_key": 7, "name": "GERMANY                  ", "aggregates": [  ] }
{ "nation_key": 8, "name": "INDIA                    ", "aggregates": [  ] }
{ "nation_key": 9, "name": "INDONESIA                ", "aggregates": [  ] }
{ "nation_key": 10, "name": "IRAN                     ", "aggregates": [  ] }

I can send some toy data, if needed.

Original issue reported on code.google.com by richb...@gmail.com on 10 Jun 2014 at 6:44

GoogleCodeExporter commented 9 years ago
Can you try the latest master branch of hyracks and asterixdb?

We have just merged back the fix for issue782,  which includes another fix 
related to join  ordering in subplans.

Original comment by buyingyi@gmail.com on 10 Jun 2014 at 6:53

GoogleCodeExporter commented 9 years ago
This is because the limit operator doesn't clear its state, therefore it 
doesn't work in nested plan.

Original comment by buyingyi@gmail.com on 12 Jun 2014 at 6:04

GoogleCodeExporter commented 9 years ago
Fixed in yingyi/asterix_test

Original comment by buyingyi@gmail.com on 12 Oct 2014 at 7:42

GoogleCodeExporter commented 9 years ago

Original comment by buyingyi@gmail.com on 13 Oct 2014 at 7:55