br1ghtyang / asterixdb

Automatically exported from code.google.com/p/asterixdb
0 stars 0 forks source link

TPC-H AlgebricksException in q22 #562

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
To reproduce the issue, run tpch q22 in the web interface on SF=1 with 3 nodes, 
the default configuration, and one IO device.

The results that it should get, as well as the fixed version of q22, are 
attached.  Thanks in advance for the help.

Original issue reported on code.google.com by cmcintyr...@gmail.com on 9 Jul 2013 at 12:17

Attachments:

GoogleCodeExporter commented 8 years ago

Original comment by westm...@gmail.com on 19 Jul 2013 at 4:37

GoogleCodeExporter commented 8 years ago
The query modified to contain a nested query in a where clause causes a syntax 
error.  The check that the c_custkey is not in the Orders table is missing in 
the original aql version.

Original comment by cmcintyr...@gmail.com on 27 Jul 2013 at 12:57

Attachments:

GoogleCodeExporter commented 8 years ago
The modified query was ran under the same conditions.

Original comment by cmcintyr...@gmail.com on 27 Jul 2013 at 1:03

GoogleCodeExporter commented 8 years ago
Here is the compilation error (I mistakenly referred to it as a syntax error 
previously) that occurred: "Could not infer type for variable '$$62'. 
[AlgebricksException]".

Original comment by cmcintyr...@gmail.com on 29 Jul 2013 at 7:01

GoogleCodeExporter commented 8 years ago

Original comment by westm...@gmail.com on 31 Jul 2013 at 12:42

GoogleCodeExporter commented 8 years ago

Original comment by westm...@gmail.com on 2 Aug 2013 at 5:55

GoogleCodeExporter commented 8 years ago

Original comment by westm...@gmail.com on 2 Aug 2013 at 6:12

GoogleCodeExporter commented 8 years ago
Yingyi - this has been open since Aug 2013 - about to be a blocker at Oracle 
Labs - could you please take this from Vinayak?

Original comment by dtab...@gmail.com on 14 May 2014 at 6:40

GoogleCodeExporter commented 8 years ago
I verified this query on the current setting, we have (4NC - 16 IO-Device) with 
SF=10 and SF=100 (which is different from the original specification in the 
issue) and it goes through successfully. 

Original comment by pouria.p...@gmail.com on 6 Dec 2014 at 2:11

GoogleCodeExporter commented 8 years ago
Nice!!!

Original comment by buyingyi@gmail.com on 6 Dec 2014 at 2:15

GoogleCodeExporter commented 8 years ago
Update (on AsterixDB Beta 0.8.6 ):

- q22.aql.txt works fine on SF=10 and SF=100 with 4NC/16-IODevice.
- q22_mod.aql breaks with the following error message, during the query 
compilation and jobGen:

Could not infer type for variable '$$59' - [AlgebricksException]

Original comment by pouria.p...@gmail.com on 6 Dec 2014 at 2:18

GoogleCodeExporter commented 8 years ago
- This query fails with the same Algebricks exception on the latest master 
(8.7-SNAPSHOT) as well.
- Please note that the query version that we have under "runtimets" tests is a 
simplified version that works fine (In the simplified version, we drop the 
predicate that checks customers' phone numbers against a specific range of 
country codes). 
- For the full version you can check q22_mod.aql attachment above (which is 
also available in the code-base under 
"asterix-benchmarks/src/main/resources/tpc-h/queries".
- Query fails during the compilation phase, so for reproducing you just need to 
run the DDL (no loading is needed).  

Original comment by pouria.p...@gmail.com on 11 Dec 2014 at 12:39

GoogleCodeExporter commented 8 years ago
Thanks, Pouria! I'll look into it.

Original comment by buyingyi@gmail.com on 11 Dec 2014 at 12:45

GoogleCodeExporter commented 8 years ago
If we write the same query in this way, it will work:

use dataverse tpch;

declare function q22_customer_tmp() {
  for $c in dataset('Customer')
  let $phone_substr := substring($c.c_phone, 1, 2)
  where $phone_substr = '13'
    or $phone_substr = '31'
    or $phone_substr = '23'
    or $phone_substr = '29'
    or $phone_substr = '30'
    or $phone_substr = '18'
    or $phone_substr = '17'
  return {
    "c_acctbal": $c.c_acctbal,
    "c_custkey": $c.c_custkey,
    "cntrycode": $phone_substr
  }
}

let $avg := avg(
  for $c in dataset('Customer')
  let $phone_substr := substring($c.c_phone, 1, 2)
  where $c.c_acctbal > 0.00
    and ($phone_substr = '13'
    or $phone_substr = '31'
    or $phone_substr = '23'
    or $phone_substr = '29'
    or $phone_substr = '30'
    or $phone_substr = '18'
    or $phone_substr = '17')
  return $c.c_acctbal
)

let $g := for $o in dataset('Orders')  
       group by $custkey := $o.o_custkey with $o
       return {
             "custkey": $custkey,
             "count": count($o)
       }

for $ct in q22_customer_tmp()
for $gi in $g
where $ct.c_acctbal > $avg
    and
        $gi.custkey=$ct.c_custkey
    and
        $gi.count= 0
group by $cntrycode := $ct.cntrycode with $ct
order by $cntrycode
return {
  "cntrycode": $cntrycode, 
  "numcust": count($ct),
  "totacctbal": sum(for $i in $ct return $i.c_acctbal)
}

I'm still working on a better/more efficient way to express the same query.

Original comment by buyingyi@gmail.com on 12 Dec 2014 at 6:44

GoogleCodeExporter commented 8 years ago

Original comment by buyingyi@gmail.com on 9 Jan 2015 at 6:05

GoogleCodeExporter commented 8 years ago

Original comment by buyingyi@gmail.com on 16 Jan 2015 at 5:41