verdict-project / verdict

Interactive-Speed Analytics: 200x Faster, 200x Fewer Cluster Resources, Approximate Query Processing
http://verdictdb.org
Apache License 2.0
248 stars 66 forks source link

Query fails when group-by contains two columns from different tables with same name #375

Open dongyoungy opened 5 years ago

dongyoungy commented 5 years ago

I have found this while working on #370 for Presto. However, the issue may or may not be pertained to Presto only.

A generated agg query fails like the following:

org.verdictdb.exception.VerdictDBDbmsException: Issued the following query: 
select vt1.custkey as custkey, vt1.name as name, sum(vt3.extendedprice * (1 - vt3.discount)) as agg0, vt1.acctbal as acctbal, vt4.name as name1, vt1.address as address, vt1.phone as phone, vt1.comment as comment, vt1.custkey as verdictdb_group_by0, vt1.name as verdictdb_group_by1, vt1.acctbal as verdictdb_group_by2, vt1.phone as verdictdb_group_by3, vt1.name as verdictdb_group_by4, vt1.address as verdictdb_group_by5, vt1.comment as verdictdb_group_by6, sum(vt3.extendedprice * (1 - vt3.discount)) as verdictdb_order_by0_0, vt2.verdictdbtier as verdictdb_tier_alias_881207_0, vt3.verdictdbtier as verdictdb_tier_alias_881207_1 
from tpch.tiny.customer as vt1, memory.coordinator_test_r7hbllgh.orders_scrambled as vt2, memory.coordinator_test_r7hbllgh.lineitem_scrambled as vt3, tpch.tiny.nation as vt4 
where ((((((((vt1.custkey = vt2.custkey) and (vt3.orderkey = vt2.orderkey)) and (vt2.orderdate >= (date '1992-01-01'))) and (vt2.orderdate < (date '1998-01-01'))) and (vt3.returnflag = 'R')) and (vt1.nationkey = vt4.nationkey)) and (vt2.verdictdbblock = 4)) and (vt3.verdictdbblock >= 0)) and (vt3.verdictdbblock <= 4) 
group by vt1.custkey, vt1.name, vt1.acctbal, vt1.phone, vt1.name, vt1.address, vt1.comment, vt2.verdictdbtier, vt3.verdictdbtier

Query failed (#20190531_152945_01841_cdvib): line 1:127: 'vt4.name' must be an aggregate expression or appear in GROUP BY clause

The test query was a modified Q10 in TPC-H (which is currently in WIP for #370):

select
    c.custkey,
    c."name",
    sum(l.extendedprice * (1 - l.discount)) as revenue,
    c.acctbal,
    n."name",
    c.address,
    c.phone,
    c."comment"
from
    TPCH_SCHEMA.customer c,
    SCRAMBLE_SCHEMA.orders o,
    SCRAMBLE_SCHEMA.lineitem l,
    TPCH_SCHEMA.nation n
where
    c.custkey = o.custkey
    and l.orderkey = o.orderkey
    and o.orderdate >= date '1992-01-01'
    and o.orderdate < date '1998-01-01'
    and l.returnflag = 'R'
    and c.nationkey = n.nationkey
group by
    c.custkey,
    c."name",
    c.acctbal,
    c.phone,
    n."name",
    c.address,
    c."comment"
order by
    revenue desc