Closed GoogleCodeExporter closed 8 years ago
PS - This could be a very soon after beta feature request What steps will reproduce the problem? Start asterix one CC and NC using managix, and run these statements from Web UI. Shouldn't this query be treated as a left outer join that performs indexed nested loop join, since I give a hint to optimizer to use the secondary B-Tree index defined on the name field of orders dataset ? Instead I see the system handles this as a hybrid hash join. //Query to get customer-id's of Alaskan customers drop dataverse test if exists; create dataverse test; use dataverse test; create type TestType as open { id:int32, name:string, state:string } create dataset customers(TestType) primary key id; create dataset orders(TestType) primary key id; create index idx01 on orders(name); insert into dataset customers({"id":65,"name":"John Doe","state":"Alaska"}); insert into dataset customers({"id":31,"name":"John","state":"TX"}); insert into dataset customers({"id":10,"name":"Smith","state":"CA"}); insert into dataset customers({"id":75,"name":"Roger","state":"MA"}); insert into dataset orders({"id":34,"name":"Henry","state":"OR"}); insert into dataset orders({"id":35,"name":"Kyle","state":"AZ"}); insert into dataset orders({"id":45,"name":"Kevin","state":"WA"}); insert into dataset orders({"id":65,"name":"John Doe","state":"Alaska"}); for $l in dataset customers return { "cust-id":$l.id, "state":for $m in dataset orders where $l.name /* +indexnl */ = $m.name return $m.state } Here is the optimized query plan. distribute result [%0->$$3] -- DISTRIBUTE_RESULT |PARTITIONED| exchange -- ONE_TO_ONE_EXCHANGE |PARTITIONED| project ([$$3]) -- STREAM_PROJECT |PARTITIONED| assign [$$3] <- [function-call: asterix:closed-record-constructor, Args:[AString: {cust-id}, %0->$$17, AString: {state}, %0->$$10]] -- ASSIGN |PARTITIONED| project ([$$17, $$10]) -- STREAM_PROJECT |PARTITIONED| exchange -- ONE_TO_ONE_EXCHANGE |PARTITIONED| group by ([$$11 := %0->$$14]) decor ([$$17 := %0->$$14]) { aggregate [$$10] <- [function-call: asterix:listify, Args:[%0->$$9]] -- AGGREGATE |LOCAL| select (function-call: algebricks:not, Args:[function-call: algebricks:is-null, Args:[%0->$$13]]) -- STREAM_SELECT |LOCAL| nested tuple source -- NESTED_TUPLE_SOURCE |LOCAL| } -- PRE_CLUSTERED_GROUP_BY[$$14] |PARTITIONED| exchange -- ONE_TO_ONE_EXCHANGE |PARTITIONED| order (ASC, %0->$$14) -- STABLE_SORT [$$14(ASC)] |PARTITIONED| exchange -- HASH_PARTITION_EXCHANGE [$$14] |PARTITIONED| project ([$$9, $$13, $$14]) -- STREAM_PROJECT |PARTITIONED| exchange -- ONE_TO_ONE_EXCHANGE |PARTITIONED| left outer join (function-call: algebricks:eq, Args:[%0->$$15, %0->$$16]) -- HYBRID_HASH_JOIN [$$15][$$16] |PARTITIONED| exchange -- HASH_PARTITION_EXCHANGE [$$15] |PARTITIONED| project ([$$14, $$15]) -- STREAM_PROJECT |PARTITIONED| assign [$$15] <- [function-call: asterix:field-access-by-index, Args:[%0->$$0, AInt32: {1}]] -- ASSIGN |PARTITIONED| exchange -- ONE_TO_ONE_EXCHANGE |PARTITIONED| data-scan []<-[$$14, $$0] <- test:customers -- DATASOURCE_SCAN |PARTITIONED| exchange -- ONE_TO_ONE_EXCHANGE |PARTITIONED| empty-tuple-source -- EMPTY_TUPLE_SOURCE |PARTITIONED| exchange -- HASH_PARTITION_EXCHANGE [$$16] |PARTITIONED| project ([$$16, $$9, $$13]) -- STREAM_PROJECT |PARTITIONED| assign [$$13, $$9, $$16] <- [TRUE, function-call: asterix:field-access-by-index, Args:[%0->$$1, AInt32: {2}], function-call: asterix:field-access-by-index, Args:[%0->$$1, AInt32: {1}]] -- ASSIGN |PARTITIONED| project ([$$1]) -- STREAM_PROJECT |PARTITIONED| exchange -- ONE_TO_ONE_EXCHANGE |PARTITIONED| data-scan []<-[$$12, $$1] <- test:orders -- DATASOURCE_SCAN |PARTITIONED| exchange -- ONE_TO_ONE_EXCHANGE |PARTITIONED| empty-tuple-source -- EMPTY_TUPLE_SOURCE |PARTITIONED|
Original issue reported on code.google.com by khfaraaz82 on 15 May 2013 at 6:31
khfaraaz82
Original comment by khfaraaz82 on 17 May 2013 at 9:35
Original issue reported on code.google.com by
khfaraaz82
on 15 May 2013 at 6:31