forcedotcom / phoenix

BSD 3-Clause "New" or "Revised" License
558 stars 227 forks source link

Phoenix not picking up the hint correctly. #608

Closed samarthjain closed 10 years ago

samarthjain commented 10 years ago

Phoenix is not picking up the right hint even when forced to do so:

Version details: HBase - hbase-0.94.9-security Phoenix - 2.1.1

If you execute the following on the squirrel client against a hbase cluster consisting of 1 region server -

DDL - create table eh (organization_id char(15) not null,parent_id char(15) not null, created_date date not null, entity_history_id char(15) not null constraint pk primary key (organization_id, parent_id, created_date, entity_history_id))

Query- explain select /+ RANGE_SCAN / ORGANIZATION_ID, PARENT_ID, CREATED_DATE, ENTITY_HISTORY_ID from eh where ORGANIZATION_ID='111111111111111' and SUBSTR(PARENT_ID, 1, 3) = 'foo' and TO_DATE ('2012-0-1 00:00:00') <= CREATED_DATE and CREATED_DATE <= TO_DATE ('2012-11-31 00:00:00') order by ORGANIZATION_ID, PARENT_ID, CREATED_DATE DESC, ENTITY_HISTORY_ID limit 100

Expected: CLIENT PARALLEL 1-WAY RANGE SCAN OVER EH ['111111111111111','foo','2011-12-01 00:00:00.000'] - ['111111111111111','fop','2012-12-01 00:00:00.000']\n" + " SERVER FILTER BY (CREATED_DATE >= 2011-11-30 AND CREATED_DATE <= 2012-11-30)\n" + " SERVER TOP 100 ROWS SORTED BY [ORGANIZATION_ID, PARENT_ID, CREATED_DATE DESC, ENTITY_HISTORY_ID]\n" +
Actual: CLIENT PARALLEL 1-WAY SKIP SCAN ON 1 RANGE OVER EH ['111111111111111','foo','2011-12-01 00:00:00.000'] - ['111111111111111','fop','2012-12-01 00:00:00.000'] SERVER TOP 100 ROWS SORTED BY [ORGANIZATION_ID, PARENT_ID, CREATED_DATE DESC, ENTITY_HISTORY_ID] CLIENT MERGE SORT

jtaylor-sfdc commented 10 years ago

Please confirm it doesn't work on 2.1.2 either. Also, run it locally and put a breakpoint in BaseQueryPlan.getScanner() to see if it really is using a skip scan (check the filters on the scan), as I fixed an explain plan bug before where it said it was using a skip scan, but it actually wasn't.

samarthjain commented 10 years ago

I see the same problem on 2.1.2 too.

CLIENT PARALLEL 1-WAY SKIP SCAN ON 1 RANGE OVER EH ['111111111111111','foo','2011-12-01 00:00:00.000'] - ['111111111111111','fop','2012-12-01 00:00:00.000'] SERVER TOP 100 ROWS SORTED BY [ORGANIZATION_ID, PARENT_ID, CREATED_DATE DESC, ENTITY_HISTORY_ID] CLIENT MERGE SORT

It is also missing the filter part -

SERVER FILTER BY (CREATED_DATE >= 2011-11-30 AND CREATED_DATE <= 2012-11-30)

Will update with the local debugging results soon.

samarthjain commented 10 years ago

Turns out that Squirrel was treating the hint specified in /+ / as a SQL comment.

This default can be changed in the (New) Session Properties. In File --> New Session Properties --> Tab SQL uncheck the "Remove multi line comment (/.../) from SQL before sending it to database" checkbox.

After doing that, the query plan was correctly displayed as:

CLIENT PARALLEL 1-WAY RANGE SCAN OVER EH ['111111111111111','foo','2011-12-01 00:00:00.000'] - ['111111111111111','fop','2012-12-01 00:00:00.000'] SERVER FILTER BY (CREATED_DATE >= 2011-11-30 AND CREATED_DATE <= 2012-11-30) SERVER TOP 100 ROWS SORTED BY [ORGANIZATION_ID, PARENT_ID, CREATED_DATE DESC, ENTITY_HISTORY_ID] CLIENT MERGE SORT

jtaylor-sfdc commented 10 years ago

Updated README.md to include the above. Thanks for following up on this.