What's the timeline for this?
Can you say a bit more about how this functionality could be implemented? From what I read at http://hadoop-hbase.blogspot.com/2012/10/musings-on-secondary-indexes.html, it looks like it is not easy to maintain a globally consistent index. I noticed that Jesse did some work with Culvert. Is that what you planned to incorporate?
It'll be a phased approach: Phase 1 will include:
We've got some immutable data use cases where Map/Reduce jobs generate a Phoenix-compatible HFile and then the HFile is handed off to HBase. The above works fine for this scenario, since there are no incremental updates (i.e. the guaranteed-consistency issue doesn't matter). We'd just generate one primary table HFile plus one HFile per index.
Phase 2 would include guaranteed consistent index maintenance. Jesse, Lars, and I have been talking about how to do this. We did some perf testing of their initial idea, but the batched gets needed to retrieve the data rows limit its usefulness (requiring a maximum selectivity of maybe 2-3%). Instead, Jesse and Lars have come up with a way of guaranteeing consistency at write time so that the reads aren't required to join back to the data table. They're still hashing it out a bit, but it seems to be gelling. Maybe we can get them to blog about it? @jyates @lhofhansl
Now, our indexing approach works as follows. The index row consists of four parts:
We use the index metadata API to create and delete indexes. We use the postPut method of the region observer to build the index, the preDelete method of the region observer to delete index data, and the postSplit hook of the region observer to ensure the index table and the data table are split at the same time and to update the index data. We use the postBalance method of the master observer to keep each data region and its related index region on the same region server.
When a batch of data is to be put or deleted, we use MapReduce to build or delete the indexes from the client side.
We use a coprocessor endpoint to query data with the index. Based on the query, we scan the index table to get the row keys of the data table; then we batch gets on those rows to produce the query result.
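For concreteness, here is a rough sketch (not our actual code) of what the postPut-driven index maintenance described above could look like against the HBase 0.94 coprocessor API. The table, column family, and qualifier names are invented, and delete handling, split handling, and region balancing are omitted:

```java
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

public class ExampleIndexObserver extends BaseRegionObserver {
    // All names here are hypothetical, chosen only for illustration.
    private static final byte[] INDEX_TABLE = Bytes.toBytes("IDX_EMP_emp_name");
    private static final byte[] FAMILY = Bytes.toBytes("cf");
    private static final byte[] INDEXED_QUALIFIER = Bytes.toBytes("emp_name");
    private static final byte[] SEPARATOR = new byte[] { 0 };
    private static final byte[] EMPTY = new byte[0];

    @Override
    public void postPut(ObserverContext<RegionCoprocessorEnvironment> ctx, Put put,
            WALEdit edit, boolean writeToWAL) throws IOException {
        List<KeyValue> kvs = put.get(FAMILY, INDEXED_QUALIFIER);
        if (kvs.isEmpty()) {
            return; // this Put didn't touch the indexed column
        }
        KeyValue kv = kvs.get(0);
        // Index rowkey = indexed value + separator + data rowkey, so that
        // duplicate indexed values still yield unique index rows.
        byte[] indexRow = Bytes.add(kv.getValue(), SEPARATOR, put.getRow());
        Put indexPut = new Put(indexRow);
        indexPut.add(FAMILY, EMPTY, kv.getTimestamp(), EMPTY); // marker cell only
        HTableInterface indexTable = ctx.getEnvironment().getTable(INDEX_TABLE);
        try {
            indexTable.put(indexPut);
        } finally {
            indexTable.close();
        }
    }
}
```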
May we join in the secondary index development? Thanks!
Thank you, @HaisenLee, for your offer to help with secondary indexing. We're already pretty far into the implementation of phase 1, and phase 2 requires HBase core functionality to be added, which Jesse and Lars are working on. Would you be interested in helping with any of the other open issues?
@jtaylor-sfdc Sure, I'd be pleased to help with the open issues.
Hi @jtaylor-sfdc I'm a little curious why the index table's rowkey needs to contain the primary rowkey. Let's say there is a table called "EMP"; this table has columns such as emp_name, age, dept_no ... and we want to create an index on emp_name. Assume the index table is called IDX_EMP_emp_name. I think the mapping relation from EMP {e001, Tom, 31, d001} to IDX_EMP_emp_name {Tom, e001} should be pretty clear. I know there could be more than one "Tom" in "EMP"; because of that, we'd need to save the primary rowkey as a qualifier name instead of as a value.
Any thoughts?
Here are the client-side changes to add secondary index support:
CREATE INDEX <index_name>
ON <table_name> (<column_ref> [ASC|DESC], ...)
INCLUDE (<column_ref>...) fam_properties
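For instance, a purely hypothetical statement conforming to this grammar (schema, table, and column names made up), issued through the Phoenix JDBC driver, might look like:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateIndexExample {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
        Statement stmt = conn.createStatement();
        // index_name = idx_emp_name, table_name = my_schema.emp,
        // indexed columns = (emp_name DESC, dept_no), covered column = age;
        // fam_properties for the HBase table/column families could be appended at the end.
        stmt.execute(
            "CREATE INDEX idx_emp_name ON my_schema.emp (emp_name DESC, dept_no) INCLUDE (age)");
        stmt.close();
        conn.close();
    }
}
```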
Make sure you reuse the rules we already have to parse table_name (optionally including the schema name), column_ref (optionally including the column family name), and fam_properties (to pass through properties for the HBase table and column families). This should take care of the parsing, compiling, and caching of the index metadata. For the usage of it, we'll need to do the following:
This should take care of the usage part of things. For the index maintenance, talk with @jyates. He's got the plumbing all worked out. Basically, there's an interface you need to implement where, given the Put/Delete list from a data table, you return the Put/Delete list for an index table. You'll also need to send over a bit of metadata in the Put/Delete operation to indicate which column qualifiers on the data table must be retrieved to build the Put/Delete list for the index. You'll likely need to send over the RowKeySchema too. Make sure that you delegate to a separate Phoenix class to figure out the list of mutations given a Put/Delete on the main table. The reason is that I'd like to provide a method in PhoenixRuntime, similar to getUncommittedData, that gets the List<KeyValue> for each index table. This will provide a way of generating an HFile for map/reduce jobs for the index tables that'll be consistent with the data table.
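To make the shape of that contract concrete, here is a rough sketch of what such an interface might look like. This is not the actual interface from Jesse's work; the name and signature are hypothetical:

```java
import java.util.List;

import org.apache.hadoop.hbase.client.Mutation;

public interface ExampleIndexBuilder {
    /**
     * Given the mutations (Puts/Deletes) applied to the data table, return the
     * mutations that must be applied to the index table to keep it in sync.
     * Implementations may need extra metadata carried on the data-table mutation
     * (e.g. which data-table qualifiers to read back, or the RowKeySchema) in
     * order to reconstruct the index rows.
     */
    List<Mutation> getIndexUpdates(List<Mutation> dataTableMutations);
}
```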
Seems great, @jtaylor-sfdc - is it in a branch/fork already? Looking forward to seeing how it was implemented.
For the maintenance side, I'm working on getting a patch into HBase (HBASE-8636) so indexing can support a compressed WAL (should be in 0.94.9). After that goes in, I'll send up a pull request to Phoenix.
It's nothing too fancy though - there is no magic. It's just hacking the WAL entries to get durability, but otherwise it only has a passing adherence to ACID - it only meets the HBase expectations. Coming soon - promise!
Request to @jyates - for the compressed WAL patch, can you make your index stuff not have a hard dependency on that? I'd like folks to be able to use the indexing with 0.94.4 and above. We can detect if it's pre-0.94.9 and throw if an attempt is made to create an index on an HBase table that has compressed WAL enabled.
Yeah, I think we can do that - shouldn't be too hard.
A couple more additions, @tonyhuang. Instead of using a VARBINARY, just use the data row PK columns as they are. That way you can just remove any that are already in the index, and you won't have any duplication.
Got you.
One other consideration when deciding on the "right" query plan to choose: you'll want to consider the ORDER BY clause as well. If we're ordering by the leading indexed columns, we'll definitely want to use that index, even if there's no start/stop row key formed for that index.
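To make the ORDER BY point concrete, here is a small hypothetical example (all names invented, assuming an index on (emp_name, dept_no)). The query forms no start/stop row key for the index, but because its ORDER BY matches the leading indexed column, serving it from the index would avoid a separate sort step:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class OrderByPlanExample {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
        Statement stmt = conn.createStatement();
        // Both selected columns are covered by the hypothetical index, and the
        // ORDER BY matches its leading column, so the index is the natural plan
        // even though there is no WHERE clause to form a start/stop key.
        ResultSet rs = stmt.executeQuery(
            "SELECT emp_name, dept_no FROM my_schema.emp ORDER BY emp_name");
        while (rs.next()) {
            System.out.println(rs.getString(1) + "\t" + rs.getString(2));
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}
```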
@mujtabachohan, @jyates, @simontoens, @elilevine, @anoopsjohn, @ramkrish86, @maryannxue, @ivarley @lhofhansl @ryang-sfdc @srau Phase 1 of secondary indexing is ready for testing. This basically includes:
If you want a little example, take a look at our tests here and here.
I think at a minimum we need more testing prior to releasing (#321). Nice to have would be #281, #337, and #336, but I'd be ok with a release that just hardens what's there now, since it helps in our initial use cases. Also, we have a lot of other great stuff that I'd like to get out there in a release as well.
Thoughts?
Hey James - that's awesome. We will give this a try!
@jtaylor-sfdc Looks like a sensor network simulation would be a good test case for what's in place now.
Do you guys have a framework for larger-scale integration tests? Or any thoughts on one? I've done some recent work with benchpress. For testing secondary indexing and other features on clusters with a meaningful amount of test data and realistic application simulations, please consider opening issues to brainstorm ideas, and I will as well.
Support for secondary indexes over tables with immutable rows is in the 2.0.0 release. The incremental maintenance piece coming shortly will be tracked by #336, so I'm closing this issue.
Allow users to create indexes through a new CREATE INDEX DDL command and then behind the scenes build multiple projections of the table (i.e. a copy of the table using re-ordered or different row key columns). Phoenix will take care of maintaining the indexes when DML commands are issued and will choose the best table to use at query time.
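As a hedged end-to-end sketch of what that flow could look like from a client through the Phoenix JDBC driver (table, column, and index names are invented for illustration; the index maintenance and plan selection described above happen behind the scenes):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class SecondaryIndexFlow {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
        conn.setAutoCommit(false);
        Statement stmt = conn.createStatement();

        stmt.execute("CREATE TABLE IF NOT EXISTS emp (" +
                     "  emp_id VARCHAR NOT NULL PRIMARY KEY," +
                     "  emp_name VARCHAR, age INTEGER, dept_no VARCHAR)");
        // Behind the scenes this builds a projection of EMP keyed by emp_name.
        stmt.execute("CREATE INDEX idx_emp_name ON emp (emp_name) INCLUDE (dept_no)");

        // DML goes against the data table; Phoenix keeps the index in step.
        PreparedStatement upsert = conn.prepareStatement(
            "UPSERT INTO emp VALUES (?, ?, ?, ?)");
        upsert.setString(1, "e001");
        upsert.setString(2, "Tom");
        upsert.setInt(3, 31);
        upsert.setString(4, "d001");
        upsert.execute();
        conn.commit();

        // This query only touches columns covered by the index, so the optimizer
        // is free to answer it from idx_emp_name instead of the data table.
        ResultSet rs = stmt.executeQuery(
            "SELECT emp_name, dept_no FROM emp WHERE emp_name = 'Tom'");
        while (rs.next()) {
            System.out.println(rs.getString(1) + "\t" + rs.getString(2));
        }
        rs.close();
        upsert.close();
        stmt.close();
        conn.close();
    }
}
```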