nerdammer / spark-hbase-connector

Connect Spark to HBase for reading and writing data with ease
Apache License 2.0
297 stars 107 forks

Does the connector support scan with filtering on some column values? #34

Open trungtv opened 8 years ago

trungtv commented 8 years ago

Hello, I want to retrieve records based on scan filters on column values, e.g. column:something = "somevalue" (see https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/Filter.html). Does the connector support this kind of operation? If so, please give an example.

Thank you very much,

nicolaferraro commented 8 years ago

Hi, filters are not supported. Only scan parameters (i.e. filters on the row id) can be set in the query. You can still filter at the Spark level (using filter), but that is obviously less efficient, because every row in the scanned range is loaded into Spark before it can be discarded.
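As a rough illustration of the Spark-level workaround, the sketch below defines a predicate for `column = "somevalue"` and applies it to a local collection that stands in for the RDD. The table, column family, and column names are hypothetical, and the connector call shown in the comment is an assumption based on the project README, not something verified against a running cluster.

```scala
object SparkLevelFilter {
  // One record is modeled as (rowId, columnValue), the tuple shape the
  // connector's reader is assumed to return when one column is selected.
  type Record = (String, String)

  // Predicate equivalent to `column = expected`.
  def columnEquals(expected: String): Record => Boolean = {
    case (_, value) => value == expected
  }

  def main(args: Array[String]): Unit = {
    // Local stand-in for the RDD. With the connector, the source would be
    // something like (assumed API, hypothetical names):
    //   sc.hbaseTable[(String, String)]("mytable")
    //     .select("something").inColumnFamily("mycf")
    //     .withStartRow("00000").withStopRow("00500")
    val rows: Seq[Record] = Seq(
      "00001" -> "somevalue",
      "00002" -> "other",
      "00003" -> "somevalue"
    )

    // The same function value can be passed to RDD.filter.
    val matching = rows.filter(columnEquals("somevalue"))
    matching.foreach(println)
  }
}
```

Narrowing the scan with start and stop rows first keeps the amount of data that reaches the predicate as small as the row id layout allows.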

It should not be too difficult to add them; this could be a good idea for a contribution...

sachinjain024 commented 8 years ago

@nicolaferraro This is one of the most important features for a spark-hbase connector. I can contribute this, but I may need your help. AFAIK we need to create a DefaultSource that extends SchemaRelationProvider and creates an instance of HbaseRelation. Then we may need to implement the buildScan method.

Since your existing implementation is not based on the approach described above, I don't see how to add support for pushdown filters. It would be good if you could give me some pointers.

nicolaferraro commented 8 years ago

@sachinjain024 This connector does not support Spark SQL yet; adding it would be a major improvement. What we were talking about here is extending the HBaseReaderBuilder to support filtering rows based on column values.

This could be done by adding some utility methods to the builder that ultimately produce Filters for a Scan, which can then be passed to the runtime in some way.
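To make the idea concrete, here is a minimal, purely hypothetical sketch of what such builder methods could look like. `ReaderBuilder`, `ColumnEquals`, and `withColumnEquals` are illustrative names and do not exist in the connector; the sketch only collects the conditions. In a real implementation each condition would presumably be translated into an HBase `SingleColumnValueFilter`, combined in a `FilterList`, and attached to the `Scan` with `setFilter`.

```scala
// Hypothetical sketch: none of these names are part of the connector's API.
object FilterBuilderSketch {

  // Backend-neutral description of the condition `family:qualifier = value`.
  final case class ColumnEquals(family: String, qualifier: String, value: String)

  // Immutable builder: every call returns an updated copy.
  final case class ReaderBuilder(
      table: String,
      startRow: Option[String] = None,
      stopRow: Option[String] = None,
      filters: Vector[ColumnEquals] = Vector.empty) {

    def withStartRow(row: String): ReaderBuilder = copy(startRow = Some(row))

    def withStopRow(row: String): ReaderBuilder = copy(stopRow = Some(row))

    // Hypothetical utility method: record one column-value condition.
    def withColumnEquals(family: String, qualifier: String, value: String): ReaderBuilder =
      copy(filters = filters :+ ColumnEquals(family, qualifier, value))
  }

  def main(args: Array[String]): Unit = {
    val builder = ReaderBuilder("mytable")
      .withStartRow("00000")
      .withStopRow("00500")
      .withColumnEquals("mycf", "something", "somevalue")

    // This is the point where a real implementation would translate each
    // condition into an HBase filter and attach the result to the Scan.
    builder.filters.foreach(println)
  }
}
```

Keeping the conditions as plain, serializable case classes until the scan is built is one way to ship them to the executors; how the runtime actually receives them is left open, as in the comment above.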

sachinjain024 commented 8 years ago

@nicolaferraro Thanks for the information. This is far from what I intend to use. Apologies for the confusion.

liuluheng commented 8 years ago

@nicolaferraro EXCITED