hortonworks-spark / shc

The Apache Spark - Apache HBase Connector is a library that lets Spark access HBase tables as an external data source or sink.
Apache License 2.0

Structured Streaming read data from hbase regularly? #289

Open loveindemon opened 6 years ago

loveindemon commented 6 years ago

Hi SHC team,

I ran into trouble writing my Structured Streaming application with shc. I receive new data from a Kafka source and want to check whether it differs from the data already saved in HBase; if it does, I update HBase with the new records. So, how can a Structured Streaming application read the old data from HBase via shc in every batch?

I tried to read from HBase:

```scala
val df = sparkSession
  .read
  .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
  .format("org.apache.spark.sql.execution.datasources.hbase")
  .load()
```

Then I want to join the DataFrame from HBase with the DataFrame from Kafka, but it throws an exception:

```
java.lang.UnsupportedOperationException: Data source org.apache.spark.sql.execution.datasources.hbase does not support streamed reading
```

What should I do? Thanks.
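For context, one possible workaround sketch (an assumption on my part, not an answer from this thread): since shc only supports batch reads, the HBase table can be re-loaded inside Structured Streaming's `foreachBatch` (available in Spark 2.4+), so each micro-batch joins against a fresh snapshot of the stored data. The broker address, topic, column names, and checkpoint path below are hypothetical placeholders, and `catalog` is assumed to be the shc catalog JSON whose schema exposes `rowkey` and `value` columns:

```scala
// Sketch only: assumes Spark 2.4+, shc on the classpath, and a `catalog`
// JSON string already defined with `rowkey` and `value` columns.
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

val spark = SparkSession.builder().appName("kafka-hbase-diff").getOrCreate()

val kafkaStream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092") // placeholder
  .option("subscribe", "events")                    // placeholder topic
  .load()

kafkaStream.writeStream
  .foreachBatch { (batch: DataFrame, batchId: Long) =>
    // Re-read HBase as a plain batch DataFrame on every micro-batch,
    // so the join always sees the latest stored rows.
    val hbaseDf = spark.read
      .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .load()

    // Keep only rows whose (rowkey, value) pair is not already in HBase;
    // a left_anti join drops exact matches.
    val changed = batch
      .selectExpr("CAST(key AS STRING) AS rowkey",
                  "CAST(value AS STRING) AS value")
      .join(hbaseDf, Seq("rowkey", "value"), "left_anti")

    // Write the new or changed rows back through shc.
    changed.write
      .options(Map(HBaseTableCatalog.tableCatalog -> catalog,
                   HBaseTableCatalog.newTable -> "5"))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .save()
  }
  .option("checkpointLocation", "/tmp/checkpoints/kafka-hbase") // placeholder
  .start()
```

The trade-off is a full batch read of the HBase table on every trigger; whether that is acceptable depends on table size and trigger interval.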

thila81 commented 5 years ago

Hi, have you found an answer to this problem?