hortonworks-spark / shc

The Apache Spark - Apache HBase Connector is a library to support Spark accessing HBase table as external data source or sink.
Apache License 2.0

How to connect to Remote Hbase from spark/SHC #227

Open AbdullahServantOfAllah opened 6 years ago

AbdullahServantOfAllah commented 6 years ago

The available example connects to localhost by default. But if I have separate Spark and HBase clusters, how do I specify and connect to the remote HBase? Please give a sample configuration and sample code as applicable.

weiqingy commented 6 years ago

@AbdullahServantOfAllah The configuration when Spark and HBase are in different clusters is the same as when both are in the same cluster. You may want to copy the hbase-site.xml from the HBase cluster to the spark/conf/ folder.
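
For anyone who wants to confirm that the copied hbase-site.xml really points at the remote cluster, here is a minimal sketch in Python. It only parses the file and reports the ZooKeeper quorum and client port that an HBase client would use. The sample file contents and the `example.com` host names are placeholders and not taken from this thread; `zookeeper_settings` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical hbase-site.xml contents; the host names are placeholders.
SAMPLE_HBASE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
"""


def zookeeper_settings(xml_text):
    """Return (quorum_hosts, client_port) found in hbase-site.xml content."""
    root = ET.fromstring(xml_text)
    props = {}
    # Each setting is a <property> element with <name> and <value> children.
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    raw_quorum = props.get("hbase.zookeeper.quorum", "")
    quorum = [host.strip() for host in raw_quorum.split(",") if host.strip()]
    # 2181 is the HBase default when the client port is not set explicitly.
    port = int(props.get("hbase.zookeeper.property.clientPort", "2181"))
    return quorum, port


quorum, port = zookeeper_settings(SAMPLE_HBASE_SITE)
print(quorum, port)
```

To check a real file, pass in the text of the copy under spark/conf/ instead of the sample string. If the quorum list comes back empty or lists only localhost, the file is probably not the one from the remote HBase cluster.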

AbdullahServantOfAllah commented 6 years ago

@weiqingy, thanks for the info. Just copying hbase-site.xml is sufficient, right? I don't need to specify HBase's ZooKeeper URL and port in code (or configuration), right?

weiqingy commented 6 years ago

No, you don't need to specify anything in the code. The configuration in hbase-site.xml should be enough for SHC/Spark to identify your HBase cluster. Spark reads the configuration specified in hbase-site.xml before it starts running jobs.
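
To illustrate the point that the code carries no connection details, here is a rough PySpark sketch of an SHC read. It assumes the SHC package is on the Spark classpath and that hbase-site.xml is already in spark/conf/. The table name, column family, and column names in the catalog are made up for illustration, and `read_hbase_table` is a hypothetical helper.

```python
import json

# Hypothetical catalog: table, column family and column names are illustrative.
catalog = json.dumps({
    "table": {"namespace": "default", "name": "table1"},
    "rowkey": "key",
    "columns": {
        "col0": {"cf": "rowkey", "col": "key", "type": "string"},
        "col1": {"cf": "cf1", "col": "col1", "type": "string"},
    },
})


def read_hbase_table(spark, catalog_json):
    """Load an HBase table through SHC using an existing SparkSession.

    No ZooKeeper host or port is passed here; the connector is expected
    to pick the cluster location up from hbase-site.xml.
    """
    return (
        spark.read
        .options(catalog=catalog_json)
        .format("org.apache.spark.sql.execution.datasources.hbase")
        .load()
    )
```

With a running session this would be called as `df = read_hbase_table(spark, catalog)`. The catalog only maps HBase columns to DataFrame columns, so the same code should work whether HBase is local or remote.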