hortonworks-spark / shc

The Apache Spark - Apache HBase Connector is a library to support Spark accessing HBase table as external data source or sink.
Apache License 2.0

Fail when ingesting data to table with orphan regions #346

Closed amommendes closed 1 year ago

amommendes commented 3 years ago

Summary

When a table has orphan regions, ingesting data through SHC fails with a TableExistsException: [image: exception screenshot]

When ingesting data with the newTable parameter, the HBaseRelation class runs the method createTableIfNotExists. This method uses the HBaseAdmin isTableAvailable method to check whether the table is "available". However, isTableAvailable returns true only if all of the table's regions are available, so it returns false for a table that has problematic (e.g., orphan) regions even though the table already exists. SHC then calls createTable, which triggers the HBase prepareCreate procedure, which in turn raises the TableExistsException.

Proposed solution

Decide whether the table needs to be created using a different check, such as calling MetaTableAccessor.tableExists directly, instead of checking region availability, which is what ConnectionImplementation does in HBase.
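
To make the difference between the two checks concrete, here is a minimal, self-contained Java sketch. It does not call HBase at all: FakeCluster, createIfNotAvailable and createIfNotExists are hypothetical names invented for illustration, and the model assumes that "available" means "exists and has no problematic regions", as described above. It is a simplified model of the logic, not SHC's actual code.

```java
import java.util.HashSet;
import java.util.Set;

final class OrphanRegionCheck {

    /** Hypothetical stand-in for cluster state; NOT an HBase class. */
    static final class FakeCluster {
        final Set<String> tablesInMeta = new HashSet<>();
        final Set<String> tablesWithBadRegions = new HashSet<>();

        // Models an availability check: true only if the table exists
        // and none of its regions are problematic (e.g., orphaned).
        boolean isTableAvailable(String table) {
            return tablesInMeta.contains(table) && !tablesWithBadRegions.contains(table);
        }

        // Models a pure existence check that ignores region health.
        boolean tableExists(String table) {
            return tablesInMeta.contains(table);
        }

        // Models table creation: fails if the table is already registered.
        // IllegalStateException stands in for HBase's TableExistsException.
        void createTable(String table) {
            if (!tablesInMeta.add(table)) {
                throw new IllegalStateException("TableExistsException: " + table);
            }
        }
    }

    /** Behavior described in the issue: create when the table is not "available". */
    static boolean createIfNotAvailable(FakeCluster cluster, String table) {
        if (!cluster.isTableAvailable(table)) {
            cluster.createTable(table); // throws if the table exists with bad regions
            return true;
        }
        return false;
    }

    /** Proposed behavior: create only when the table does not exist. */
    static boolean createIfNotExists(FakeCluster cluster, String table) {
        if (!cluster.tableExists(table)) {
            cluster.createTable(table);
            return true;
        }
        return false;
    }
}
```

In this model, a table that is registered but has a bad region makes createIfNotAvailable throw, while createIfNotExists simply skips creation and returns false. A table that is absent is created by both variants.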