hortonworks-spark / shc

The Apache Spark - Apache HBase Connector is a library to support Spark accessing HBase table as external data source or sink.
Apache License 2.0
552 stars, 281 forks

HBase column versioning #315

Open aapkitechtube opened 5 years ago

aapkitechtube commented 5 years ago

I have a use case where I generate a row key from the list of columns that make up the primary key. The feed file can contain the same record multiple times, so we need to maintain versioning in HBase.

The issue I am facing is this: I would like to specify a DataFrame column as the timestamp in the following write:

def saveToHbase(catalog: String, df: DataFrame, timestamp: String): Unit = {
  df.write
    .options(Map(
      HBaseTableCatalog.tableCatalog -> catalog,
      HBaseRelation.TIMESTAMP -> timestamp,
      HBaseTableCatalog.newTable -> "5"))
    .format("org.apache.spark.sql.execution.datasources.hbase")
    .save()
}
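One possible workaround, offered only as a sketch and not as something the connector documents: since the `HBaseRelation.TIMESTAMP` option in the snippet above takes a single value per write, the DataFrame can be split by the distinct values of the timestamp column and each slice written with its own timestamp. The helper name `saveWithColumnTimestamp` and the parameter `tsCol` are hypothetical. The sketch assumes that `tsCol` holds non-null epoch milliseconds as `Long`, that it has few enough distinct values to collect to the driver, that it is not mapped in the catalog, and that the target column family is configured to keep more than one version.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.execution.datasources.hbase.{HBaseRelation, HBaseTableCatalog}

// Hypothetical helper: one write per distinct timestamp value found in `tsCol`.
def saveWithColumnTimestamp(catalog: String, df: DataFrame, tsCol: String): Unit = {
  // Assumption: tsCol is a non-null Long column (epoch millis) with few distinct values.
  val timestamps = df.select(col(tsCol)).distinct().collect().map(_.getLong(0)).sorted

  timestamps.foreach { ts =>
    df.filter(col(tsCol) === ts)
      .drop(tsCol) // assumption: the timestamp column is not part of the catalog
      .write
      .options(Map(
        HBaseTableCatalog.tableCatalog -> catalog,
        HBaseRelation.TIMESTAMP -> ts.toString, // same option as in the snippet above
        HBaseTableCatalog.newTable -> "5"))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .save()
  }
}
```

Because the DataFrame is filtered once per timestamp, caching it before the loop is probably worthwhile. If the column has many distinct values, this approach issues many small writes and may not be practical.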