Column qualifier for multiple columns in a column family.

hortonworks-spark / shc

The Apache Spark - Apache HBase Connector is a library to support Spark accessing HBase table as external data source or sink.

Apache License 2.0


I have an HBase table with multiple columns in a single column family, say "c". To save disk space, I would like to write this column family under a single column qualifier, so that there is only one row for a given rowkey. How can I achieve this with this library? I tried the catalog below, in which I try to use "q_data" as the column qualifier.

{
  "table": {"namespace": "default", "name": "spark_dense_string", "tableCoder": "PrimitiveType"},
  "rowkey": "id:hist_timestamp",
  "columns": {
    "id": {"cf": "rowkey", "col": "id", "type": "string", "length": "36"},
    "hist_timestamp": {"cf": "rowkey", "col": "hist_timestamp", "type": "string"},
    "q_data": {"cf": "c", "col": "value", "type": "double"},
    "q_data": {"cf": "c", "col": "est_val", "type": "double"},
    "q_data": {"cf": "c", "col": "replaced", "type": "smallint"}
  }
}

Thanks in advance!
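One thing worth checking in the catalog above is that the key "q_data" appears three times inside "columns". The sketch below uses Python's standard `json` module (not shc itself) to show what a typical JSON parser does with repeated keys: only the last entry survives. The helper name `parse_with_duplicates` is made up for this illustration, and how shc's own parser treats repeated keys is not confirmed here, so take this as a hint rather than a diagnosis.

```python
import json

# Catalog from the post, with the three "q_data" entries left exactly as written.
catalog_text = """
{
  "table": {"namespace": "default", "name": "spark_dense_string", "tableCoder": "PrimitiveType"},
  "rowkey": "id:hist_timestamp",
  "columns": {
    "id": {"cf": "rowkey", "col": "id", "type": "string", "length": "36"},
    "hist_timestamp": {"cf": "rowkey", "col": "hist_timestamp", "type": "string"},
    "q_data": {"cf": "c", "col": "value", "type": "double"},
    "q_data": {"cf": "c", "col": "est_val", "type": "double"},
    "q_data": {"cf": "c", "col": "replaced", "type": "smallint"}
  }
}
"""


def parse_with_duplicates(text):
    """Parse JSON text and also return the keys repeated within any one object.

    Hypothetical helper for illustration; it is not part of shc.
    """
    duplicates = []

    def hook(pairs):
        # `pairs` holds every (key, value) of one JSON object, in source order.
        seen = set()
        for key, _ in pairs:
            if key in seen and key not in duplicates:
                duplicates.append(key)
            seen.add(key)
        # dict() keeps the last value for a repeated key, like json.loads does by default.
        return dict(pairs)

    return json.loads(text, object_pairs_hook=hook), duplicates


parsed, duplicates = parse_with_duplicates(catalog_text)
print(duplicates)                           # ['q_data']
print(list(parsed["columns"]))              # ['id', 'hist_timestamp', 'q_data']
print(parsed["columns"]["q_data"]["col"])   # replaced
```

With this parser, the "value" and "est_val" mappings are silently dropped and only "replaced" remains under "q_data".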

{
  "table": {"namespace": "default", "name": "spark_dense_string", "tableCoder": "PrimitiveType"},
  "rowkey": "hist_timestamp",
  "columns": {
    "id": {"cf": "rowkey", "col": "hist_timestamp", "type": "string", "length": "36"},
    "hist_timestamp": {"cf": "rowkey", "col": "hist_timestamp", "type": "string"},
    "q_data": {"cf": "c", "col": "value", "type": "double"},
    "q_data": {"cf": "c", "col": "est_val", "type": "double"},
    "q_data": {"cf": "c", "col": "replaced", "type": "smallint"}
  }
}
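For comparison, here is a minimal sketch of a catalog in which every entry under "columns" has its own name. It assumes, as in the examples in shc's README, that the keys under "columns" are the DataFrame column names and that "cf" and "col" give the HBase column family and qualifier. The names "value", "est_val" and "replaced" are reused from the qualifiers in the post as DataFrame column names; that reuse is an assumption, not something the post states.

```python
import json

# Assumed DataFrame column names and their catalog types. Each one is mapped to
# the qualifier of the same name in column family "c".
value_columns = {
    "value": "double",
    "est_val": "double",
    "replaced": "smallint",
}

catalog = {
    "table": {"namespace": "default", "name": "spark_dense_string", "tableCoder": "PrimitiveType"},
    "rowkey": "id:hist_timestamp",
    "columns": {
        "id": {"cf": "rowkey", "col": "id", "type": "string", "length": "36"},
        "hist_timestamp": {"cf": "rowkey", "col": "hist_timestamp", "type": "string"},
    },
}

# One uniquely named entry per value column, so no key is overwritten.
for name, sql_type in value_columns.items():
    catalog["columns"][name] = {"cf": "c", "col": name, "type": sql_type}

# The catalog is handed to the connector as a JSON string.
catalog_json = json.dumps(catalog)
print(catalog_json)
```

This keeps three qualifiers under "c", all belonging to the same HBase row for a given rowkey. It does not pack the three values into a single cell, so it only partly addresses the disk-space concern raised in the question.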


Column qualifier for multiple columns in a column family. #340