tolitius / cbass

adding "simple" to HBase
Eclipse Public License 1.0

Nippy compression constrains use cases #3

Closed chrishowejones closed 8 years ago

chrishowejones commented 8 years ago

cbass seems to use nippy compression on the values it stores in HBase, which is fine if you use cbass for everything. However, most use cases I come across rely on other mechanisms to store and/or retrieve data, and those may not use nippy compression.

For example, I'm currently storing data using a Storm Trident 'bolt' (actually a partitionPersist call) that just stores the values as raw, uncompressed byte arrays. This means that I can't use cbass to read this data.

tolitius commented 8 years ago

@chrishowejones, good point, I need to add this to the docs.

cbass has a pack-un-pack function that lets you change how values are packed going to HBase and unpacked coming back. You are right, by default it is nippy in/nippy out, and this should be added to the docs.

If you do all serialization/deserialization yourself, you can just mute it:

(pack-un-pack {:p identity :u identity})

let me know if it helps.

chrishowejones commented 8 years ago

@tolitius thanks. That worked. I'll extend pack-un-pack to serialise using org.apache.hadoop.hbase.util.Bytes/toBytes and deserialise using the equivalents in reverse, e.g. org.apache.hadoop.hbase.util.Bytes/toString, toLong, etc.
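A minimal sketch of what such an extension might look like for string values only. It assumes that pack-un-pack lives in the cbass namespace and accepts the {:p ... :u ...} map shown earlier in this thread, and that cbass applies :p to each value before storing it and :u to the raw bytes read back. The namespace example.raw-bytes and the helpers str->bytes and bytes->str are hypothetical names used for illustration.

```clojure
(ns example.raw-bytes
  ;; Assumption: pack-un-pack is exposed from the cbass namespace.
  (:require [cbass :refer [pack-un-pack]])
  (:import [org.apache.hadoop.hbase.util Bytes]))

;; Hypothetical helper: encode a String as raw bytes via HBase's Bytes utility.
;; The ^String hint selects the Bytes.toBytes(String) overload without reflection.
(defn str->bytes [^String s]
  (Bytes/toBytes s))

;; Hypothetical helper: decode raw bytes back into a String.
(defn bytes->str [^bytes bs]
  (Bytes/toString bs))

;; Replace the default nippy in/nippy out behaviour with the raw-bytes pair.
(pack-un-pack {:p str->bytes
               :u bytes->str})
```

Values of other types (longs, for example) would need their own pair, such as Bytes/toBytes with a long argument and Bytes/toLong. Since raw bytes carry no type information, the reader has to know which decoder applies to which column.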

tolitius commented 8 years ago

@chrishowejones glad it works for you, and thanks for bringing it up. I use cbass for everything in a couple of current projects, which is why I had left this out of the docs.

Docs are now in.