forcedotcom / phoenix

BSD 3-Clause "New" or "Revised" License

Optimize MultiKeyValueTuple.getValue(byte[] family, byte[] qualifier) #654

Closed by jtaylor-sfdc 10 years ago

jtaylor-sfdc commented 10 years ago

Couple of different options:

  1. Create a Map<ColumnReference, KeyValue> internally and use that to find a KeyValue by cf:cq.
  2. Optimize the KeyValueUtil.getColumnLatest by not creating a new KeyValue search term each time.
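
For illustration, option 1 could look roughly like the sketch below. It is not Phoenix code: `ColumnRef` and `Kv` are hypothetical stand-ins for `ColumnReference` and `KeyValue`, reduced to the fields needed for the lookup, and the sketch assumes the list is ordered so that the first entry seen for a column is the one to keep.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ColumnMapSketch {
    /** Hypothetical stand-in for ColumnReference: a (family, qualifier) key. */
    static final class ColumnRef {
        final byte[] family;
        final byte[] qualifier;

        ColumnRef(byte[] family, byte[] qualifier) {
            this.family = family;
            this.qualifier = qualifier;
        }

        // byte[] uses identity equals/hashCode, so compare contents explicitly.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ColumnRef)) return false;
            ColumnRef other = (ColumnRef) o;
            return Arrays.equals(family, other.family)
                && Arrays.equals(qualifier, other.qualifier);
        }

        @Override
        public int hashCode() {
            return 31 * Arrays.hashCode(family) + Arrays.hashCode(qualifier);
        }
    }

    /** Hypothetical stand-in for KeyValue: family, qualifier, and a value. */
    static final class Kv {
        final byte[] family;
        final byte[] qualifier;
        final String value;

        Kv(String family, String qualifier, String value) {
            this.family = bytes(family);
            this.qualifier = bytes(qualifier);
            this.value = value;
        }
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Builds the cf:cq index once; the first entry seen per column wins. */
    static Map<ColumnRef, Kv> index(List<Kv> kvs) {
        Map<ColumnRef, Kv> byColumn = new HashMap<>();
        for (Kv kv : kvs) {
            byColumn.putIfAbsent(new ColumnRef(kv.family, kv.qualifier), kv);
        }
        return byColumn;
    }

    public static void main(String[] args) {
        Map<ColumnRef, Kv> byColumn = index(Arrays.asList(
            new Kv("a", "x", "v1"), new Kv("a", "y", "v2"), new Kv("b", "x", "v3")));
        Kv hit = byColumn.get(new ColumnRef(bytes("a"), bytes("y")));
        System.out.println(hit == null ? "not found" : hit.value);
    }
}
```

The trade-off in this sketch is that the map has to be built for every tuple, and each lookup still needs a key object unless the caller already holds one.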

Thanks for pointing this out, @lhofhansl

lhofhansl commented 10 years ago

I did a test with a custom comparator that makes use of the fact that the JDK only passes the search term to the right side of the comparator during a binary search. That lets us (1) avoid creating a KV for each search, (2) avoid alternate instanceofs, and (3) inline the search condition into the comparator itself.

A microbenchmark shows that this cuts search time by 40-50% (for KVs with 30-byte rowKeys, 5-byte families, and 20-byte qualifiers). The savings should be greater for larger KVs.
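
A minimal sketch of the comparator idea, not the actual patch: `Cell` is a hypothetical stand-in for a KeyValue that carries only family and qualifier, and `SearchComparator` and `find` are illustrative names. It relies on `Arrays.binarySearch` calling `compare(element, key)`, which is how the JDK implements it but is an implementation detail rather than a documented guarantee.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

class SearchComparatorSketch {
    /** Hypothetical stand-in for a KeyValue: only family and qualifier. */
    static final class Cell {
        final byte[] family;
        final byte[] qualifier;

        Cell(String family, String qualifier) {
            this.family = bytes(family);
            this.qualifier = bytes(qualifier);
        }
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    /** Holds the search term itself, so no search Cell is ever allocated. */
    static final class SearchComparator implements Comparator<Cell> {
        private final byte[] family;
        private final byte[] qualifier;

        SearchComparator(byte[] family, byte[] qualifier) {
            this.family = family;
            this.qualifier = qualifier;
        }

        // The right-hand argument is the (null) search key and is ignored;
        // the element is compared directly against the stored cf:cq.
        @Override
        public int compare(Cell element, Cell ignoredKey) {
            int c = compareBytes(element.family, family);
            return c != 0 ? c : compareBytes(element.qualifier, qualifier);
        }
    }

    /** Same return contract as Arrays.binarySearch: index, or -(insertionPoint) - 1. */
    static int find(Cell[] sorted, byte[] family, byte[] qualifier) {
        return Arrays.binarySearch(sorted, null, new SearchComparator(family, qualifier));
    }

    public static void main(String[] args) {
        Cell[] cells = { new Cell("a", "x"), new Cell("a", "y"), new Cell("b", "x") };
        System.out.println(find(cells, bytes("a"), bytes("y")));
        System.out.println(find(cells, bytes("a"), bytes("z")));
    }
}
```

Because the comparator is not symmetric, it is only safe for this one-sided search and should not be reused for sorting.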

When I get time, I'll add that to Phoenix, test it, and send a pull request.

jtaylor-sfdc commented 10 years ago

Thanks, @lhofhansl. Looking forward to the patch.

jtaylor-sfdc commented 10 years ago

@lhofhansl implemented this through a custom comparator. Thanks for the contributions!