deephacks / lmdbjni

LMDB for Java
Apache License 2.0
204 stars 28 forks source link

Comparator support appears missing #27

Closed raybellis closed 9 years ago

raybellis commented 9 years ago

I need to be able to specify a custom comparator to be able to read data that is not stored in lexical order.

LevelDB's JNI interface has this, and I didn't spot that there's no apparent equivalent in LMDB JNI until just now :(

krisskross commented 9 years ago

I think its possible to expose this functionality. I'll have a closer look later today.

raybellis commented 9 years ago

Thanks very much! :) I'll be very happy to test, and help where I can. I have some familiarity with JNI, but not with hawtjni.

krisskross commented 9 years ago

Didn't find time to deliver a fix tonight but the solution seems straight forward.

One thing to note is that the comparator is called a huge number of times in any db operation which will degrade performance substantially. So using a non-native comparator may not be a good idea... depending on your use case of course :-)

http://www.openldap.org/lists/openldap-technical/201502/msg00117.html

raybellis commented 9 years ago

I get Howard's point, but leveldb's JNI has it, and LMDB's not having it is a bit of a deal breaker for my using LMDB with Java. In my case it's not particularly performance sensitive - I only need to do a single seek call to find the closest matching key and then iterate from that point onwards. Thanks again :)

krisskross commented 9 years ago

Sorry about the delay. I haven't been able to find time to work on this yet. Hopefully next week.

raybellis commented 9 years ago

Thanks for the update :)

krisskross commented 9 years ago

My initial idea was to let users provide a class which has a method signature which looks like this. The arguments are MDB_val pointers as specified by the LMDB API.

public static int compare(long ptr1, long ptr2) {
  return 1;
}

A pointer to this method can fetched using the following new method in JNI.java.

@JniMethod(flags={MethodFlag.JNI, MethodFlag.POINTER_RETURN}, cast="jmethodID")
public static final native long GetStaticMethodID(
      @JniArg(cast="jclass", flags={POINTER_ARG}) Class clazz,
      String name,
      String signature);

Users would register their comparator method/class using this new method on Database.java.

public void setComparator(Transaction tx, Class comparatorClass) {
    JNI.mdb_set_compare(tx.pointer(), this.pointer(), JNI.GetStaticMethodID(comparator, "compare", "(JJ)I"));
  }

However, LMDB crashes as soon as a second item is inserted into the database and I haven't figured out why this happens yet. I'm gonna read up a bit more on JNI.

raybellis commented 9 years ago

It's possible that LMDB simply doesn't like a comparator that always returns 1. BTW, did you happen to look at the levelDB JNI implementation? I suspect that it's very similar to what would be required for LMDB. Thanks again...

krisskross commented 9 years ago

The comparator is not called at all so there is something missing I think. I did have a look at LevelDB implementation but they do some C++ stuff which seems unnecessary if I understand correctly.

Thanks for your tips. I'll investigate further.

krisskross commented 9 years ago

Good news. I managed to get LMDB to call a comparator using hawtjni-callback which seems like the right way to go. I haven't tested it too much yet but it looks promising :-)

Give a few more days to nail this.

raybellis commented 9 years ago

Great - thanks again for your help on this. I'm very happy to help test it when I get back to the office on Monday if you put it on a branch.

krisskross commented 9 years ago

No problem. What OS are you using?

raybellis commented 9 years ago

I'm using OS X and Linux.

krisskross commented 9 years ago

A first commit suggestion. This is not final. I will do more testing myself, just wanted to put it out in the open. Please try it out and feel free to provide any feedback you have.

krisskross commented 9 years ago

The compare callback from LMDB provide memory addresses which is the wrong level of abstraction for users. My initial suggestion is to provide DirectBuffers to the comparator. However, this is also risky since users may screw up byte ordering, SIGSEGV memory addresses and thinking the implementation is buggy etc. One suggestion is to provide something safer similar to the BufferCursor.

What's your thoughts on this?

raybellis commented 9 years ago

For my own use case I'd be happy with a pair of byte[], although longer term I guess something that doesn't involve a copy would be preferable.

krisskross commented 9 years ago

Good call. A pair of byte arrays is the safest way to go. If we want to go faster we can always pull out zero-copy. I'll add a byte array method.

raybellis commented 9 years ago

That looks perfect, thanks! I'll give it a thorough test on Monday :)

krisskross commented 9 years ago

Sweet! Let me know how it goes. I'll test our supported platforms in the meantime and if everything is green i'll make a release end of next week.

raybellis commented 9 years ago

Initial tests look good, but I have another issue that I'll raise a new ticket on that's impeding my debugging :+1:

raybellis commented 9 years ago

This is working fine for me, thanks! :+1: