Closed GoogleCodeExporter closed 9 years ago
attached.
Original comment by alecflett@google.com
on 9 Jul 2012 at 7:11
Attachments:
FWIW, I don't know much about the Python C API, so I'm guessing I'm messing up
the object ownership here by holding onto the name in the comparator. I'm
providing this patch merely as a starting point.
Original comment by alecflett@google.com
on 9 Jul 2012 at 7:12
This is something along the lines I was hoping for.
I'll just have the comparator default to "bytewise", perhaps add a custom
example, and include instructions on how to create more.
Original comment by arnim...@gmail.com
on 9 Jul 2012 at 8:06
Implemented comparator name for opening databases. Will port this to RepairDB()
as well.
Original comment by arnim...@gmail.com
on 10 Jul 2012 at 12:50
Thanks for starting this - I'm still hoping to be able to specify a comparator
name, even if I get the bytewise comparator as a result - the problem is
leveldb will refuse to open the database if the names don't match exactly - I'm
trying to use pyleveldb to examine a leveldb created by a c++ program, and
until I fully port the comparator over (it has some gnarly dependencies) I'd
like to at least let pyleveldb open it (that's what I did with the patch I have
here)
Original comment by alecflett@chromium.org
on 10 Jul 2012 at 8:25
I may have misunderstood you. You just wanted a way to override the Name()
function for a given comparator (bytewise in your patch), right?
If you´re into that territory, why not just use the bytewise comparator, and
disable the name check? Hopefully you´re not interested in mutating the
database, or getting keys in any sensible order.
And I looked the the comparator from chrome IndexedDB. Good luck with that :)
Original comment by arnim...@gmail.com
on 11 Jul 2012 at 12:04
there's no way to disable the name check in leveldb (as best as I can tell
anyway) - at least not without patching it.. I'm not mutating the database,
just doing one big scan.
I could actually write comparator fairly trivially in Python if it were
possible to actually implement a comparator in pyleveldb .. I know that's a lot
more work though.
Original comment by alecflett@chromium.org
on 11 Jul 2012 at 12:09
that´s not a bed test case.
there are a couple of implementation details that need to be worked out:
comparator exception propogation and conditional GIL releasing. nothing too
serious. you´d just pass a callable object, and a name.
GIL can´t be released for mutation or reading, that´s about the only
difference.
Original comment by arnim...@gmail.com
on 11 Jul 2012 at 12:26
I´m close to a prototype of this, but have hit a snag.
The Compare() function can be called from pretty much any thread, including the
compaction thread. Calling the Python callback from there is not a problem.
The problem comes when the Python code raises an exception, and we´re left
with no comparison result. The obvious solution would be to raise a C++
exception, but I'm worried that it would break some invariants, e.g. not
releasing mutexes.
Returning an arbitrary comparison value, and just logging the error, might
leave the database in an inconsistent state, so that's not an option.
I'm gonna try the C++ exception solution, but I'm not optimistic that it will
work.
A solution of last resort would simply be to call abort(), after printing out
the Python stacktrace.
If anybody has other ideas, I'd very much like to hear them.
Original comment by arnim...@gmail.com
on 23 Jul 2012 at 11:30
I just committed a preliminary version supporting Python based comparators.
Any feedback appreciated.
I haven´t done extensive testing on this, but I´ll close the issue once I´m
confident the code works as advertised.
Original comment by arnim...@gmail.com
on 27 Jul 2012 at 12:03
great news! I'll try to get to this today. I've rewritten a lot of the
WebKit/IndexedDB comparator code in Python for other purposes, so it shouldn't
be too hard to hook up.
Original comment by alecflett@chromium.org
on 27 Jul 2012 at 4:56
ok, finally got to try this out. A few issues:
1) just passing the comparator name, means the code complains that you should
just pass a comparator name or a tuple. But a tuple works
2) I eventually get a segfault, after lots of successful comparisons - no stack
trace, just a hard crash. I verified that it wasn't my comparator throwing an
exception, either.
Original comment by alecflett@chromium.org
on 31 Jul 2012 at 5:22
Interesting. in my particular dataset, it very consistently crashes after 253
comparisons. (using db.RangeIter())
I sprinkled some printfs around, and the crash seems to be in the releasing of
the GIL inside PythonComparatorWrapper::Compare()
Original comment by alecflett@chromium.org
on 31 Jul 2012 at 5:43
Alright. I wrote a stress test to verify this, and succeeded in crashing it,
but it happens more randomly than for you test case.
Original comment by arnim...@gmail.com
on 31 Jul 2012 at 11:20
Think I figured this one out. I'm running a test-suite on a dual core machine,
but will get the chance to run it on a bigger machine tomrrow. Can you please
see if can break this?
Original comment by arnim...@gmail.com
on 1 Aug 2012 at 12:34
I just ran my code against this, wasn't able to make it crash at all, hitting a
ldb with 2800 rows... I'm guessing that was the fix.
Original comment by alecflett@chromium.org
on 1 Aug 2012 at 1:29
FWIW I'm on a 24 core machine... my script just goes through the ldb once
though, so it's just running for a few seconds.
Original comment by alecflett@chromium.org
on 1 Aug 2012 at 1:30
Original comment by arnim...@gmail.com
on 6 Aug 2012 at 11:24
Original issue reported on code.google.com by
alecflett@google.com
on 9 Jul 2012 at 7:10