wbolster / plyvel

Plyvel, a fast and feature-rich Python interface to LevelDB
https://plyvel.readthedocs.io/

Python thread error when using a custom comparator #35

Closed darkjh closed 10 years ago

darkjh commented 10 years ago

I've just encountered this problem when inserting a stream of data using a custom comparator. The code is something like this:

import json
import time

import plyvel

def str_int_comparator(a, b):
    a = int(a)
    b = int(b)
    if a < b:
        return -1
    if a > b:
        return 1
    return 0

db = plyvel.DB(db_path, create_if_missing=True,
               comparator=str_int_comparator,
               comparator_name='str_int_comparator')
stream = get_the_stream(...)

before = time.time()
wb = db.write_batch()

# generate a kv stream
for i, line in enumerate(stream):
    if i > 0 and i % 50000 == 0:
        wb.write()
        wb = db.write_batch()
        print "Inserted upto {} ...".format(i)
    wb.put(str(i), json.dumps(line))

# write the final, partially filled batch
wb.write()

after = time.time()
used = after - before

I got this after 100k elements were inserted:

Inserted upto 50000 ...
Inserted upto 100000 ...
Fatal Python error: This thread state must be current when releasing
Aborted

If I use the default comparator, everything goes well. I don't know the implementation, but it seems to me that the error occurs when LevelDB begins to compact files.

I'm using Plyvel 0.8, LevelDB 1.15.0 and Python 2.7.8.

wbolster commented 10 years ago

Thanks for the report. I'm trying it out, but I see something different happening...

Update: this particular comment should be ignored; see #36.

A slightly modified version of your code (only changed so that it actually runs), run on Python 3.3, LevelDB 1.17, and the latest Plyvel (git master), quickly allocates lots of memory and crashes like this:

$ python issue35.py db-issue-35
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

GDB shows this traceback:

(gdb) run issue35.py db-issue-35
Starting program: /home/uws/.virtualenvs/plyvel/bin/python issue35.py db-issue-35
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

Program received signal SIGABRT, Aborted.
0x00007ffff6af9407 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff6af9407 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff6afa7e8 in __GI_abort () at abort.c:89
#2  0x00007ffff6179be5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff6177c56 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007ffff6177ca1 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff6177eb9 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff61783d9 in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007ffff61d7bc9 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007ffff61d88ab in std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007ffff61d8954 in std::string::reserve(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007ffff61d8bdf in std::string::append(char const*, unsigned long) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#11 0x00007ffff645a2ec in leveldb::WriteBatch::Put(leveldb::Slice const&, leveldb::Slice const&) ()
   from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#12 0x00007ffff66afcd2 in __pyx_pf_6plyvel_7_plyvel_10WriteBatch_4put (__pyx_v_value=0xaed810, __pyx_v_key=0xae22e0, 
    __pyx_v_self=0x7ffff5cc7470) at plyvel/_plyvel.cpp:8527
#13 __pyx_pw_6plyvel_7_plyvel_10WriteBatch_5put (__pyx_v_self=0x7ffff5cc7470, __pyx_args=<optimized out>, 
    __pyx_kwds=<optimized out>) at plyvel/_plyvel.cpp:8407
#14 0x00000000004874ac in PyEval_EvalFrameEx ()
#15 0x00000000005a68cf in ?? ()
#16 0x00000000004693ed in PyRun_FileExFlags ()
#17 0x00000000004697ca in PyRun_SimpleFileExFlags ()
#18 0x000000000046be74 in Py_Main ()
#19 0x000000000047a858 in main ()
(gdb) quit

wbolster commented 10 years ago

With Python 2.7 I see the same as in your original report:

$ python issue35.py db-issue-35
Inserted upto 50000 ...
Inserted upto 100000 ...
Inserted upto 150000 ...
Inserted upto 200000 ...
Fatal Python error: This thread state must be current when releasing
Aborted

GDB:

(gdb) run issue35.py db-issue-35
Starting program: /home/uws/Projects/Plyvel/plyvel/.tox/py27/bin/python issue35.py db-issue-35
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Inserted upto 50000 ...
Inserted upto 100000 ...
[New Thread 0x7ffff5dce700 (LWP 21375)]
Fatal Python error: This thread state must be current when releasing

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff5dce700 (LWP 21375)]
sem_post () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_post.S:33
33  ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_post.S: No such file or directory.
(gdb) bt
#0  sem_post () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_post.S:33
#1  0x000000000052d299 in PyThread_release_lock ()
#2  0x00007ffff6b205d6 in PlyvelCallbackComparator::Compare (this=0x9f3e80, a=..., b=...) at plyvel/comparator.cpp:85
#3  0x00007ffff68cd5ea in leveldb::SomeFileOverlapsRange(leveldb::InternalKeyComparator const&, bool, std::vector<leveldb::FileMetaData*, std::allocator<leveldb::FileMetaData*> > const&, leveldb::Slice const*, leveldb::Slice const*) ()
   from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#4  0x00007ffff68cdc76 in leveldb::Version::PickLevelForMemTableOutput(leveldb::Slice const&, leveldb::Slice const&)
    () from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#5  0x00007ffff68bbbea in leveldb::DBImpl::WriteLevel0Table(leveldb::MemTable*, leveldb::VersionEdit*, leveldb::Version*) () from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#6  0x00007ffff68bcdea in leveldb::DBImpl::CompactMemTable() () from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#7  0x00007ffff68bdcd5 in leveldb::DBImpl::BackgroundCompaction() () from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#8  0x00007ffff68be932 in leveldb::DBImpl::BackgroundCall() () from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#9  0x00007ffff68e9f5b in ?? () from /usr/lib/x86_64-linux-gnu/libleveldb.so.1
#10 0x00007ffff7bc70a4 in start_thread (arg=0x7ffff5dce700) at pthread_create.c:309
#11 0x00007ffff6fdc04d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) quit

wbolster commented 10 years ago

I found the source of the problem. I'll prepare a fix shortly.

darkjh commented 10 years ago

@wbolster that's cool, thanks for your quick response. If you have the time, could you explain the bug a little?

wbolster commented 10 years ago

The problem is that the custom comparator is called from a background thread created internally inside LevelDB. Even if the application itself doesn't use threading, the Python threading machinery still needs to be initialized. Plyvel didn't do this.

See the commit message (and diff) for 635777d for details.

Oh, and since the unit tests use the threading module, which takes care of initialising the GIL automatically, this issue does not appear there.
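
For anyone stuck on an affected Plyvel version (before 0.9), the remark about the threading module suggests a possible workaround: start a short-lived Python thread before opening the database, so that the interpreter sets up its threading machinery first. This is only a sketch under that assumption, not something verified in this thread; the helper name `ensure_threads_initialised` is made up for illustration, and upgrading Plyvel is the actual fix.

```python
import threading


def ensure_threads_initialised():
    """Start and join a no-op thread (hypothetical helper).

    Assumption: creating a thread makes the interpreter initialise its
    threading machinery, as the threading-based unit tests appear to do.
    """
    worker = threading.Thread(target=lambda: None)
    worker.start()
    worker.join()


# Call this once, before opening a database with a Python comparator.
ensure_threads_initialised()
```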

wbolster commented 10 years ago

@darkjh btw, a comparator implemented in Python is really slow compared to the built-in comparator in LevelDB. Are you sure you can't pack your data in such a way that the default LevelDB behaviour matches your needs? In the case of integer keys, something like struct.pack('>Q', value) would work up to 64-bit numbers.
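
A minimal sketch of the `struct.pack('>Q', value)` idea, assuming all keys are non-negative integers that fit in 64 bits. Big-endian, fixed-width encoding makes the byte order of the keys match their numeric order, so the default bytewise comparator sorts them correctly. The helper names `encode_int_key` and `decode_int_key` are made up for illustration.

```python
import struct


def encode_int_key(value):
    # '>Q' is a big-endian unsigned 64-bit integer: always 8 bytes,
    # most significant byte first, so bytewise order equals numeric order.
    return struct.pack('>Q', value)


def decode_int_key(key):
    return struct.unpack('>Q', key)[0]


values = [100000, 9, 10, 2, 50000]

# Sorting the encoded keys as raw bytes gives numeric order.
encoded = sorted(encode_int_key(v) for v in values)
decoded = [decode_int_key(k) for k in encoded]

# For contrast: decimal strings sort lexicographically, not numerically.
as_text = sorted(str(v) for v in values)
```

With keys encoded this way, `wb.put(encode_int_key(i), ...)` needs no custom comparator at all.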

wbolster commented 10 years ago

Fwiw, I just released Plyvel 0.9, which includes this fix. See https://plyvel.readthedocs.org/en/latest/news.html for the changelog.

darkjh commented 10 years ago

Thanks for the fix. What I'm trying to do is a kind of external sort routine for a stream of data using LevelDB, and I want to support a list/tuple of strings/ints as the sort key. Yes, I've found out myself that a custom comparator is slow, even if it's coded in Cython. As you said, I'm trying to find a way of encoding the sort key into a string in an order-preserving way.

wbolster commented 10 years ago

For integers that have a known maximum value, something like ">Q" with struct works fine. Strings also work fine, provided the string is the last component of the key; otherwise, pad it to a fixed length.
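
One way this advice could look for tuple keys is sketched below. It rests on several assumptions that are not from this thread: every key has the same component types in the same positions, integers are non-negative and fit in 64 bits, strings contain no NUL characters, and non-final strings fit in a chosen width. The function name `encode_key` and the `str_width` parameter are made up for illustration.

```python
import struct


def encode_key(components, str_width=16):
    """Encode a tuple of ints and strings into order-preserving bytes."""
    parts = []
    last = len(components) - 1
    for i, component in enumerate(components):
        if isinstance(component, int):
            # Fixed-width big-endian integers compare correctly as bytes.
            parts.append(struct.pack('>Q', component))
        else:
            raw = component.encode('utf-8')
            if i != last:
                # A string followed by more components must be padded to a
                # fixed length so the next component starts at a known offset.
                if len(raw) > str_width:
                    raise ValueError('string component too long to pad')
                raw = raw.ljust(str_width, b'\x00')
            parts.append(raw)
    return b''.join(parts)


keys = [('apple', 10), ('apple', 2), ('b', 1), ('app', 300)]

# Sorting by the encoded bytes matches ordinary tuple ordering.
by_encoding = sorted(keys, key=encode_key)
```

Padding with NUL bytes keeps shorter strings ahead of longer ones that share the same prefix, which is why the sketch rules out NUL characters inside the strings themselves.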