vhbit / lmdb-rs

Rust bindings for LMDB
MIT License

-O3 and SEGV #40

Closed ArtemGr closed 7 years ago

ArtemGr commented 7 years ago

Just a heads up in case somebody else is struggling with the same issue.

After a recent upgrade I started seeing crashes in LMDB. They looked like this:

#0  0x00007fad06517cfc in mdb_cursor_put () from /usr/local/lib/libbyzon.so
#1  0x00007fad0651af53 in mdb_put () from /usr/local/lib/libbyzon.so
#2  0x00007fad06506698 in set_value_with_flags (self=<optimized out>, db=<optimized out>, key=...,
    value=..., flags=<optimized out>)
    at /home/grank/.cargo/registry/src/github.com-1ecc6299db9ec823/lmdb-rs-0.7.2/src/core.rs:1069
#3  set_value (self=<optimized out>, db=<optimized out>, key=..., value=...)
    at /home/grank/.cargo/registry/src/github.com-1ecc6299db9ec823/lmdb-rs-0.7.2/src/core.rs:1061
#4  set (self=<optimized out>, db=2, key=..., value=...)
    at /home/grank/.cargo/registry/src/github.com-1ecc6299db9ec823/lmdb-rs-0.7.2/src/core.rs:1079
#5  lmdb_rs::core::{{impl}}::set (self=<optimized out>, key=..., value=...)
    at /home/grank/.cargo/registry/src/github.com-1ecc6299db9ec823/lmdb-rs-0.7.2/src/core.rs:414

and only happened in release builds. I tried a couple of things, such as adding global locks around LMDB operations, and I upgraded LMDB, but to no avail.

Finally I lowered the optimization level in order to debug the issue better and, voila, the problem vanished.

My conjecture is that LMDB is normally tested with -O2 optimization (cf), but the Rust package uses -O3 (cf), a level that is not supported by the LMDB code base and that triggers some kind of bug.

vhbit commented 7 years ago

@ArtemGr first of all, sorry for a long delay. Maybe that's related to changes to the Rust compiler and the migration to MIR. Can you provide a shorter test case for your issue? For example, a single write to DB with type you're trying to store?

ArtemGr commented 7 years ago

@vhbit

first of all, sorry for a long delay

NP! I wasn't waiting for anything, as I have found the solution/workaround already.

Can you provide a shorter test case for your issue? For example, a single write to DB with type you're trying to store?

I don't really have much of an issue. I've lowered the C optimization level to -O2, which is the normal level for LMDB, and the library is working without problems.

As a C++ programmer with almost two decades of experience, I know first-hand that a lot of C libraries fail and crash in unexpected ways when built with an optimization level higher than the one used by their developers. It's normal. Hunting those kinds of bugs takes time, especially when they happen sporadically and in the middle of something complex. I don't see a reason to fight for -O3 to work when the LMDB developers themselves chose to stay with -O2.

So no, sorry, I won't be working on a test case.

I will be happy with you either closing this issue (the point of it being a heads-up for anyone who might have a similar problem; with a bit of searching they should be able to find it in the issue archives) or maybe even considering lowering the C optimization to -O2 to match the LMDB defaults.
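
For anyone wanting to try the same workaround, here is a minimal sketch of how a build script could cap the C optimization level at -O2. It is not taken from lmdb-rs: the helper name `capped_c_opt_flag` is hypothetical, and the only external fact it relies on is that Cargo passes the active profile's optimization level to build scripts in the `OPT_LEVEL` environment variable.

```rust
// Hypothetical build-script helper; not the actual lmdb-rs code.
// Cargo sets OPT_LEVEL for build scripts to "0", "1", "2", "3", "s" or "z".

/// Map Cargo's OPT_LEVEL to a C compiler flag, never going above -O2.
fn capped_c_opt_flag(cargo_opt_level: &str) -> &'static str {
    match cargo_opt_level {
        "0" => "-O0",
        "1" => "-O1",
        // Size-optimized profiles: use -Os for both, since older GCC
        // versions do not accept -Oz.
        "s" | "z" => "-Os",
        // "2", "3" and anything unrecognized fall back to -O2,
        // the level this thread says LMDB is normally built with.
        _ => "-O2",
    }
}

fn main() {
    // Outside of Cargo the variable is unset; assume an unoptimized build then.
    let level = std::env::var("OPT_LEVEL").unwrap_or_else(|_| "0".to_string());
    println!("C optimization flag: {}", capped_c_opt_flag(&level));
}
```

In a real build script the returned flag would be handed to whatever compiles the C sources instead of being printed. The Rust side of the crate can still be built at the default release level, since only the flag given to the C compiler changes.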

vhbit commented 7 years ago

@ArtemGr, well, the thing is that it's not clear whether it's a problem in LMDB itself or in the Rust bindings. Maybe it's the lmdb-rs code built with -O3 that causes problems, and that should be fixable.

May I ask you to check if using lmdb-rs from sys-opt-level + enabling back default release optimizations for your lib/bin solves the problem? If it doesn't, it will indicate that it's an lmdb-rs problem.

ArtemGr commented 7 years ago

May I ask you to check if using lmdb-rs from sys-opt-level + enabling back default release optimizations for your lib/bin solves the problem? If it doesn't, it will indicate that it's an lmdb-rs problem.

This is what I did in the first place. And I gave a link to a similar patch in the very first message of the issue, except I'm using "-Og" there.

Okay, I switched to sys-opt-level. Will write back to you if I suddenly have any crashes.

vhbit commented 7 years ago

@ArtemGr thanks!

vhbit commented 7 years ago

@ArtemGr so how does it perform? Any crashes yet?

ArtemGr commented 7 years ago

None.

vhbit commented 7 years ago

@ArtemGr 0.7.4 has the fix, hope it will never resurface.

ArtemGr commented 7 years ago

0.7.4 has the fix

Great! You've made the right choice.

hope it will never resurface

LMDB is not the most reliable technology; I've seen it crash on certain usage patterns back in my C/C++ days. That particular bug has since been fixed, but it's not the only one I've seen on their mailing list, and the presence of such bugs in the past is often a predictor that they are more likely to happen in the future. So, in my world, using LMDB is a risk. I would've used a different database if there was one with the inter-process profile of LMDB. (Unfortunately, most other fast embedded KVs are single-writer-process only.)

But with -O2 we are at least having the same experience as the rest of the LMDB userbase and not playing a herd of cows sent to demine a minefield.