bmatsuo / lmdb-go

Bindings for the LMDB C library
BSD 3-Clause "New" or "Revised" License
157 stars 59 forks source link

lmdb: add fdatasync and lmdb_debug build tags #76

Closed lmb closed 8 years ago

lmb commented 8 years ago

Enabling debugging requires a code change in mdb.c: mdb_debug is checked by the DPRINTF call, but never set in the codebase. Set mdb_debug to 1 statically until we figure out what is going on.

lmb commented 8 years ago

P.S. It's also possible to set MDB_DEBUG to 2 instead of one, which makes output more verbose. Maybe that would be worthwhile to add?

bmatsuo commented 8 years ago

Interesting. I do not want to make changes to mdb.c though. It will make updating the library more difficult and error prone.

It seems like the intention is for the programmer to set/clear mdb_debug and mdb_debug_start as extern variables in the application code. I assume that would be so that debug printing can be selectively enabled.

It also seems like MDB_DEBUG can be set as high as 3, which enables a bookkeeping audit on each commit. Though I don't very much like the idea of defining 3 build tags.

And I'm not totally convinced that the MDB_DEBUG stuff is very useful in the first place. If you are trying to nail down a bug in LMDB it might be a better idea to put together a minimal C program that can reproduce the error. The odds of getting something investigated by the openldap-technical mailing list are probably higher that way.

The MDBFDATASYNC stuff seems OK. Though I'm wondering if its a better practice to prefix build tags with "lmdb". I'm not sure... I'm not really aware of build tag naming conventions.

lmb commented 8 years ago

No, I think the debug mode is simply broken. mdb_debug is defined as static, which means it is private to that compilation unit. Happy to remove the debug flags though.

Also happy to rename the flags. Do you want pwritev changed was well?

bmatsuo commented 8 years ago

Zot! I did not notice the static qualifier. I guess I just didn't expect it to be there given how things were coded. Good catch. Indeed I do not think it's a good idea to support MDB_DEBUG for now. This situation may be worth bringing up on the openldap-technical mailing list. But maybe the static qualifier is just meant to signal that it truly an internal mechanism for LMDB developers to use. (edit: no, I'm not sure that makes any sense)

Because the pwritev tag already exists it's probably OK to just leave it as is for now. It really merely signals "this system has the pwritev system call", so leaving it as a general flag is probably not terribly harmful, though in some future release I might decide to change the tag (with the appropriate advanced notice).

AFAICT that is not the case for the fdatasync flags. I am having a hard time finding information about the kernel bugs which caused MDB_FDATASYNC_WORKS to be defined in the first place. The C code references the following URLs which do not show me any information about the bug/fixes.

I have to assume that there was a bug in fdatasync that specifically affected LMDB (instead of fdatasync being entirely broken and unusable). So I think it makes the most sense if the build tag for this to have an lmdb_ prefix.

How does that sound @lmb?

lmb commented 8 years ago

I did some digging:

Basically, fdatasync on ext3/ext4 was broken in certain cases. This is not specific to LMDB, but it leads to the durability guarantee going out of the window.

It is safe to use fdatasync in theses cases:

  1. You are not using LMDB on ext3/ext4
  2. You are on a kernel > 3.6-rc6 / that has the mentioned patch applied.

PS. I propose to add those two qualifications to the README, and call the flag fdatasync to be consistent.

lmb commented 8 years ago

Bump?

bmatsuo commented 8 years ago

Sorry for the delay. I didn't realize you were waiting for input from me. The proposed changes to the README sound good. And you still need to remove the MDB_DEBUG stuff, yes? Go ahead and do that. But I'm not ready to settle on the proposed name.

I think the name of the build tag should be prefixed with "lmdb" (so "lmdbfdatasync"). After looking at several other projects like etcd, I think it's the right choice to leave off the underscore '_'. It may end up causing people to declare seemingly redundant tags in rare cases, but I think it avoids more confusion than it causes. Because lmdb is a library I think it makes sense to namespace its tags so they don't collide in undesirable ways with the application or with other packages. I'm not sure how exactly that would happen for this tag but I would rather not really think about it now or in the future with other C build flags and the corresponding flags they would need.

It at least seems possible to me that some applications (or even libraries) using fdatasync on ext3/ext4 would not be affected by the bug you posted. If my understanding of the bug description is correct, applications which are guaranteed to write more than one page of data between calls to fdatasync would not experience in inconsistent flushing of data and metadata pages. Does that sound correct to you?

lmb commented 8 years ago

Had another look at the source code. Without the fdatasync flag LMDB does a runtime check to figure out what the right syscall is. I don't think its worth it optimizting that away (which is what the compile time flag does).

Closing this PR.