meilisearch / heed

A fully typed LMDB wrapper with minimum overhead 🐦
https://docs.rs/heed
MIT License
604 stars 55 forks source link

Opening of database read-only must be followed by a RoTxn commit calls #183

Open darnuria opened 1 year ago

darnuria commented 1 year ago

Getting to data a Database opened with with Env::open_database on a process without a previous Env opened as Read Write held by the process lead to error "22 EINVAL", due to Env metadata not sync'ed in the Env at the end of the RoTxn openning transaction.

Steps to reproduce

Then

How to fix?

Committing the RoTxn that open of databases fix the issue. LMDB log an empty commit but not so empty! Why it fixes? The committing like said below synchronize the lmdb Env internal data about opened databases with the currents openings done with the RoTxn, the RoTxn is Read only but only in respect to the data in the databases not the Env in memory or the process.

Possible work around? Force RwTxn for database opening, maybe not desirable.

How it escaped Heed test suite?

The problem arise if the database is opened without prior Env about that database opened previously.

Context

I am in a setup with one process doing writes and read call it the Producer (no problem it's perfect) and an other doing only read the Reader.

The issue also apply when doing read-only access without other Env held inside the process.

In both case used the open_database

It leads to the following error:

Io(Os { code: 22, kind: InvalidInput, message: "Invalid argument" })'

The investigation

mdb_get checkout the txn->mt_numdbs on the reader it will be 2 because it's the stock value due to DB_CORE LMDB macro.

mdb_get L6821

    if (!key || !data || !TXN_DBI_EXIST(txn, dbi, DB_USRVALID))
        return EINVAL;

define TXN_DBI_EXIST

    /** Check \b txn and \b dbi arguments to a function */
#define TXN_DBI_EXIST(txn, dbi, validity) \
    ((txn) && (dbi)<(txn)->mt_numdbs && ((txn)->mt_dbflags[dbi] & (validity)))

define CORE_DBS

    /** Number of DBs in metapage (free and main) - also hardcoded elsewhere */
#define CORE_DBS    2

Where is set txn->mt_numdbs.

Here at least for my code-path in the reader: mdb_txn_renew0

    /* Setup db info */
    txn->mt_numdbs = env->me_numdbs;
    for (i=CORE_DBS; i<txn->mt_numdbs; i++) {
        x = env->me_dbflags[i];
        txn->mt_dbs[i].md_flags = x & PERSISTENT_FLAGS;
        txn->mt_dbflags[i] = (x & MDB_VALID) ? DB_VALID|DB_USRVALID|DB_STALE : 0;
    }
    txn->mt_dbflags[MAIN_DBI] = DB_VALID|DB_USRVALID;
    txn->mt_dbflags[FREE_DBI] = DB_VALID;

Ok but it should be right no? env->me_numdbs shall have the right value 3 here! It turns out the open_dbi does not update it, it updates the txn->mt_numdbs and on commit it will update the env->me_numdbs back!! ;)

Leading us to this little piece of code: mdb_dbis_update

    if (keep && env->me_numdbs < n)
        env->me_numdbs = n;

Called back mdb_txn_end

static void
mdb_txn_end(MDB_txn *txn, unsigned mode)
{
    MDB_env    *env = txn->mt_env;
#if MDB_DEBUG
    static const char *const names[] = MDB_END_NAMES;
#endif

    /* Export or close DBI handles opened in this txn */
    mdb_dbis_update(txn, mode & MDB_END_UPDATE);

And since RoTxn only expose mdb_txn_abort we cannot update the database number of the environment.

void
mdb_txn_abort(MDB_txn *txn)
{
    if (txn == NULL)
        return;

    if (txn->mt_child)
        mdb_txn_abort(txn->mt_child);

    mdb_txn_end(txn, MDB_END_ABORT|MDB_END_SLOT|MDB_END_FREE);
}

RoTxn::drop


impl Drop for RoTxn<'_> {
    fn drop(&mut self) {
        if !self.txn.is_null() {
            abort_txn(self.txn);
        }
    }
}

Using mdb_txn_commit even empty looks definitely better since it set correctly the parameters for synchronizing database openned to the Env LMDB struct.

Screenshot of the debugging session

Here a screenshot of a debugging session showing the issue:

In this session I opened 1 database (+ the 2 code DB managed by lmdb itself) so n = 3. image

And since RoTxn only mdb_txn_abort we cannot update the database number of the environment.

The logs

Added the following patch for debugging lmdb and producing the log

diff --git a/libraries/liblmdb/Makefile b/libraries/liblmdb/Makefile
index 942d0db..142989a 100644
--- a/libraries/liblmdb/Makefile
+++ b/libraries/liblmdb/Makefile
@@ -22,7 +22,7 @@ CC = gcc
 AR  = ar
 W   = -W -Wall -Wno-unused-parameter -Wbad-function-cast -Wuninitialized
 THREADS = -pthread
-OPT = -O2 -g
+OPT = -O0 -g
 CFLAGS  = $(THREADS) $(OPT) $(W) $(XCFLAGS)
 LDLIBS  = 
 SOLIBS  = 
diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
index 26b7cba..a3a40da 100644
--- a/libraries/liblmdb/mdb.c
+++ b/libraries/liblmdb/mdb.c
@@ -566,6 +566,7 @@ typedef MDB_ID   pgno_t;
      *  See struct MDB_txn.mt_txnid for details.
      */
 typedef MDB_ID  txnid_t;
+#define MDB_DEBUG 3

 /** @defgroup debug Debug Macros
  *  @{
@@ -575,12 +576,12 @@ typedef MDB_ID txnid_t;
      *  Set this to 1 for copious tracing. Set to 2 to add dumps of all IDLs
      *  read from and written to the database (used for free space management).
      */
-#define MDB_DEBUG 0
+#define MDB_DEBUG 3
 #endif

 #if MDB_DEBUG
-static int mdb_debug;
-static txnid_t mdb_debug_start;
+static int mdb_debug = 1;
+static txnid_t mdb_debug_start = 1;

     /** Print a debug message with printf formatting.
      *  Requires double parenthesis around 2 or more args.
@@ -6817,6 +6818,7 @@ mdb_get(MDB_txn *txn, MDB_dbi dbi,
     DKBUF;

     DPRINTF(("===> get db %u key [%s]", dbi, DKEY(key)));
+    DPRINTF(("READ txn %"Yu" mt_numdbs %u ", txn->mt_txnid, (txn)->mt_numdbs));

     if (!key || !data || !TXN_DBI_EXIST(txn, dbi, DB_USRVALID))
         return EINVAL;

Full logs including LMDB logs

cargo test -- env::tests::open_already_existing_database_multiprocess

   Compiling heed v0.20.0-alpha.2 (/heed/heed)
    Finished test [unoptimized + debuginfo] target(s) in 0.54s
     Running unittests src/lib.rs (target/debug/deps/heed-9bbbf71c6c2f5ce8)

running 1 test
mdb_env_open2:4955 new mdbenv
mdb_env_init_meta:4266 writing new meta page
mdb_env_open2:5033 opened database version 1, pagesize 4096
mdb_env_open2:5035 using meta page 0
mdb_env_open2:5036 depth: 0
mdb_env_open2:5037 entries: 0
mdb_env_open2:5038 branch pages: 0
mdb_env_open2:5039 leaf pages: 0
mdb_env_open2:5040 overflow pages: 0
mdb_env_open2:5041 root: 18446744073709551615
mdb_env_open:5669 opened dbenv 0x7f43e8000ed0
mdb_txn_begin:3283 begin txn 1w 0x7f43e8002750 on mdbenv 0x7f43e8000ed0, root page 18446744073709551615
mdb_page_search:6655 tree is empty
mdb_cursor_put:7637 ==> put db 1 key [73686172656430], size 7, data size 48
mdb_cursor_put:7698 allocating new root leaf page
mdb_page_new:8235 allocated new mpage 2, page size 4096
mdb_cursor_push:6114 pushing page 2 on db 1 cursor 0x7f43ef762840
mdb_cursor_put:7720 inserting key at index 0
mdb_node_add:8336 add to leaf page 2 index 0, data size 48 key size 7 [73686172656430]
mdb_txn_commit:4096 committing txn 1 0x7f43e8002750 on mdbenv 0x7f43e8000ed0, root page 2
mdb_cursor_put:7637 ==> put db 1 key [73686172656430], size 7, data size 48
mdb_page_search:6680 db 1 root page 2 has flags 0x12
mdb_page_search_root:6565 found leaf page 2 for key [73686172656430]
mdb_node_search:6000 searching 1 keys in leaf page 2
mdb_node_search:6045 found leaf index 0 [73686172656430], rc = 0
mdb_cursor_set:7269 ==> cursor placed on key [73686172656430]
mdb_page_search:6655 tree is empty
mdb_page_flush:3861 committing page 2
mdb_page_flush:3792 committing page 2
mdb_env_write_meta:4315 writing meta page 1 for root page 2
mdb_txn_end:3350 committed txn 1w 0x7f43e8002750 on mdbenv 0x7f43e8000ed0, root page 2
mdb_txn_begin:3283 begin txn 2w 0x7f43e8002750 on mdbenv 0x7f43e8000ed0, root page 2
mdb_page_search:6680 db 1 root page 2 has flags 0x2
mdb_page_search_root:6565 found leaf page 2 for key [73686172656430]
mdb_node_search:6000 searching 1 keys in leaf page 2
mdb_node_search:6045 found leaf index 0 [73686172656430], rc = 0
mdb_page_search:6655 tree is empty
mdb_cursor_put:7637 ==> put db 2 key [73686172656430], size 7, data size 17
mdb_cursor_put:7698 allocating new root leaf page
mdb_page_new:8235 allocated new mpage 3, page size 4096
mdb_cursor_push:6114 pushing page 3 on db 2 cursor 0x7f43ef7628b0
mdb_cursor_put:7720 inserting key at index 0
mdb_node_add:8336 add to leaf page 3 index 0, data size 17 key size 7 [73686172656430]
mdb_txn_commit:4096 committing txn 2 0x7f43e8002750 on mdbenv 0x7f43e8000ed0, root page 2
mdb_cursor_put:7637 ==> put db 1 key [73686172656430], size 7, data size 48
mdb_page_search:6680 db 1 root page 2 has flags 0x2
mdb_page_search_root:6565 found leaf page 2 for key [73686172656430]
mdb_node_search:6000 searching 1 keys in leaf page 2
mdb_node_search:6045 found leaf index 0 [73686172656430], rc = 0
mdb_cursor_set:7269 ==> cursor placed on key [73686172656430]
mdb_page_touch:2766 touched db 1 page 2 -> 4
mdb_page_search:6655 tree is empty
mdb_cursor_put:7637 ==> put db 0 key [0200000000000000], size 8, data size 16
mdb_cursor_put:7698 allocating new root leaf page
mdb_page_new:8235 allocated new mpage 5, page size 4096
mdb_cursor_push:6114 pushing page 5 on db 0 cursor 0x7f43ef762b70
mdb_cursor_put:7720 inserting key at index 0
mdb_node_add:8336 add to leaf page 5 index 0, data size 16 key size 8 [0200000000000000]
mdb_freelist_save:3582 IDL write txn 2 root 5 num 1
mdb_freelist_save:3585 IDL 2
mdb_page_search:6680 db 0 root page 5 has flags 0x12
mdb_page_search_root:6565 found leaf page 5 for key [null]
mdb_cursor_next:6946 cursor_next: top page is 5 in cursor 0x7f43ef762970
mdb_cursor_next:6954 =====> move to next sibling page
mdb_page_flush:3861 committing page 3
mdb_page_flush:3861 committing page 4
mdb_page_flush:3861 committing page 5
mdb_page_flush:3792 committing page 5
mdb_env_write_meta:4315 writing meta page 0 for root page 4
mdb_txn_end:3350 committed txn 2w 0x7f43e8002750 on mdbenv 0x7f43e8000ed0, root page 4
mdb_env_open2:5033 opened database version 1, pagesize 4096
mdb_env_open2:5035 using meta page 0
mdb_env_open2:5036 depth: 1
mdb_env_open2:5037 entries: 1
mdb_env_open2:5038 branch pages: 0
mdb_env_open2:5039 leaf pages: 1
mdb_env_open2:5040 overflow pages: 0
mdb_env_open2:5041 root: 4
mdb_env_open:5669 opened dbenv 0x7f43e8000ed0
mdb_txn_begin:3283 begin txn 2r 0x7f43e8003350 on mdbenv 0x7f43e8000ed0, root page 4
mdb_page_search:6680 db 1 root page 4 has flags 0x2
mdb_page_search_root:6565 found leaf page 4 for key [73686172656430]
mdb_node_search:6000 searching 1 keys in leaf page 4
mdb_node_search:6045 found leaf index 0 [73686172656430], rc = 0
mdb_cursor_set:7269 ==> cursor placed on key [73686172656430]
mdb_txn_end:3350 abort txn 2r 0x7f43e8003350 on mdbenv 0x7f43e8000ed0, root page 4
mdb_txn_begin:3283 begin txn 2r 0x7f43e8003350 on mdbenv 0x7f43e8000ed0, root page 4
mdb_get:6820 ===> get db 2 key [73686172656430]

mdb_get:6821 READ txn 2 mt_numdbs 2  

mdb_txn_end:3350 abort txn 2r 0x7f43e8003350 on mdbenv 0x7f43e8000ed0, root page 4
test env::tests::open_already_existing_database_multiprocess ... FAILED

failures:

---- env::tests::open_already_existing_database_multiprocess stdout ----
thread 'env::tests::open_already_existing_database_multiprocess' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Os { code: 22, kind: InvalidInput, message: "Invalid argument" })', heed/src/env.rs:985:65
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

failures:
    env::tests::open_already_existing_database_multiprocess

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 16 filtered out; finished in 0.00s

error: test failed, to rerun pass `-p heed --lib`
darnuria commented 1 year ago

@Kerollmops worked like a charm on pre-production for us. :)