bmatsuo / lmdb-go

Bindings for the LMDB C library
BSD 3-Clause "New" or "Revised" License
158 stars 59 forks source link

Full table scan does not seem to have latest set of write operations. #122

Closed prataprc closed 6 years ago

prataprc commented 6 years ago

Firstly, thanks for the awesome lmdb storage engine.

I am trying to use lmdb as the reference algorithm for some of the experiments that I am doing. One of them involve an left-leaning-red-black (LLRB) tree implementation.

This particular issue keep coming up with dbtest/ program where it does randomized testing of LLRB API, using LMDB as reference. It is easy to reproduce:

$ go get github.com/bnclabs/dbtest
$ cd <dbtest>
$ go get ./...

// To test llrb with lmdb as reference.
$ ./llrb.sh

llrb.sh invokes dbtest as:

$ ./dbtest -db llrb -load 1000000 -writes 4000000 -lsm

This command loads 1 million key,value entries (keysize:32 bytes, valuesize: 32 bytes) into llrb instance and lmdb instance. And then starts concurrent writer and reader routines that will access the llrb and lmdb instance and validate the result. This test will complete after executing 4 million write operations.

Meanwhile, there is validator routine that wakes up every 10 second, block all write operations, and execute a full table iteration on both llrb and lmdb instance, to check both indexes contain same set of entries and identically sorted. This full-table comparision is done by compareLlrbLmdb() function:

func compareLlrbLmdb( index *llrb.LLRB, lmdbenv *lmdb.Env, lmdbdbi lmdb.DBI)

Note that compareLlrbLmdb() uses lmdbscan/ sub-package to do full-table comparision. Here is the o/p of compareLlrbLmdb() function:

compareLlrbLmdb, lmdbcount:1038940 llrbcount:1046704 seqno:1093263
"00000000000000000000000000000555l\x00\x00\x00\x00\r\xa13" 893235 false true
"00000000000000000000000000000553l\x00\x00\x00\x00\x01\x04\x9e"
panic: expected "00000000000000000000000000000555","00000000000000000000000000000555l\x00\x00\x00\x00\r\xa13", got "00000000000000000000000000000553","00000000000000000000000000000553l\x00\x00\x00\x00\x01\x04\x9e" [recovered]
        panic: expected "00000000000000000000000000000555","00000000000000000000000000000555l\x00\x00\x00\x00\r\xa13", got "00000000000000000000000000000553","00000000000000000000000000000553l\x00\x00\x00\x00\x01\x04\x9e"

goroutine 10 [running]:
main.llrbvalidator.func3(0xc4201bbeb0)
        /Users/prataprc/myworld/devgo/src/github.com/bnclabs/dbtest/llrb.go:177 +0x5c
panic(0x4424ce0, 0xc420a3d370)
        /usr/local/Cellar/go/1.9/libexec/src/runtime/panic.go:491 +0x283
main.compareLlrbLmdb.func1(0xc420193600, 0x0, 0x0)
        /Users/prataprc/myworld/devgo/src/github.com/bnclabs/dbtest/compare.go:55 +0xe3a
github.com/bmatsuo/lmdb-go/lmdb.(*Txn).runOpTerm(0xc420193600, 0xc420a32660, 0x0, 0x0)
        /Users/prataprc/myworld/devgo/src/github.com/bmatsuo/lmdb-go/lmdb/txn.go:158 +0x7c
github.com/bmatsuo/lmdb-go/lmdb.(*Env).run(0xc42007f920, 0x4013100, 0x20000, 0xc420a32660, 0x0, 0x0)
        /Users/prataprc/myworld/devgo/src/github.com/bmatsuo/lmdb-go/lmdb/env.go:511 +0x129
github.com/bmatsuo/lmdb-go/lmdb.(*Env).View(0xc42007f920, 0xc420a32660, 0x46f93e0, 0x3)
        /Users/prataprc/myworld/devgo/src/github.com/bmatsuo/lmdb-go/lmdb/env.go:451 +0x43
main.compareLlrbLmdb(0xc4200c46e0, 0xc42007f920, 0x2)
        /Users/prataprc/myworld/devgo/src/github.com/bnclabs/dbtest/compare.go:30 +0x26a
main.llrbvalidator.func2.1(0xc4200c46e0, 0xc42007f920, 0xc400000002, 0xc4201bbee0, 0xc4201bbe90)
        /Users/prataprc/myworld/devgo/src/github.com/bnclabs/dbtest/llrb.go:166 +0x7d
main.llrbvalidator.func2()
        /Users/prataprc/myworld/devgo/src/github.com/bnclabs/dbtest/llrb.go:170 +0x19d
main.llrbvalidator(0xc42007f920, 0x100000002, 0x15122f6786ec1fc8, 0x44a3e3a, 0x6, 0xc42007f980, 0xc4201260e0, 0xc4200161e0)
        /Users/prataprc/myworld/devgo/src/github.com/bnclabs/dbtest/llrb.go:185 +0x1ea
created by main.dollrbrw
        /Users/prataprc/myworld/devgo/src/github.com/bnclabs/dbtest/llrb.go:96 +0x10d

One anomaly that comes up right away is that lmdb says 1038940 items are indexes while llrb says 1046704 items are indexed.

Is there a delay in flushing the recent set of write operations ? Is this something to do with read snapshot trailing behind the tip ?

I would greatly appreciate your help on this. Thanks in advance,

prataprc commented 6 years ago

There was a mistake in the test code. Closing this issue.