jnwatson / py-lmdb

Universal Python binding for the LMDB 'Lightning' Database
http://lmdb.readthedocs.io/

build/lib/mdb.c:2436: Assertion 'mp->mp_pgno != pgno' failed in mdb_page_touch() #328

Closed SokolovJek closed 10 months ago

SokolovJek commented 1 year ago

Affected Operating Systems

Affected py-lmdb Version

lmdb.version=1.2.1, 1.3.0

py-lmdb Installation Method

pip install lmdb==1.2.1/pip install lmdb==1.3.0

Distribution name and LMDB library version

print(lmdb.version()) = (0, 9, 29)

Machine "free -m" output

              total        used        free      shared  buff/cache   available
Mem:            636         434          74           5         128         171
Swap:           511         215         296

Describe Your Problem

The LMDB database is used by Moonraker, a Python 3 web server that provides APIs that client applications use to interact with the Klipper 3D printing firmware. That is the background; now the problem. When inserting data into the database, it sometimes (very rarely) stops working. The error "build/lib/mdb.c:2436: Assertion 'mp->mp_pgno != pgno' failed in mdb_page_touch()" appears and completely blocks the server. I tried to handle this exception, but all my attempts were unsuccessful. Do you have any advice on what I should do and how I can catch the exception? The database stops working only for inserts; reads still succeed. I traced the problem to the method _insert_record():

def _insert_record(self, namespace: str, key: str, val: DBType) -> bool:
    db = self.namespaces[namespace]
    if val is None:
        return False
    with self.lmdb_env.begin(write=True, buffers=True, db=db) as txn:
        ret = txn.put(key.encode(), self._encode_value(val))
    return ret

The failing statement is "ret = txn.put(key.encode(), self._encode_value(val))". The key.encode() and self._encode_value(val) calls succeed, but txn.put() does not. Trying to catch the exception here also failed.

Errors/exceptions Encountered

build/lib/mdb.c:2436: Assertion 'mp->mp_pgno != pgno' failed in mdb_page_touch()

Describe What You Expected To Happen

try:
    self.insert_item("steapp", "database.debug_counter", debug_counter)
except Exception:
    print('ERROR')
finally:
    print('finally - ERROR')

Describe What Happened Instead

Instead of the expected "ERROR" or "finally - ERROR", I get "build/lib/mdb.c:2436: Assertion 'mp->mp_pgno != pgno' failed in mdb_page_touch()". Ultimately, I want to delete the existing database in the "except" block and create a new one:

except Exception as e:
    logging.error(f'Error in database.py. Dont can load db, created new. Error {e}')
    shutil.rmtree(self.database_path, ignore_errors=True)
    os.mkdir(self.database_path)
    self.lmdb_env = lmdb.open(self.database_path, map_size=MAX_DB_SIZE, max_dbs=MAX_NAMESPACES)

What should I do? Please help!

jnwatson commented 1 year ago

This indicates database corruption. Can you reliably reproduce it?

This can happen if you disable locking or fork the process and manipulate the database in both processes. Are you doing either?

SokolovJek commented 1 year ago

Good afternoon! I cannot reproduce the simultaneous writes. I do not rule out that several processes work with the database, and that is most likely why it breaks. But I cannot change the server's logic to prevent this. The option that suits me is to catch the exception, delete the database, and re-create it. But I can't catch the exception. I have a copy of the corrupted database and can send it if you are interested. Is it possible to catch the exception?
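
The "delete the database and re-create it" step can be kept separate from the question of how the failure is detected. The sketch below is a minimal illustration using only the standard library; reset_database_dir and demo_path are hypothetical names, the "data.mdb" file is a fake stand-in, and it assumes the LMDB environment has been closed before the directory is removed. Reopening with lmdb.open() afterwards is left to the caller.

```python
import os
import shutil
import tempfile


def reset_database_dir(database_path: str) -> None:
    """Delete the directory holding the database files and create it empty again."""
    # ignore_errors=True mirrors the original snippet: a missing directory is fine.
    shutil.rmtree(database_path, ignore_errors=True)
    os.makedirs(database_path, exist_ok=True)


# Demonstration on a throwaway directory containing a fake data file.
demo_path = os.path.join(tempfile.mkdtemp(), "database")
os.makedirs(demo_path)
with open(os.path.join(demo_path, "data.mdb"), "wb") as fh:
    fh.write(b"corrupt")

reset_database_dir(demo_path)
# The caller would now reopen the environment, e.g. with lmdb.open(demo_path, ...).
```

After the call the directory exists and is empty, so a fresh environment can be opened in it.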

jnwatson commented 1 year ago

Given the assertion comes from C, it is relatively difficult to capture and recover. Here's a StackOverflow discussion.

Essentially, a failed C assert calls abort(), which raises SIGABRT, which you can handle by adding a signal handler.
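
One caveat: abort() still terminates the process once a SIGABRT handler returns, and handlers installed with Python's signal module run only after the C-level handler has already returned, so a Python-level handler may get no chance to recover. A different technique, not the one suggested above, is to run the risky write in a separate interpreter and check how it exited. The sketch below is a minimal illustration under stated assumptions: CHILD_OK, CHILD_ABORT and run_isolated are hypothetical names, and os.abort() stands in for the failing LMDB write.

```python
import signal
import subprocess
import sys

# Hypothetical child programs standing in for the real LMDB write.
# os.abort() mimics a failed C assert: it raises SIGABRT in the child.
CHILD_OK = "print('written')"
CHILD_ABORT = "import os; os.abort()"


def run_isolated(child_source: str) -> bool:
    """Run child_source in a fresh interpreter; return True if it exited cleanly."""
    result = subprocess.run(
        [sys.executable, "-c", child_source],
        capture_output=True,
    )
    if result.returncode == -signal.SIGABRT:
        # On POSIX, a negative return code is the number of the fatal signal.
        return False
    return result.returncode == 0


ok = run_isolated(CHILD_OK)
aborted = not run_isolated(CHILD_ABORT)
```

Because the child is a separate process, its abort does not take the server down. The parent sees only the exit status and can log the failure or reset the database directory before retrying.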