jnwatson / py-lmdb

Universal Python binding for the LMDB 'Lightning' Database
http://lmdb.readthedocs.io/

LMDB does not insert large numbers with append=True #280

Closed by rickbeeloo 3 years ago

rickbeeloo commented 3 years ago

py-lmdb Installation Method

sudo pip install lmdb

Describe Your Problem

Inserting relatively small numbers with append=True works fine, for example [1, 2, 10, 1000]. However, inserting the following array fails:

import lmdb

pack_item = lambda x: x.to_bytes(5, 'big')
arr = [1093240233, 1586861524, 1886861524, 2977577409, 5943132554]

env = lmdb.open('test70001.lmdb', max_dbs=1)
db = env.open_db(dupsort=True, integerkey=True, integerdup=True)
with env.begin(db=db, write=True) as t:
    for x in arr:
        status = t.put(pack_item(x), b'test', append=True)
        print(status)

This gives:

True
True
False
False
False

Notably, simpler arrays or append=False work fine.

Describe What You Expected To Happen

Inserts using append=True should also work with larger numbers.

jnwatson commented 3 years ago

Two issues:

First, you must not change the flags of the "main" database. I enforce this in the latest release. (You must specify a database name in open_db).

Second:

integerkey: If True, indicates keys in the database are C unsigned or size_t integers encoded in native byte order. Keys must all be either unsigned or size_t, they cannot be mixed in a single database.

If you change your encoding to little endian, it works fine.
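The True, True, False, False, False pattern from the report can be reproduced without LMDB. The sketch below is a simplified model, not LMDB's actual comparator: it assumes a little-endian host that reads every key as a little-endian unsigned integer, and it treats append=True as "accept a key only if it sorts after the last accepted key". The function name `simulate_append` and its parameters are hypothetical, introduced only for illustration.

```python
def simulate_append(values, byteorder, width=5):
    """Model append=True: accept a key only if it sorts after the last one.

    Simplifying assumption: the store compares keys as little-endian
    unsigned integers, as a little-endian host would for native-order keys.
    """
    results = []
    last = None
    for value in values:
        key = value.to_bytes(width, byteorder)
        seen_as = int.from_bytes(key, "little")  # how the modeled store reads the key
        accepted = last is None or seen_as > last
        if accepted:
            last = seen_as
        results.append(accepted)
    return results


arr = [1093240233, 1586861524, 1886861524, 2977577409, 5943132554]

big_endian_results = simulate_append(arr, "big")
little_endian_results = simulate_append(arr, "little")

print(big_endian_results)     # [True, True, False, False, False]
print(little_endian_results)  # [True, True, True, True, True]
```

Under this model, big-endian packing puts the fastest-changing byte in the position the store treats as most significant, so the third key already sorts before the second and is rejected. The 5-byte width is kept only to mirror the report. The quoted documentation asks for keys the size of a C unsigned or size_t, so something like `x.to_bytes(8, sys.byteorder)` may be a safer choice, assuming a 64-bit platform where size_t is 8 bytes.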

rickbeeloo commented 3 years ago

Thanks! When I change to env.open_db(b'db', dupsort=True, integerkey=True, integerdup=True) and then loop through the data:

with lmdb.open('test70001.lmdb', readonly=True).begin().cursor() as c:
    for k,v in c:
        print(k,v)

It gives me:

b'db' b'\x00\x00\x00\x00,\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x05\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00'

rather than:

b'\xa9\x85)A\x00' b'test'
b'\xd4\x95\x95^\x00' b'test'
b'\xd48wp\x00' b'test'
b'\xc19z\xb1\x00' b'test'
b'\x8a\x01=b\x01' b'test'

Or did you mean I should create a new database? Moreover, c.get(pack_item(1586861524)) does not return a value there, just None.
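As a quick sanity check on the expected output, the five keys listed above can be decoded back to integers with no database involved. This sketch assumes they are the 5-byte little-endian encodings of the original array; `listed_keys` and `decoded` are names chosen here for illustration.

```python
# Keys copied from the expected cursor output in this comment.
listed_keys = [
    b'\xa9\x85)A\x00',
    b'\xd4\x95\x95^\x00',
    b'\xd48wp\x00',
    b'\xc19z\xb1\x00',
    b'\x8a\x01=b\x01',
]

# Interpret each key as a little-endian unsigned integer.
decoded = [int.from_bytes(key, "little") for key in listed_keys]

print(decoded)
# [1093240233, 1586861524, 1886861524, 2977577409, 5943132554]
```

The single b'db' record is consistent with a cursor opened on the unnamed main database, where LMDB stores each named database as an entry. Reading the records themselves would presumably need a transaction or cursor that is given the handle returned by open_db (the db= argument).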

rickbeeloo commented 3 years ago

Think you just gave an example in the other issue :) will check that out first!