deephacks / lmdbjni

LMDB for Java
Apache License 2.0

Preventing MDB_MAP_FULL? #58

Closed bfreuden closed 8 years ago

bfreuden commented 8 years ago

I am wondering if you have ever needed or tried to detect when an Env is getting close to its map size limit. I've been struggling with this for a few weeks, without any real success so far. A few days ago I discovered MDB_txn::mt_next_pgno (the next unallocated page), which looks promising. Anyway, I would be happy if you could share your experience or thoughts in this area. Thanks!

krisskross commented 8 years ago

I have never had this problem myself. The best answer I could find regarding this was from Howard on the openldap mailing list.

http://comments.gmane.org/gmane.network.openldap.technical/11699

krisskross commented 8 years ago

What I usually do is set a really large size. LMDB won't allocate space until it's actually needed, so in principle you can set the size to Long.MAX_VALUE.

bfreuden commented 8 years ago

Thank you so much for your fast answer and for the link! Unfortunately, MDB_envinfo::last_pgno is only updated after commit (MDB_txn::mt_next_pgno seems to be reliable during the lifetime of a write transaction, but it is not part of the public API), and I've never been able to stat the free dbi (the implementation of the -f option in mdb_stat.c makes me think it isn't trivial). In your use cases, do you have things like hundreds of Envs allocated with a very high map size value (just in case it is needed)? Or only one (or a couple of) Envs? I must admit that creating hundreds of 1TB (just in case) Envs scares me.
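For illustration, a minimal sketch of the committed-usage estimate discussed here: the last used page number, the page size and the map size give a rough fill fraction. The class and method names are hypothetical and belong to neither LMDB nor lmdbjni; the three inputs are assumed to come from the Env's info and stat calls (me_last_pgno, ms_psize and me_mapsize in the C API). Because the last page number only changes on commit, and pages on the free list can be reused, the result is a high-water mark rather than an exact measure of remaining space.

```java
// Hypothetical helper for illustration; not part of LMDB or lmdbjni.
final class MapUsage {
    private MapUsage() {}

    /** Bytes covered by pages 0..lastPgno (inclusive), as of the last commit. */
    static long usedBytes(long lastPgno, long pageSize) {
        if (lastPgno < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("lastPgno must be >= 0 and pageSize > 0");
        }
        // multiplyExact/addExact throw on overflow instead of wrapping silently.
        return Math.multiplyExact(Math.addExact(lastPgno, 1L), pageSize);
    }

    /** Fraction of the map already covered by allocated pages (0.0 to 1.0). */
    static double usedFraction(long lastPgno, long pageSize, long mapSize) {
        if (mapSize <= 0) {
            throw new IllegalArgumentException("mapSize must be > 0");
        }
        return (double) usedBytes(lastPgno, pageSize) / (double) mapSize;
    }

    /** True when the fill fraction has reached the caller-chosen threshold. */
    static boolean nearLimit(long lastPgno, long pageSize, long mapSize, double threshold) {
        return usedFraction(lastPgno, pageSize, mapSize) >= threshold;
    }
}
```

A caller could check `MapUsage.nearLimit(lastPgno, pageSize, mapSize, 0.8)` after each commit and grow the map size or raise an alert; the 0.8 threshold is an arbitrary example value.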

krisskross commented 8 years ago

I think your best bet regarding the internals of LMDB is to post your question to openldap-technical@openldap.org.

I haven't implemented any large-scale storage yet, but I know that HustleDB [1] allocates a separate LMDB Env for every table and stores every column as a separate LMDB table. You can read more about the design here [2]. I also recall a recent discussion on the mailing list regarding the number of Envs/Dbs, which might be useful to read [3].

[1] https://github.com/tspurway/hustle
[2] https://groups.google.com/forum/#!topic/hustle-users/tGMHbci5_MA
[3] http://www.openldap.org/lists/openldap-technical/201601/msg00016.html

krisskross commented 8 years ago

The anxiety you're describing seems grounded more in the storage requirements of your application than in LMDB. There's no problem with over-dimensioning as long as you don't hit that limit.
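One limit that over-dimensioning can still hit is the process's virtual address space, since each open Env maps its full map size even though disk and RAM are only used on demand. A minimal sketch of that budget check follows; the class is hypothetical, and the 128 TiB figure is an assumption that matches the usual user address space on x86-64 Linux and will differ on other platforms.

```java
// Hypothetical helper for illustration; not part of LMDB or lmdbjni.
final class AddressSpaceBudget {
    static final long TIB = 1L << 40;

    private AddressSpaceBudget() {}

    /** Total virtual address space reserved by envCount Envs of equal map size. */
    static long totalMapBytes(int envCount, long mapSizePerEnv) {
        if (envCount < 0 || mapSizePerEnv < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // multiplyExact throws on overflow instead of wrapping silently.
        return Math.multiplyExact((long) envCount, mapSizePerEnv);
    }

    /** True when all maps together fit into the given address space. */
    static boolean fits(int envCount, long mapSizePerEnv, long addressSpaceBytes) {
        return totalMapBytes(envCount, mapSizePerEnv) <= addressSpaceBytes;
    }
}
```

Under that assumption, `AddressSpaceBudget.fits(100, AddressSpaceBudget.TIB, 128 * AddressSpaceBudget.TIB)` holds, while 200 Envs of 1 TiB each opened in the same process would not fit.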

bfreuden commented 8 years ago

Thank you very much, I really appreciate the time you are taking to answer me! I will definitely look into HustleDB. In discussion [3], that was me talking to Howard (and I do hope he didn't feel offended by my questions). I am afraid he is firmly opposed to the idea of checking whether one is close to the limit. Maybe it's me having an odd mindset or being ignorant, but I feel more comfortable with a limit when I know exactly where I stand relative to it. My anxiety has more to do with that of the IT guys when they realize my application seems to "consume" hundreds of gigabytes of RAM (if I manage to put LMDB into production). Thanks again!