jankotek / mapdb

MapDB provides concurrent Maps, Sets and Queues backed by disk storage or off-heap-memory. It is a fast and easy to use embedded Java database engine.
https://mapdb.org
Apache License 2.0

JVM crash with large mmap file (too many handles) #723

Open jankotek opened 8 years ago

jankotek commented 8 years ago

This is an older issue, but it needs to be addressed. The workaround is to use a larger DBMaker.allocationIncrement(); the default value is one handle per 1 MB.

Another option is to increase the maximum file handle count. The default value on Ubuntu is 64K or some other very low number.
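To make the arithmetic concrete: with a 1 MB allocation increment, the number of mmap regions grows linearly with store size, so a few large stores can exhaust a 65,536-entry limit on their own. A minimal sketch in plain Java (no MapDB dependency; the 1 MB default and the 65,536 limit are the figures quoted in this thread):

```java
public class MmapRegionCount {
    // Number of mmap regions needed to cover a store of the given size
    // when each region covers `incrementBytes` bytes (ceiling division).
    static long regions(long storeBytes, long incrementBytes) {
        return (storeBytes + incrementBytes - 1) / incrementBytes;
    }

    public static void main(String[] args) {
        long MB = 1024L * 1024;
        long GB = 1024L * MB;

        // 60 GB of data with the default 1 MB increment:
        System.out.println(regions(60 * GB, 1 * MB));   // 61440 regions -- close to the 65536 limit
        // The same data with a hypothetical 128 MB increment:
        System.out.println(regions(60 * GB, 128 * MB)); // 480 regions
    }
}
```

The 128 MB figure is only an illustration; any increment large enough to keep the region count well below the OS limit works the same way.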

comment from blog:

I have observed a JVM crash when I was merely opening my MapDB maps for reading (in READ mode).

I created a sample program to see how the JVM behaves with too many mmaps: I opened a RandomAccessFile in READ_ONLY mode, started mapping its regions, and reproduced the crash. What I figured out is that the OS has a limit on how many mmaps I can do. By default on my 64-bit Ubuntu machine this limit is 65536 (and I can change it too; more info at http://www.linuxforums.org/for...

This limitation is not on how much area I map but on the number of mmaps I do: for a 40 GB file, 40 mmaps of 1 GB each are perfectly fine, but if I try 1 KB mmaps the JVM crashes. I was looking for a way to control how many mmaps MapDB does when I open a map. In my application I have more than 700 maps (multiple DBs, on average each DB has 4 maps, total size on disk is nearly 60 GB, and my machine has 128 GB RAM) which I open all together. Each map has memory mapping enabled and I don't want to disable it for performance reasons.

My knowledge here is thin, but I believe MapDB does some 1 MB block mapping. I was looking for more detail on how many mmaps the code does internally for each map-open request, and whether there is a way I can tweak this number to avoid my JVM crash. Below is my sample code which caused the JVM crash (not using MapDB, but it shows how memory mapping may be the cause here too):

import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.util.ArrayList;

public class MmapCrash {
    public static void main(String[] args) throws Exception {
        RandomAccessFile inStream = new RandomAccessFile(new File("/home/kapil/MyProject/myFile3.p"), "r");
        FileChannel ch = inStream.getChannel();
        long BLOCK_SIZE = 1024;
        ArrayList<MappedByteBuffer> mappedBuffers = new ArrayList<>();
        // One million 1 KB mappings -- far beyond the default mmap limit of 65536
        for (long i = 0; i < 1000000; i++) {
            //System.out.println(i);
            MappedByteBuffer bf = ch.map(MapMode.READ_ONLY, i * BLOCK_SIZE, BLOCK_SIZE);
            mappedBuffers.add(bf);
        }
        for (MappedByteBuffer bf : mappedBuffers) {
            System.out.println(bf.toString());
        }
        ch.close();
        inStream.close();
        System.out.println("..Done");
    }
}
scottcarey commented 8 years ago

I had this problem as well, and increased the number of allowed mem map regions for my OS.

However, the result of having so many regions is that performance is worse than it was with memory mapping disabled on 1.x.

My testing was with about 300GB of data (in ~ 2000 BTreeMaps across ~500 dbs), on a server with 32GB RAM.

Some of my dbs are smaller, yet take up a minimum of 1 MB of disk space even when nearly empty. It would be nice if allocations grew exponentially, up to some limit, for example:

4k -> 64k -> 1M -> 4M -> 8M -> 16M -> (+16M, for each next chunk).

That way small dbs don't take up too much space, and larger ones don't have too many mapped regions.

It may also be possible to grow in smaller chunks, but then, when reloading an existing db or after compaction, remap it into much larger chunks.
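The growth schedule proposed above can be sketched as a small helper. This is hypothetical, not MapDB API; it just shows how the 4k -> 64k -> 1M -> 4M -> 8M -> 16M -> (+16M) schedule keeps both the footprint of tiny dbs and the region count of large ones small:

```java
public class GrowthSchedule {
    // Chunk sizes from the proposed schedule, in bytes.
    static final long[] STEPS = {
        4L << 10, 64L << 10, 1L << 20, 4L << 20, 8L << 20, 16L << 20
    };

    // Size of the n-th chunk (0-based); after the initial steps,
    // every further chunk is a flat 16 MB.
    static long chunkSize(int n) {
        return n < STEPS.length ? STEPS[n] : 16L << 20;
    }

    // Number of chunks (i.e. mmap regions) needed to cover `target` bytes.
    static int chunksFor(long target) {
        long total = 0;
        int n = 0;
        while (total < target) {
            total += chunkSize(n++);
        }
        return n;
    }

    public static void main(String[] args) {
        // A nearly-empty db occupies a single 4 KB chunk instead of 1 MB:
        System.out.println(chunksFor(1));        // 1
        // A 1 GB store needs the 6 initial chunks plus 63 flat 16 MB chunks:
        System.out.println(chunksFor(1L << 30)); // 69
    }
}
```

Compared with the fixed 1 MB increment (1024 regions for a 1 GB store), the schedule above covers the same store in 69 regions while letting an empty db start at 4 KB.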