realm / realm-java

Realm is a mobile database: a replacement for SQLite & ORMs
http://realm.io
Apache License 2.0
11.46k stars 1.75k forks

Out of memory: Be able to open large files. #6667

Open JirkaKrivanek opened 4 years ago

JirkaKrivanek commented 4 years ago

Goal

Store 1000 records, each containing two short strings and a 1 MB byte array.

Actual Results

Judging from the stack trace below, Realm tries to allocate an enormous amount of RAM, which is strange to me...

io.realm.exceptions.RealmError: Unrecoverable error. mmap() failed: Out of memory size: 1207959552 offset: 0 in /Users/kk/Desktop/MbiPersisterStudy/RealmBuild/realm-java/realm/realm-library/src/main/cpp/io_realm_internal_OsSharedRealm.cpp line 101
    at io.realm.internal.OsSharedRealm.nativeGetSharedRealm(Native Method)
    at io.realm.internal.OsSharedRealm.(OsSharedRealm.java:171)
    at io.realm.internal.OsSharedRealm.getInstance(OsSharedRealm.java:241)
    at io.realm.internal.OsSharedRealm.getInstance(OsSharedRealm.java:231)
    at io.realm.RealmCache.doCreateRealmOrGetFromCache(RealmCache.java:332)
    at io.realm.RealmCache.createRealmOrGetFromCache(RealmCache.java:285)
    at io.realm.Realm.getDefaultInstance(Realm.java:407)
    at com.thales.gp.mbi.poc.kk.realm.module.b.MbiRealmCitizenTestInteractor.(MbiRealmCitizenTestInteractor.java:47)
    at com.thales.gp.mbi.poc.kk.realm.module.b.MbiRealmCitizenTestInteractor.lambda$interaction$0(MbiRealmCitizenTestInteractor.java:23)
    at com.thales.gp.mbi.poc.kk.realm.module.b.-$$Lambda$MbiRealmCitizenTestInteractor$UH_qpTiiS6A-PYDCqBxF2bQysWY.subscribe(lambda)
    at io.reactivex.internal.operators.completable.CompletableCreate.subscribeActual(CompletableCreate.java:39)
    at io.reactivex.Completable.subscribe(Completable.java:2302)
    at io.reactivex.internal.operators.completable.CompletableSubscribeOn$SubscribeOnObserver.run(CompletableSubscribeOn.java:64)
    at io.reactivex.internal.schedulers.ScheduledDirectTask.call(ScheduledDirectTask.java:38)
    at io.reactivex.internal.schedulers.ScheduledDirectTask.call(ScheduledDirectTask.java:26)
    at java.util.concurrent.FutureTask.run(FutureTask.java:237)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:272)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
    at java.lang.Thread.run(Thread.java:761)

Steps & Code to Reproduce

Please see the Goal section above. It is a benchmark to decide whether Realm is suitable for us or not, so I can provide the whole Android project for it...

The fragment of code which creates and verifies the data follows - I HOPE I am doing something wrong:

private static final int HUGE_DATA_RECORDS_COUNT = 500;
private static final int HUGE_BINARY_DATA_SIZE = 1000000;

private void createRecordWithHugeBinaryData(final @NonNull Realm realm) {
    Log.e(Tag.TAG, "Creating records with huge binary data...");
    final ElapsedTime measure = ElapsedTime.start();
    for (int index = 0; index < HUGE_DATA_RECORDS_COUNT; index++) {
        final MbiRealmEntityUser user = new MbiRealmEntityUser();
        final MbiRealmEntityUserI userI = user;
        final byte[] customData = new byte[HUGE_BINARY_DATA_SIZE];
        customData[0] = (byte) (0xFF & index);
        userI.setUserName("UN" + index);
        userI.setPassWord("PW" + index);
        userI.setCustomData(customData);
        realm.beginTransaction();
        realm.insertOrUpdate(user);
        realm.commitTransaction();
    }
    Log.e(Tag.TAG, measure.getMessage("Creating records with huge binary data: Consumed: %s"));
}

private void verifyRecordWithHugeBinaryData(final @NonNull Realm realm) {
    Log.e(Tag.TAG, "Verifying records with huge binary data...");

    final ElapsedTime measureTotal = ElapsedTime.start();
    final ElapsedTime measureFindAll = ElapsedTime.start();
    final RealmResults<MbiRealmEntityUser> users = realm.where(MbiRealmEntityUser.class).findAll();
    if (users.size() != HUGE_DATA_RECORDS_COUNT) {
        throw new IllegalStateException();
    }
    Log.e(Tag.TAG, measureFindAll.getMessage("Verifying records with huge binary data: Consumed: Find All: %s"));

    final ElapsedTime measureLoop1 = ElapsedTime.start();
    for (int index = 0; index < HUGE_DATA_RECORDS_COUNT; index++) {
        final MbiRealmEntityUserI user = users.get(index);
        if (user == null) {
            throw new IllegalStateException();
        }
        if (!("UN" + index).equals(user.getUserName())) {
            throw new IllegalStateException();
        }
        if (!("PW" + index).equals(user.getPassWord())) {
            throw new IllegalStateException();
        }
    }
    Log.e(Tag.TAG,
          "Verifying records with huge binary data: Consumed: Per Record: Without Huge Data: " + (measureLoop1.getElapsed() / HUGE_DATA_RECORDS_COUNT)
          + " milliseconds");

    final ElapsedTime measureLoop2 = ElapsedTime.start();
    for (int index = 0; index < HUGE_DATA_RECORDS_COUNT; index++) {
        final MbiRealmEntityUserI user = users.get(index);
        if (user == null) {
            throw new IllegalStateException();
        }
        final byte[] customData = user.getCustomData();
        if (customData == null) {
            throw new IllegalStateException();
        }
        if (customData.length != HUGE_BINARY_DATA_SIZE) {
            throw new IllegalStateException();
        }
        if (customData[0] != ((byte) (index & 0xFF))) {
            throw new IllegalStateException();
        }
    }
    Log.e(Tag.TAG,
          "Verifying records with huge binary data: Consumed: Per Record: With Huge Data: " + (measureLoop2.getElapsed() / HUGE_DATA_RECORDS_COUNT)
          + " milliseconds");

    Log.e(Tag.TAG, measureTotal.getMessage("Verifying records with huge binary data: Consumed: Total: %s"));
}

Version of Realm and tooling

Realm version(s): 6.0.1

Realm Sync feature enabled: No

Android Studio version: 3.5.2

Android Build Tools version: com.android.tools.build:gradle:3.5.2

Gradle version: ?

Which Android version and device(s):

Zhuinden commented 4 years ago

This typically happens if there are unclosed Realm instances on non-looper background threads (or thread pools / schedulers).

Every call to Realm.getDefaultInstance() should have a matching .close(), although in Java you can also use try (Realm realm = Realm.getDefaultInstance()) { ... } for the same effect.

JirkaKrivanek commented 4 years ago

I am running all my tests (including the two methods above) this way (I believe it should be OK, but judge for yourself):

        try (final Realm realm = Realm.getDefaultInstance()) {
            deleteAllData(realm);
        }
        try (final Realm realm = Realm.getDefaultInstance()) {
            createRecordWithHugeBinaryData(realm);
        }
        try (final Realm realm = Realm.getDefaultInstance()) {
            verifyRecordWithHugeBinaryData(realm);
        }
JirkaKrivanek commented 4 years ago

BTW: The problem happens when reading the data back from Realm in a separate Realm instance (but on the same thread). The write of the data succeeds...

Do I have to create a new Realm for each read separately?

Zhuinden commented 4 years ago

Do I have to create a new Realm for each read separately?

No, you only need a new Realm instance for each thread. And it should be closed when it is no longer used.
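The one-instance-per-thread pattern Zhuinden describes can be sketched roughly as follows (an illustrative fragment only, with error handling omitted):

    // Illustrative sketch: each background thread opens its own Realm
    // and closes it via try-with-resources when the work is done.
    new Thread(() -> {
        try (Realm realm = Realm.getDefaultInstance()) {
            // all reads and writes for this thread happen here
        } // the instance is closed automatically when the block exits
    }).start();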

JirkaKrivanek commented 4 years ago

So no, I was wrong in my previous statement: the crash actually happens on the first line of the fragment below (i.e. when opening the Realm):

        try (final Realm realm = Realm.getDefaultInstance()) {
            verifyRecordWithHugeBinaryData(realm);
        }

At this moment, the data are already written...

JirkaKrivanek commented 4 years ago

Or I can put it the other way round: I created a Realm file which I can never open again :(

Because even restarting the app does not help... The only way to overcome this is to delete the Realm file and start over...

Which sounds like a show stopper to me...

Why is it trying to allocate 1.2 GB of RAM???

cmelchior commented 4 years ago

How big is the file on disk?

JirkaKrivanek commented 4 years ago

How big is the file on disk?

1152MB

Are you suggesting that Realm always allocates RAM for a whole file?

cmelchior commented 4 years ago

Okay, so the file on disk matches the size that is being allocated. Realm memory-maps the files it opens. That doesn't mean it physically needs that much memory, but we need to be able to allocate that much virtual address space. It sounds like you have a custom build of Android? Perhaps it has a limit that is lower than that.

Generally, on 32-bit devices the limit on normal Android phones is 300-400 MB, while 64-bit devices would run out of disk space before the virtual address limit is reached.

I'm not sure if that is helpful or not?

Also note, that databases in general (SQLite and Realm alike) are not really suited for storing large amounts of binary data. So depending on what you are trying to do, it might be a lot more performant to store a filepath pointing to a file outside the DB.
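As a sketch of the store-a-filepath approach suggested above, the record would keep only a path string while the payload lives on disk. The BlobStore helper and its file layout below are hypothetical, not part of Realm's API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: the database row stores only the returned path
// string; the binary payload lives in the filesystem, keeping the
// database file small.
class BlobStore {

    // Writes the blob to <baseDir>/<recordId>.bin and returns the path
    // to be stored in the record.
    static Path saveBlob(Path baseDir, String recordId, byte[] data) throws IOException {
        Path file = baseDir.resolve(recordId + ".bin");
        Files.write(file, data);
        return file;
    }

    // Reads the blob back, given the path stored in the database.
    static byte[] loadBlob(Path file) throws IOException {
        return Files.readAllBytes(file);
    }
}
```

With this layout, queries over the string fields stay fast and the database file never grows near the mmap limit discussed here.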

JirkaKrivanek commented 4 years ago

@cmelchior Thanks for the elaboration. It actually is helpful. And that is why I am doing benchmarks first (we call it a pre-study).

Is there some consolidated document where I could have read about these limits in advance?

On the other hand, I first ran those tests on my Huawei P20 Pro and even a 15 GB data file worked surprisingly well (I tend to say like a charm, even with encryption ON - I actually doubt I could quickly build anything similarly effective with encryption, HMAC, and ACID features). No problem storing a lot of 15 MB byte arrays and retrieving them back.

Unlike SQLite which, according to my benchmarks, is much worse (a 15 MB byte array is simply impossible due to the read cursor size limit)...

cmelchior commented 4 years ago

I thought we had the section about 32-bit vs. 64-bit bit devices in the FAQ, but I cannot find it 🤔

There are a few sections around file size here: https://realm.io/docs/java/latest/#faq-large-realm-file-size

and if you are building a systems app you might want to read this as well: https://realm.io/docs/java/latest/#how-to-use-realm-in-system-apps-on-custom-roms

But yes, if you really want to store 2 MB+ blobs in the database, then you are probably not going to get anywhere with SQLite due to the Cursor limit.

On the other hand, if I remember correctly, SQLite does not memory-map its files, so with such massive database files SQLite can still open them where Realm might fail on 32-bit devices (not sure what your custom hardware is running). So either way, you would run into problems.

But generally, Realm is much more efficient at loading data, as we give you direct access to the data in the file instead of copying it into a Cursor (it is also faster).

I don't know exactly what you are storing in those binary blobs, but generally, I would recommend storing them outside Realm/SQLite if at all possible.

JirkaKrivanek commented 4 years ago

One more question: if I am on a 64-bit system with a huge data file (like the 15 GB one I mentioned), which is definitely beyond the physical memory limit of any device, will Realm manage the physical memory allocations effectively (or is that handled by the underlying OS services - I am no expert on virtual vs. physical file mapping)?

I mean, no OOM crashes?

cmelchior commented 4 years ago

We just rely on the OS for the memory-mapping. But that should work fine for that size. Where you might run into problems is if you try to batch-write too much data, since we need to hold the entire write transaction in physical memory before writing it to disk.

JirkaKrivanek commented 4 years ago

Yes, I have already experienced that write-transaction limit myself: huge data can only be written in granular transactions, while a lot of small records are much more efficient to write in a single transaction...
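This trade-off (per-record transactions for large blobs, one batched transaction for many small rows) can be sketched with Realm's Java transaction API roughly as follows; hugeUsers and smallUsers are hypothetical collections standing in for the benchmark data:

    // Large payloads: one transaction per record, so only one blob
    // has to be held in memory per commit.
    for (MbiRealmEntityUser user : hugeUsers) {
        realm.executeTransaction(r -> r.insertOrUpdate(user));
    }

    // Many small records: a single transaction is far cheaper,
    // since every commit carries a fixed overhead.
    realm.executeTransaction(r -> {
        for (MbiRealmEntityUser user : smallUsers) {
            r.insertOrUpdate(user);
        }
    });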

JirkaKrivanek commented 4 years ago

Thank you guys for your support... I am now going to close this ticket (hopefully I am allowed to do so)...

The conclusion: as I anticipated in my original post, I am actually misusing Realm...

JirkaKrivanek commented 4 years ago

Sorry, I still had to reopen :)...

Because I still think there must be some error in Realm that I was missing so far...

"I was able to create a Realm file using the Realm SDK which Realm SDK subsequently cannot open" - can you feel there is something wrong in that sentence?

I would expect Realm to report "You cannot write anything more to me, because then I would become too big and you would not be able to open me"...

That is a kind of the Realm suicide :)

bmunkholm commented 4 years ago

@JirkaKrivanek I don't think there is any practical way we can handle that. We can't know or detect when we can't open a file as it depends on whatever else is in memory and how fragmented the memory is. That can change over time as well, so if for instance, you turn off/on the phone it may open perfectly well.

JirkaKrivanek commented 4 years ago

@bmunkholm It is clear that the Realm SDK can create a Realm file which the same Realm SDK just cannot read later (note: in my case, restarting the app or the phone did not help) - or, rephrased on purpose: the Realm SDK corrupts the Realm file through legal operations.

So we need to implement monitoring of the Realm file size ourselves (protected by some global lock) and, if the file exceeds a certain size limit, prevent any further writes to it - am I right?

Don't you think the best place to keep Realm data consistent is Realm itself (we, at the client level, have no idea whether a write will grow the file or just reuse previously freed space)?

Another solution could be having multiple Realm files (I assume NOT accessing them all simultaneously, rather one at a time) - am I right, or is there some internal global limit which would prevent this workaround?
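The client-side size monitoring floated here could be as simple as a guard consulted before each write. FileSizeGuard and its threshold are hypothetical, and (as discussed below) the real mmap limit is dynamic, so this is only a heuristic sketch:

```java
import java.io.File;

// Hypothetical guard: refuse further writes once the database file
// exceeds a chosen threshold. Only a heuristic -- the actual mmap
// limit depends on the device and on how much virtual address space
// the rest of the app is consuming.
class FileSizeGuard {
    private final File dbFile;
    private final long maxBytes;

    FileSizeGuard(File dbFile, long maxBytes) {
        this.dbFile = dbFile;
        this.maxBytes = maxBytes;
    }

    // Returns true while the file is still under the configured limit
    // (a missing file trivially passes).
    boolean canWrite() {
        return !dbFile.exists() || dbFile.length() < maxBytes;
    }
}
```

Note that a Realm write may also reuse previously freed space rather than grow the file, so such a guard would be conservative.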

bmunkholm commented 4 years ago

@JirkaKrivanek Well, we are not corrupting the file. You can move the file out of the system and open it with Realm Studio or another desktop application. The file is consistent. But this is more philosophical than practical :-). If there was an easy (or any doable) way for us to detect this, we could make it a feature request, but it would have very low priority as very few people hit this. It's unlikely you can implement such monitoring yourself, just as it's unlikely we can. So my recommendation is that you follow cmelchior's advice and store the binaries outside of Realm, in the filesystem instead.

JirkaKrivanek commented 4 years ago

@JirkaKrivanek Well, we are not corrupting the file. You can move the file out of the system and open it with Studio or another desktop application. The file is consistent. But this is more philosophical than practical :-).

Yeah, I am aware of that: My "rephrase" was not exactly honest. But practically, it is true.

If there was an easy (or any doable) way for us to detect this we could make it a feature request. But it would be one that would have very little priority as very few people hit this.

Shall I raise this feature request? (or can this ticket be considered the feature request - by editing the title - it would make sense, as the context discussion is here)

Note: In the real world, I can actually imagine the request for something more complex:

It's unlikely you can make your own monitoring, just as it's unlikely we can. So my recommendation is that you follow cmelchiors advice and store the binaries outside of Realm in the filesystem instead.

True, doing a system-level limit/monitoring is hard at the client level - that should be done by Realm.

But what we can do is a functional limit - e.g. at most 30,000 records of known size, to be safe...

cmelchior commented 4 years ago

It is fine to rename this issue. I think we already have an issue tracking "too large" files in Core: https://github.com/realm/realm-core/issues/1935

But ideally, Realm should be able to open files of any size (as long as it fits on the filesystem).

JirkaKrivanek commented 4 years ago

But ideally, Realm should be able to open files of any size (as long as it fits on the filesystem).

YES!

Plus:

dakexuan commented 3 years ago

REALM_JNI: jni: ThrowingException 5, mmap() failed: Out of memory size: 67108864 offset: 1811939328 in /tmp/realm-java@2/realm/realm-library/src/main/cpp/io_realm_internal_OsSharedRealm.cpp line 106, .

I met the same problem: when the file exceeds 1 GB, no matter which version of the SDK I use, the terrible thing is that I can't open the file any more. Can you help me analyze the cause? Best case, I used it by mistake and it's not an internal error of the SDK. Thx

JirkaKrivanek commented 3 years ago

@dakexuan: That is not even the worst part: according to my experiments a long time ago, the more native memory the rest of your application uses, the less is available for Realm (in other words, your 1 GB limit may be dynamic, depending on what else your app is currently doing). That makes it hard to watch the Realm file size and stop adding data once it grows past some threshold.

Unfortunately, it is a basic limitation of the core Realm architecture. They can hardly fix this serious bug without completely rewriting it... So they have to silently ignore it...

My conclusion: unless your app needs only a small amount of data (let us say 100 MB at maximum, to be safe), DO NOT USE REALM AT ALL.

dakexuan commented 3 years ago

@cmelchior It's frustrating. What do you think?