Realm database size grows rapidly and stays large

emuye commented 6 years ago

My app does reads and writes on the main thread and background threads. All background writes occur on DispatchQueues with autoreleasingFrequency set to .workItem.

In some cases users are seeing the Realm size increase rapidly and then stay large (until the next compact). We log when the Realm is over 800 MB, and this happens quite often while the app is running. In some rare cases, writes to the Realm fail until the app is relaunched and the Realm is compacted.
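For readers who want to reproduce the kind of monitoring described above, here is a minimal sketch using only Foundation. The helper names `fileSize(atPath:)` and `shouldCompact(totalBytes:usedBytes:)` and both thresholds are illustrative assumptions, not code from the app in question. Realm's `Realm.Configuration` accepts a `shouldCompactOnLaunch` closure that receives the total and used byte counts, so a predicate of this shape could be passed there.

```swift
import Foundation

/// Returns the size in bytes of the file at `path`, or nil if it cannot be read.
/// (Hypothetical helper; the path of the Realm file is supplied by the caller.)
func fileSize(atPath path: String) -> UInt64? {
    guard let attributes = try? FileManager.default.attributesOfItem(atPath: path),
          let size = attributes[.size] as? NSNumber else {
        return nil
    }
    return size.uint64Value
}

/// Illustrative compaction rule: compact only when the file is larger than
/// 100 MB and less than half of it is actually in use.
func shouldCompact(totalBytes: Int, usedBytes: Int) -> Bool {
    let threshold = 100 * 1024 * 1024
    guard totalBytes > threshold else { return false }
    return Double(usedBytes) / Double(totalBytes) < 0.5
}

/// Logs a warning when the file exceeds `limitBytes` (800 MB in the report above).
func logIfOversized(path: String, limitBytes: UInt64 = 800 * 1024 * 1024) {
    if let size = fileSize(atPath: path), size > limitBytes {
        print("Realm file is \(size) bytes, over the \(limitBytes)-byte limit")
    }
}
```

Logging the size after each batch of writes, together with the thread that performed the write, may help narrow down which code path is running when the growth starts.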

Goals

Given I'm already following what I've seen to be best practices, how can I prevent the ballooning from happening?
Are there any tools to help debug when/why this is happening?
What are some possible next steps? Similar questions on StackOverflow also appear to be unanswered. Similar issues seem to have gone stale and don't have any recent responses: https://github.com/realm/realm-cocoa/issues/5482

Expected Results

The Realm file size stays at a reasonable size and writes continue during the life of the app

Actual Results

The Realm file size grows rapidly (sometimes) and stays large until a compact

Steps to Reproduce

This appears to happen sporadically and we are not able to reproduce at will

Code Sample

Our app has complex models and read/write schemes. We are not yet able to repro at will in our app or a standalone app.

Version of Realm and Tooling

Realm framework version: 3.8

Realm Object Server version: N/A

Xcode version: 9.4.1 - 10.1 beta

iOS/OSX version: 11.4-12

Dependency manager + version: Cocoapods 1.5.3

tgoyne commented 6 years ago

This is normally caused by Realm objects which are allocated on background threads and then outlive that thread due to being retained by a persistent object. The other source of problems that we've seen people have is with long-lived background threads: if you have something like a persistent thread that opens a Realm and then spends most of its time idle waiting on a mutex you'll run into issues.

Xcode's memory graph debugger is a decent way to track down where this is happening. It does require being able to trigger the problem with a debugger attached, but once you enable it you can track down what went wrong after the fact, even if you weren't specifically trying to reproduce this problem.

Enable "Malloc Stack" logging on the Diagnostics tab of the scheme settings, then run the app and do whatever makes your app use Realm on background threads. Pause the app, and then switch to "View Memory Graph Hierarchy" with the button in the top-right of the debug navigator. Search for RLMRealm (even if you're using the Swift API), and then check the stack trace of each of the live objects. If you've successfully hit the problematic scenario, then one of them should show an allocation stack from an operation on a background thread that has completed without deallocating all of the Realm objects used. The graph shows what objects are pointing to that RLMRealm. Dark lines are strong references that keep it alive, while lighter grey lines are weak references and are harmless (there normally will be at least two weak references from Realm internals).

emuye commented 6 years ago

Thanks for the response @tgoyne

I'm going to spend some time doing code investigation regarding the first potential cause you describe. In order for a Realm object to be allocated on a background thread but retained by a persistent object, wouldn't a threadsafe instance need to be passed? Otherwise, the persistent object would also need to access the allocated object on the same background thread?

Can you clarify the potential cause regarding long lived background threads? If the thread eventually goes away and the DispatchQueue it is on has the autorelease frequency set to .workItem wouldn't the memory also go away eventually?

Thanks for the tips on the memory debugger, if we can ever reproduce this in the debugger I'll take a look.

tgoyne commented 6 years ago

Using an object on a different thread than the one it was allocated on will throw an exception, but if you do something like just assigning it to a property on your view controller and then never touch it again you'll never get an error. Accidentally capturing an object in a block that's stored somewhere and happens to never hit the code path which references the object is a less obviously silly way to hit this.
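To make the stored-block scenario concrete, here is a minimal plain-Swift sketch. `PinnedObject` and `Screen` are hypothetical stand-ins, not Realm types; the assumption is that a real Realm object captured this way would likewise stay alive, along with the Realm it belongs to, for as long as the block is stored.

```swift
import Foundation

// Hypothetical stand-in for an object read from a Realm on a background thread.
final class PinnedObject {}

// Hypothetical long-lived owner, for example a view controller.
final class Screen {
    var onRetry: (() -> Void)?
}

/// Reports whether the object is alive while the block is stored,
/// and whether it is still alive after the block has been cleared.
func captureDemo() -> (whileStored: Bool, afterClearing: Bool) {
    let screen = Screen()
    weak var tracked: PinnedObject?

    do {
        let object = PinnedObject()
        tracked = object
        // The block is never called, but storing it still retains `object`.
        screen.onRetry = { _ = object }
    }

    let whileStored = tracked != nil
    screen.onRetry = nil              // dropping the block releases the capture
    let afterClearing = tracked != nil
    return (whileStored, afterClearing)
}

let captureResult = captureDemo()
print(captureResult.whileStored, captureResult.afterClearing)
```

The object outlives the scope that created it purely because the block is stored, which is why searching the memory graph for what holds a strong reference is usually quicker than reading the code for it.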

If the only thing you do with background threads is run one-off blocks via GCD then you don't need to worry about the long-lived thread problems; that's mostly only a thing when people implement their own work queues rather than using the standard platform functionality for it.

emuye commented 6 years ago

@tgoyne thanks again.

We've been thinking a lot about both cases of accidentally capturing an object which was allocated on a background thread (and not accessed) and I'm pretty sure we're not hitting these.

We do only run GCD blocks for background threads so I also don't think that's the issue.

Does anything else come to mind?

I've also been going through similar tickets. The comments made on this ticket seem relevant but are a bit confusing: https://github.com/realm/realm-cocoa/issues/4229#issuecomment-254340124

In this issue the user seems to be following best practices with the autorelease pool wrapping the code in the dispatch block but still sees growth. Can you explain jpsim's comment? While we're not doing 10000 writes, we do dispatch frequent writes. Could this cause pinning (jpsim's comment seems to indicate that it will)? The recommended solutions still seem to be around using autorelease/invalidate. Is there any way to reliably do frequent writes on background threads (without Realm growth) without batching them?

We've looked back at one of our users who saw really rapid growth. We only have logs starting once the Realm was 800 MB. It grew from 800 MB to 1.6 GB in about a minute. We saw frequent writes during this period but nothing that should warrant that kind of growth. For some of the writes it grew in increments of ~64 MB.

emuye commented 5 years ago

@tgoyne can you take a look at my questions above? The main thing I'm trying to understand is whether pinning happens in the case where the main thread has a Realm open and background threads do writes. jpsim's comments in the referenced ticket seem to indicate that doing the concurrent writes with the Realm previously opened will cause pinning even with autorelease pools in place on the background threads and the main thread run loop running.

tgoyne commented 5 years ago

A Realm open on the main thread will pin the version it's using, but normally will update to the latest version within a few milliseconds of a write happening. Even if you do a very large number of small writes (the worst case for Realm's i/o; write transactions are optimized for batches of changes), the main thread should only lag a few versions behind and each of those versions will be fairly small, so the file will only grow a bit.

If the main thread is unable to refresh, then writes on a background thread will continue to grow the file until the main thread is unblocked. Other than the obvious cause of just outright blocking the main thread's runloop, this also could happen if you are observing very slow-running queries, as we rerun the query on a background thread and then only refresh the main thread once those are all done, to avoid blocking the UI.

jlubawy commented 5 years ago

We were having a similar problem: file size grew to >1GB when it should have been ~10MB at most.

We took the following steps and were able to get our file size back to the expected ~10MB (ordered by most significant file size reduction):

1) Reduced the number of write transactions by "caching" writes (1 transaction for every 50 entries instead of one-for-one). You may risk losing data by doing this, but in our case the source data isn't deleted until we've transferred everything so it didn't matter for us.
2) Moved all writes to a single thread (at least for the data that is written the most).
3) Wrapped all Realm objects created in a background thread in an autoreleasepool.
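The first step (one transaction per 50 entries) can be sketched without any Realm dependency. `WriteBuffer` is a hypothetical helper, not code from the app described above, and it is not thread-safe; the `flush` closure marks where a single Realm write transaction would go.

```swift
import Foundation

/// Collects entries and hands them to `flush` in groups, so each group can be
/// written in one transaction instead of one transaction per entry.
final class WriteBuffer<Entry> {
    private var pending: [Entry] = []
    private let batchSize: Int
    private let flush: ([Entry]) -> Void

    init(batchSize: Int, flush: @escaping ([Entry]) -> Void) {
        precondition(batchSize > 0, "batchSize must be positive")
        self.batchSize = batchSize
        self.flush = flush
    }

    /// Buffers one entry and flushes automatically once the batch is full.
    func append(_ entry: Entry) {
        pending.append(entry)
        if pending.count >= batchSize { flushNow() }
    }

    /// Flushes whatever is buffered, for example before the app is suspended.
    func flushNow() {
        guard !pending.isEmpty else { return }
        let batch = pending
        pending.removeAll(keepingCapacity: true)
        flush(batch)
    }
}

// Usage sketch: 120 entries with a batch size of 50.
var transactionCount = 0
let writeBuffer = WriteBuffer<Int>(batchSize: 50) { batch in
    // In an app, a single write transaction covering `batch` would go here,
    // wrapped in an autoreleasepool when running on a background queue.
    transactionCount += 1
}
for value in 0..<120 { writeBuffer.append(value) }
writeBuffer.flushNow()
print(transactionCount)
```

Entries still in the buffer are lost if the process dies before a flush, which is the data-loss risk mentioned in step 1.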

This book explained the issue very well, and even had some example code: https://store.raywenderlich.com/products/realm-building-modern-swift-apps-with-realm-database-source-code

Another thing I want to try is adding Realm.refresh() calls to our code. I hadn't seen the last sentence of this paragraph, otherwise I might have tried that first since it seems way easier: https://realm.io/docs/swift/latest/#seeing-changes-from-other-threads

emuye commented 5 years ago

@jlubawy it sounds like 1 reduces the possibility of pinning. Why is 2 necessary? That's kind of what I was getting at with my previous set of questions. We already have autoreleases around writes on background threads.

@tgoyne Is it possible to avoid Realm pinning and bloat while having concurrent writes on background threads? Refactoring our code to have a single write thread would be a lot of work and I still don't have a good understanding of why that would be necessary.

When we've seen the really large Realm we've also seen data that we've written to Realm not appear in existing queries. All we have to go on in this case is the logging and we haven't been able to reproduce this in-house. But it seems like the write fails without an error.

jlubawy commented 5 years ago

@emuye Number 2 seems to fix what the comment you previously linked to says: https://github.com/realm/realm-cocoa/issues/4229#issuecomment-254340124

you're effectively creating 10,000 concurrent versions of the database

It prevents creating multiple copies of the Realm which apparently happens when creating a new Realm instance on different background threads (as a side-effect of using GCD which uses a thread pool and not a single thread). By using a single write thread you limit the number of copies to one.

As for the "data that we've written to Realm not appear in existing queries" issue, you might want to try calling Realm.refresh() before making the query, if you're on a background thread. We had this same issue and adding that fixed it (see the paragraph I linked to in my previous comment).

tgoyne commented 5 years ago

A small number of threads performing concurrent writes should not be a problem (although it's not actually helpful, as only one thread will be writing at a time). If you're spawning background tasks that perform writes in a loop you may benefit from pushing them onto a single serial queue. If you have a fixed set of things that may do background writes at the same time, serializing them probably won't help.

If you perform a write on a background thread and then call dispatch_async(dispatch_get_main_queue(), ...), the block you dispatched to the main thread may end up being run before our notification sent to the main thread telling it to refresh, resulting in you seeing stale data. You can either manually call refresh() in the block or use Realm notifications instead, which are always called after the autorefresh happens.
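A minimal sketch of the manual refresh() option, assuming the RealmSwift API and a hypothetical `Entry` model and `addEntryInBackground` function that are not part of the original discussion:

```swift
import Foundation
import RealmSwift

// Hypothetical model used only for this sketch.
final class Entry: Object {
    @objc dynamic var value = 0
}

/// Writes one entry on a background queue, then reports the entry count on the main queue.
func addEntryInBackground(value: Int, completion: @escaping (Int) -> Void) {
    DispatchQueue.global(qos: .utility).async {
        // Release the background Realm as soon as the write is done.
        autoreleasepool {
            do {
                let realm = try Realm()
                let entry = Entry()
                entry.value = value
                try realm.write { realm.add(entry) }
            } catch {
                print("Background write failed: \(error)")
            }
        }

        DispatchQueue.main.async {
            do {
                let realm = try Realm()
                // This block may run before the autorefresh notification arrives,
                // so advance the main-thread Realm to the latest version explicitly.
                realm.refresh()
                completion(realm.objects(Entry.self).count)
            } catch {
                print("Opening the Realm on the main thread failed: \(error)")
            }
        }
    }
}
```

With Realm notifications the explicit refresh() should not be needed, since, as described above, they are delivered after the autorefresh has happened.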

emuye commented 5 years ago

My attempts to resolve this problem have not been successful. I don't want to move to a serial write queue as it would require a lot of refactoring of my code and I still don't understand why that would be necessary. I understand that Realm writes are serial, but we're not doing this for any Realm benefit but instead because of other factors.

I'm trying to really understand what circumstances could cause the file size to grow so dramatically so I created a sample app to try to force lots of Realm pinning so I could compare the circumstances resulting in growth in the sample app to my own app. But, I haven't been able to see the sample app grow by (many) multiples of the actual database size.

In the sample app I create operations which in turn spawn async jobs that do reads and are long lived. I'd expected the database size to grow dramatically. At the end there should be 50 blocked threads each having pinned a different version of the Realm. This should result in an astronomical database size. Instead what I see is 1280 MB database (~500 is actual data writes). I've removed all the autoreleases from the app but it really shouldn't matter because the long-running reading threads only finish after many writes.

Any idea why this doesn't cause many versions of the Realm to be pinned?

As I mentioned I'm just trying to understand the exact details that cause pinning because none of my attempts to fix it or follow best practices have fixed the problem in my app.

https://gist.github.com/emuye/6ff5edf0f5911fdab432d7a1b00ecd0f

jlubawy commented 5 years ago

Isn't 1,280MB (1.28GB) considered astronomical for only a couple hundred writes? Unless that was a typo.

For what it's worth, I've been running more experiments and I think we've finally settled on the following approach to keep our file sizes down (we never go above 75MB for ~9000 writes):

1) All Realm read/writes are now done from a single background thread using a global Realm object, background operations are queued on the thread's RunLoop. This should prevent any "pinning" or any other sort of data duplication issues since there's now only one thread ever accessing data. To pass data to other threads we copy out the properties we need into another non-Realm Object class (otherwise we'd get an exception for accessing that object on another thread).
2) Caching (I should have called this buffering) no longer has any effect on the file size when using a single thread. I'm not sure why it helped before so I would just ignore what I said about that in my previous posts.
3) I don't believe autoreleasepool is needed anymore either since we refer to a single global Realm object. We are not creating multiple instances on a background thread that need to be released.

I created a Gist to show at a high-level what we are doing now. The ThreadRunner class is an abstraction I made to make running operations on a single thread much easier (it's almost a drop in replacement for DispatchQueue).
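The Gist itself is not reproduced in this thread. As a rough sketch of the general idea only (one long-lived thread that runs every queued block in order), here is a hypothetical `SingleThreadRunner`. It is an assumption-laden stand-in, not the ThreadRunner from the Gist, and it uses a condition-guarded queue rather than the thread's RunLoop.

```swift
import Foundation

/// Runs every submitted block, in order, on one dedicated thread.
/// Hypothetical sketch; it lives for the lifetime of the process.
final class SingleThreadRunner {
    private let condition = NSCondition()
    private var pending: [() -> Void] = []

    init(name: String) {
        let thread = Thread { self.drain() }
        thread.name = name
        thread.start()
    }

    /// Queues `block` to run on the dedicated thread and returns immediately.
    func async(_ block: @escaping () -> Void) {
        condition.lock()
        pending.append(block)
        condition.signal()
        condition.unlock()
    }

    /// Worker loop: sleeps until a block is available, then runs it outside the lock.
    private func drain() {
        while true {
            condition.lock()
            while pending.isEmpty { condition.wait() }
            let block = pending.removeFirst()
            condition.unlock()
            block()
        }
    }
}

// Usage sketch: every block observes the same thread name.
let runner = SingleThreadRunner(name: "realm-worker")
let finished = DispatchSemaphore(value: 0)
var threadNames: [String] = []
for _ in 0..<3 {
    runner.async {
        threadNames.append(Thread.current.name ?? "")
        finished.signal()
    }
}
for _ in 0..<3 { finished.wait() }
print(threadNames)
```

Under this design a Realm opened inside one of the blocks would only ever be touched from that one thread, and results would be copied into plain value types before being handed to other threads, as described in point 1 above.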

I know you said it takes a lot of refactoring of your code to use a single thread, but honestly I don't see any other way to do this with the current Realm design. I'll admit I avoided refactoring myself, but I'm pretty happy with how the code turned out now.

emuye commented 5 years ago

I know you said it takes a lot of refactoring of your code to use a single thread, but honestly I don't see any other way to do this with the current Realm design.

I still don't understand why that would be the case. If you take a look at the sample app the writes include 10 MB of data each so 500 MB of the database size is actual data. 1280 MB would be 2 copies of the data, I would expect to see many many multiples of the database size with this sample app.

Before undertaking a huge refactor I'd like to better understand the limitations of the current approach and exactly what contributes to pinning. @tgoyne any ideas on my comment and the sample app above?

rudysuharyadi commented 5 years ago

Hi Josh,

My application suffers from the same issue. The actual Realm data is only 24MB. We use a 3rd-generation iPad Pro with 6GB of physical memory, and our Realm file somehow grows to 6.45GB, after which the app keeps crashing. Autoreleasepool is already in place, we use RLMResults directly as the table view data source, and there are a lot of background processes in our app.

I have also considered the same solution as yours: performing every write transaction on a single thread. But I am afraid the experience will get slower if we do that. Currently our app fetches thousands of records at once from the server and processes them on multiple threads at once; on an iPad Pro we can use 8 cores for this. I am afraid that forcing every Realm write transaction onto a single thread will make it slower. Can you tell us about your experience with this?

jlubawy commented 5 years ago

@rudysuharyadi in our case we are only writing/reading a couple times per second, so we haven't had any sort of performance issues. In your case maybe buffering/batching writes would help in that you don't have to start a new Realm write transaction every time new data comes in.

rudysuharyadi commented 5 years ago

We tried detaching objects from Realm by calling initWithValue, editing the data, adding the copies to a mutable array, then passing that to the model and saving to Realm. But when we do this, the database immediately grows from 800KB to 11MB. We can reproduce this easily, even though we never hit gigabytes. We are afraid that in the long run the size will grow to gigabytes.

One more question from me: you said you also do reads on the dedicated background thread. That means you cannot use RLMResults directly as a tableView data source, since that runs on the main thread, right? Are you then copying the results to another collection and passing it to the main thread? If we have 100k items in one tableView, isn't copying to a collection going to be slow?

jlubawy commented 5 years ago

In our case we don't actually display data on the device, we are just collecting it for use later.

RetVal commented 5 years ago

So many developers who use Realm have the same issue; look at the issue list, there are too many duplicated issues.

Please provide a detector/monitor that can tell developers when something is pinned in the app, or shrink the file while the app is running or when the Realm is closed (opening may fail with an mmap error if the Realm file is too large). Many developers don't want to know the details of how Realm works.

sameer4 commented 5 years ago

Agree with you. It's a very annoying issue. Yesterday I had to delete a Realm db file that was more than 24GB. I guess I have to store objects on a single thread now and apply the other solutions, or move to some other db. Has anyone tried ObjectBox? How does it compare to Realm?

jlubawy commented 5 years ago

I'm not that familiar with the history of Realm, but I'm guessing the single thread limitation might have been from a pre-GCD time period where performance was bought by having multiple copies of the database. Maybe this discussion is better rolled into a feature request for not making copies when the DB is using the same synchronous DispatchQueue? Meaning different threads but not at the same time.

rudysuharyadi commented 5 years ago

Yeah, it's really annoying. We have had this issue since 2015 and we don't have a solution. Last time we implemented writes on a dedicated thread, while reads still happened on multiple threads. It didn't work; the database still sometimes grew to gigabytes. So we implemented another solution: compaction when needed. So far so good, even though we sometimes get a "cannot open realm" error. It's so complex that it's not easy to use anymore at this point. ObjectBox for Swift is pretty new, but it has nice features like unique constraints. I have never tried it though; I'm really curious about it.

jcarlosperez commented 5 years ago

Last time we implement write on a dedicated thread, even though read still multiple threads. It doesn't work, the database still grows multiple gigabytes sometimes.

I've also been wrangling with this issue for a while and writing from a dedicated thread is not enough. We refactored so everything related to realm is on one thread. Database stays much smaller. 7mb average whereas >2GB before.

rudysuharyadi commented 5 years ago

So you copy every RLMResults to an NSArray and give it to the main thread to be consumed by the UI?

MenSoon commented 4 years ago

Tell me how to deal with Realm data files growing exponentially. The file starts at about 50K and, after using try? Realm(), later increases dozens of times over, such as 50K --> 50M. Why does this happen? My data is stored in batches and I call try? Realm() asynchronously in a lot of places. How can I solve this?

sergbuk commented 4 years ago

So you copy every RLMResults to an NSArray and give it to the main thread to be consumed by the UI?

I think it's better than having the "couldn't allocate memory..." error, although we cannot use some nice features of RLMResults since we copy the data.

The pros of this solution that I can see are quite good performance and the simplicity of using Realm from the developer's point of view (models, versioning, etc.).

rudysuharyadi commented 4 years ago

yeah agree.

leemaguire commented 4 years ago

Going to close this issue as it's gone off on a tangent from the original question. If you have any other questions please open a new issue.

Thanks.
