realm / realm-swift

Realm is a mobile database: a replacement for Core Data & SQLite
https://realm.io
Apache License 2.0

Deadlock race condition using realm within app group #4797

Open pauluhn opened 7 years ago

pigeondev2 commented 7 years ago

Hi @pauluhn. Thanks for reaching out. I wanted to let you know that we've received your report and that someone will review what you've shared and follow up with you soon.

tgoyne commented 7 years ago

I'm able to reproduce this on-device (it works fine in the simulator) and am looking into how to debug it.

tgoyne commented 7 years ago

Seems to be caused by https://github.com/realm/realm-core/pull/2402. If I revert that then everything works... until it crashes due to the exact problem that PR was trying to fix.

tgoyne commented 7 years ago

Wrapping the write transaction in UIApplication.shared.beginBackgroundTask() / UIApplication.shared.endBackgroundTask() makes me unable to trigger this. One of the problems is that when switching between apps, the OS may suspend an app while it's holding the write lock. That blocks the other two apps until the user switches back to the app that was in the middle of a write; explicitly marking the write as a background task prevents the suspension. This is also the case where a push notification would get things "unstuck", as receiving a remote notification wakes up suspended applications.

I think we also have a bug where being suspended at the wrong point in a write transaction breaks things entirely: I've sometimes seen things get into a state where none of the apps hold the write lock and all of them are waiting for it. Not allowing suspension during writes is enough to at least dodge that issue.
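
A minimal sketch of the workaround described above, assuming a recent Swift toolchain and a write performed on the main thread. `withBackgroundTask` and the task name are hypothetical; only `beginBackgroundTask(withName:expirationHandler:)` and `endBackgroundTask(_:)` are UIKit API, and `UIApplication.shared` is not available in app extensions.

```swift
import UIKit

/// Runs `body` while holding a UIKit background-task assertion, so the OS is
/// asked not to suspend the app midway through the work.
/// `withBackgroundTask` is a hypothetical helper, not part of Realm or UIKit.
@MainActor
func withBackgroundTask<T>(named name: String, _ body: () throws -> T) rethrows -> T {
    let app = UIApplication.shared
    // No expiration handler: `body` is synchronous, so the task is ended as
    // soon as it returns. The system may still terminate the app if the work
    // outlives the background time it grants.
    let taskID = app.beginBackgroundTask(withName: name, expirationHandler: nil)
    defer {
        if taskID != .invalid {
            app.endBackgroundTask(taskID)
        }
    }
    return try body()
}

// Usage (assumes `realm` is an open RealmSwift `Realm` and `item` is a
// managed object with a hypothetical `isDone` property):
// try withBackgroundTask(named: "realm-write") {
//     try realm.write { item.isDone = true }
// }
```

Keeping the write itself short still matters, since the assertion only covers the time the system is willing to grant.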

pauluhn commented 7 years ago

Is that bug being tracked?

Thanks for the workaround on preventing the issue. And I saw that the docs were updated re: writes being synchronous and blocking.

akisute commented 7 years ago

This looks like exactly the issue I'm seeing in my app. The stack trace I've recovered looks very similar to the reported one (sorry, by mistake I couldn't symbolicate Realm.framework itself):

Incident Identifier: 8B232237-B9A7-42A0-A8FE-64B12F2C6C96
CrashReporter Key:   95e1e2725812001ba97976dbe6dbd9ea7c4ced49
Hardware Model:      iPhone7,2
Process:             myapp [356]
Path:                /var/containers/Bundle/Application/A679D475-E848-4AA9-9001-6934F9338274/myapp.app/myapp
Identifier:          com.akisute.myapp
Version:             240.0.0 (2.4.0)
Code Type:           ARM-64 (Native)
Parent Process:      launchd [1]

Date/Time:           2017-06-01 14:36:26.26 +0900
Launch Time:         2017-06-01 14:01:08.08 +0900
OS Version:          iOS 9.3.5 (13G36)
Report Version:      105

Exception Type:  00000020
Exception Codes: 0x000000008badf00d
Exception Note:  SIMULATED (this is NOT a crash)
Highlighted by Thread:  0

Application Specific Information:
com.akisute.myapp failed to scene-update after 10.00s

Elapsed total CPU time (seconds): 3.500 (user 3.500, system 0.000), 18% CPU 
Elapsed application CPU time (seconds): 0.006, 0% CPU

Filtered syslog:
None found

Thread 0 name:  Dispatch queue: com.apple.main-thread
Thread 0:
0   libsystem_kernel.dylib          0x0000000180c8bf6c 0x180c70000 + 114540
1   libsystem_pthread.dylib         0x0000000180d5a39c 0x180d54000 + 25500
2   Realm                           0x0000000100f2bcf4 0x100b5c000 + 3996916
3   Realm                           0x0000000100e0e020 0x100b5c000 + 2826272
4   Realm                           0x0000000100e0df20 0x100b5c000 + 2826016
5   Realm                           0x0000000100e0def4 0x100b5c000 + 2825972
6   Realm                           0x0000000100e0de90 0x100b5c000 + 2825872
7   Realm                           0x0000000100de0f24 0x100b5c000 + 2641700
8   Realm                           0x0000000100de0d3c 0x100b5c000 + 2641212
9   Realm                           0x0000000100bf36c4 0x100b5c000 + 620228
10  Realm                           0x0000000100d8ea98 0x100b5c000 + 2304664
11  Realm                           0x0000000100d45bbc 0x100b5c000 + 2005948
12  RealmSwift                      0x000000010149ea88 0x10146c000 + 207496
13  RealmSwift                      0x000000010149e958 0x10146c000 + 207192
14  myappLib                        0x0000000101ea41d4 HistoryManager.markItemAsDone(id : String) -> () (HistoryManager.swift:44)
15  myapp                           0x00000001003a55e0 ListDetailViewController.viewWillAppear(Bool) -> () (ListDetailViewController.swift:378)
16  myapp                           0x00000001003a5ba4 @objc ListDetailViewController.viewWillAppear(Bool) -> () (ListDetailViewController.swift:0)
17  UIKit                           0x000000018626d374 0x186240000 + 185204
18  UIKit                           0x000000018626d0e8 0x186240000 + 184552
19  myapp                           0x000000010025e824 -[RootContainer pushViewControllers:animated:completion:] (RootContainer.m:403)
20  myapp                           0x000000010025df98 -[RootContainer pushViewController:animated:] (RootContainer.m:360)
21  myapp                           0x00000001001e1108 -[RootViewController pageView:didSelectItem:] (RootViewController.m:354)
22  myapp                           0x00000001002a6e18 TopPageView.tableView(UITableView, didSelectRowAt : IndexPath) -> () (TopPageView.swift:633)
23  myapp                           0x00000001002a71e8 @objc TopPageView.tableView(UITableView, didSelectRowAt : IndexPath) -> () (TopPageView.swift:0)
24  UIKit                           0x000000018638bdc4 0x186240000 + 1359300
25  UIKit                           0x00000001864497d4 0x186240000 + 2136020
26  UIKit                           0x00000001865070c8 0x186240000 + 2912456
27  UIKit                           0x0000000186514a80 0x186240000 + 2968192
28  UIKit                           0x00000001862465a4 0x186240000 + 26020
29  CoreFoundation                  0x00000001810a8728 0x180fc8000 + 919336
30  CoreFoundation                  0x00000001810a64cc 0x180fc8000 + 910540
31  CoreFoundation                  0x00000001810a68fc 0x180fc8000 + 911612
32  CoreFoundation                  0x0000000180fd0c50 0x180fc8000 + 35920
33  GraphicsServices                0x00000001828b8088 0x1828ac000 + 49288
34  UIKit                           0x00000001862be088 0x186240000 + 516232
35  myapp                           0x00000001001055e8 main (main.m:16)
36  libdyld.dylib                   0x0000000180b6e8b8 0x180b6c000 + 10424

But the thing is, this happens between my app and its Today Extension, which share the same app group container, not between individual applications. What's worse, it looks like the Today Extension is the process that locks up the Realm database first: no matter how many times I force-quit the main application, the Realm database stays deadlocked, but once I open the Today Extension UI on my device the deadlock is resolved, every time.

This means the potential fix @tgoyne pushed in #4827 doesn't help at all in this situation, because obviously you can't use UIApplication.shared.beginBackgroundTask() / UIApplication.shared.endBackgroundTask() from the Today Extension (or from other iOS extensions, of course).

I have absolutely no idea how to fix this in an alternative way... For now I've decided not to use the shared app group container, which of course hinders my application's functionality a lot, but there's no other choice.

jpsim commented 7 years ago

This is something we're hoping to bring up at Apple's Core OS lab during WWDC next week.

tgoyne commented 7 years ago

Please file a radar asking for the ability to perform background tasks from extensions.

akisute commented 7 years ago

Sure. That sounds like you've spoken with them and had no joy (´・_・`)

akisute commented 7 years ago

Here's a copy of my radar in OpenRadar: http://www.openradar.me/radar?id=4931832244076544

I doubt they'll hear this but anyway, we do have to do what we can do.

egoldfarb commented 7 years ago

My app has just been afflicted with this issue: after an extension crashes or is suspended while in a Realm transaction, the main app deadlocks permanently.

If/when you talked to Apple engineers about this issue, was NSFileCoordinator mentioned?

https://developer.apple.com/documentation/foundation/nsfilecoordinator

Since there isn't a way to perform background tasks in an extension, please consider using NSFileCoordinator, which is the Apple-recommended way to deal with reading/writing files within an app container by multiple processes.

Starting a coordinated read or write will ensure that the app and extension are given enough time to complete the file operation, and background tasks shall be started appropriately under the covers, even within an extension.
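
For reference, a coordinated write of an ordinary file could look roughly like the sketch below. It only illustrates the Foundation API on Apple platforms; `coordinatedWrite` is a hypothetical helper name, and nothing in this thread establishes that Realm's own file access can be routed through it.

```swift
import Foundation

/// Writes `data` to `url` under file coordination.
/// `coordinatedWrite` is a hypothetical helper name.
func coordinatedWrite(_ data: Data, to url: URL) throws {
    let coordinator = NSFileCoordinator(filePresenter: nil)
    var coordinationError: NSError?
    var writeError: Error?

    // The accessor runs synchronously once the coordinator has granted
    // write access to `url`.
    coordinator.coordinate(writingItemAt: url, options: [],
                           error: &coordinationError) { actualURL in
        do {
            try data.write(to: actualURL, options: .atomic)
        } catch {
            writeError = error
        }
    }

    // Surface either a coordination failure or a failure of the write itself.
    if let error = coordinationError { throw error }
    if let error = writeError { throw error }
}
```

The accessor is handed `actualURL`, which can differ from the URL passed in, so the write should always target that value.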

jpsim commented 7 years ago

> If/when you talked to Apple engineers about this issue, was NSFileCoordinator mentioned?

Yes, in the sense that they admitted that there's no good official way to do this on Darwin platforms, but that we could misuse NSFileCoordinator to accomplish this.

CraigLn commented 6 years ago

I'm not sure if this has been discussed, but Apple put out a tech note regarding 'safe' multi-process file transactions.

egoldfarb commented 6 years ago

That tech note is somewhat old so some of the information is stale. Apple isn't very good at updating their old tech notes.

Since then, the iOS 8.2 documentation refers to NSProcessInfo.performExpiringActivityWithReason:usingBlock: as a way to perform longer-running tasks in extensions.

https://developer.apple.com/documentation/uikit/uiapplication/1623031-beginbackgroundtaskwithexpiratio "To extend the execution time of an app extension, use the performExpiringActivityWithReason:usingBlock: method of NSProcessInfo instead."
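
A rough sketch of how that call is used from Swift, where it is spelled `ProcessInfo.performExpiringActivity(withReason:using:)`. It assumes an iOS-family target; `runProtected` is a hypothetical helper name, and whether this is enough to avoid the deadlock discussed here is not established in this thread.

```swift
import Foundation

/// Asks the system for extra execution time before running `work`.
/// `runProtected` is a hypothetical helper name.
func runProtected(reason: String, work: @escaping @Sendable () -> Void) {
    ProcessInfo.processInfo.performExpiringActivity(withReason: reason) { expired in
        // The block is invoked asynchronously, off the calling thread.
        // `expired == true` means the assertion could not be obtained, or
        // that the granted time is about to run out.
        guard !expired else { return }
        work()
    }
}

// Usage (assumption: the Realm is opened inside `work`, since Realm
// instances are confined to the thread that created them):
// runProtected(reason: "realm-write") {
//     // open the Realm and perform a short write transaction here
// }
```

If the time expires while `work` is still running, the block is called a second time with `expired == true`; a fuller implementation would use that call to cancel or wind down the work.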

iOS 8.2 also added NSExtensionHostDidEnterBackgroundNotification

Sounds like a good topic to bring up again at WWDC, unless there is a good surprise waiting in iOS 12.

CraigLn commented 6 years ago

Yeah I assumed they had things covered since then, but hoped that something might have been related. Fingers crossed for iOS 12 Extension information.

pankajsoni19 commented 5 years ago

We are also experiencing app freezes while using CallKit and a notification extension and accessing Realm within their callbacks. We have removed Realm usage from the notification extension, which solved 95% of the freezes, but it is still needed in the CallKit callbacks.

If the process executing a Realm transaction aborts, why is the Realm file not accessible from the main UI thread?

robbiet480 commented 5 years ago

We are experiencing this issue in Home Assistant now as well. We have quite a few extensions in play, so this isn't just isolated to the Today Extension.

nalexn commented 3 years ago

@manuroe seems like you've found a way to fix it in that other project, could you give advice to the realm core team or to the end users on how to approach this deadlock issue? There are so many critical internal issues in the realm that stay unattended forever that I'm about to port out!

manuroe commented 3 years ago

> @manuroe seems like you've found a way to fix it in that other project, could you give advice to the realm core team or to the end users on how to approach this deadlock issue? There are so many critical internal issues in the realm that stay unattended forever that I'm about to port out!

Unfortunately, I have no fix, just a poor workaround where we avoid writing to the Realm DB from our notification extension. Network data is written to files instead, and those files are processed by the app on its next startup. This implementation works well for this matrix.org project, but it cannot be applied everywhere.
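
A minimal sketch of that file-based hand-off, using only Foundation. All names here (`PendingEvent`, `enqueue`, `drain`, the inbox directory) are hypothetical and not taken from the matrix.org project; the step that stores each event in Realm is left to the caller.

```swift
import Foundation

/// A payload the extension wants the app to store later (hypothetical shape).
struct PendingEvent: Codable, Equatable {
    let id: String
    let body: String
}

/// Called from the extension: persist one event as its own file.
func enqueue(_ event: PendingEvent, in inbox: URL) throws {
    try FileManager.default.createDirectory(at: inbox, withIntermediateDirectories: true)
    let file = inbox.appendingPathComponent(UUID().uuidString).appendingPathExtension("json")
    // `.atomic` writes to a temporary file first, so the app never reads a
    // half-written payload.
    try JSONEncoder().encode(event).write(to: file, options: .atomic)
}

/// Called from the app on start-up: hand each queued event to `handle`
/// (for example, a Realm write) and delete its file only after that succeeds.
func drain(_ inbox: URL, handle: (PendingEvent) throws -> Void) throws {
    let fm = FileManager.default
    guard fm.fileExists(atPath: inbox.path) else { return }
    let files = try fm.contentsOfDirectory(at: inbox, includingPropertiesForKeys: nil)
        .filter { $0.pathExtension == "json" }
    for file in files {
        let event = try JSONDecoder().decode(PendingEvent.self, from: Data(contentsOf: file))
        try handle(event)
        try fm.removeItem(at: file)
    }
}

// In a real app, `inbox` would live in the shared container, e.g.
// FileManager.default.containerURL(forSecurityApplicationGroupIdentifier: "group.example")
// where "group.example" is a placeholder identifier.
```

Because only the app opens the Realm file in this arrangement, a suspended extension can no longer be left holding Realm's write lock.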

pankajsoni19 commented 3 years ago

Around 2-3 years back we had the same issue. We followed the same approach that @manuroe mentions: we write the data to files in the app group container, and on app start-up we read those files and push the data into Realm.

Since that project I have not used realm anymore.