Closed freak4pc closed 2 years ago
Hello @freak4pc, thanks for the report!
A background crash that isn't necessarily directly related to GRDB, but does have it "near the surface" and close to the crashing thread. I've also specifically wanted to share some of the code above to see if there's anything "fishy" that pops to your eyes that could cause it.
Nothing yet. I understand that you have several similar crashes, and that while it is always the main thread that happens to crash, you also always have sqlite3_step running in some other thread.
I do not spot anything fishy in your code or the stack trace. Identifying the Cause of Common Crashes and Understanding the Exception Types in a Crash Report list cases that I could not match with your crash report (you may want to have a second look).
I did suspect Data Protection, but even if that was the case, I'd expect it to just throw an error (caught by the asyncWrite) and not crash the app.
It's been a few years since I wrote the Data Protection documentation chapter, so it may now be out of date. I'm positive I could observe, catch, and recover from regular errors of code SQLITE_IOERR/SQLITE_AUTH during accesses to an unavailable database in the background.
We may have to check if this chapter is still accurate, but first I recommend grabbing more information about your crash. If you can reproduce the crash on a device in your hands, you'll have access to the raw device console with Console.app. This console is awfully verbose, but it is very likely to give us precious contextual information.
Hello @freak4pc, do you have any update?
Hey @groue - we're still seeing these crashes, but they're all from users in different countries, so we don't have access to their physical devices (and certainly not quickly enough to get a console dump).
I'm not entirely sure what to do. I guess we can close this and wait to see if I can get any new findings later on.
OK, let's close this. Feel free to report any finding, in case it could help other users 👍
Hey @groue, I'd appreciate it if you could re-open this; I have more clarity about what is (probably) happening.
We've noticed this happening a lot more lately, and looking at the crash logs, I believe the watchdog is terminating the app because the main thread is occupied for too long (over 10s, which is the watchdog limit).
We did a "test" where we nil the database pool whenever the app goes to the background, then re-initialize it when it comes back to the foreground, and the crashes seem to stop.
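The "test" can be sketched roughly like this (a hypothetical sketch only; `BackgroundAwareDatabase` and `makePool` are illustrative names I'm using here, not GRDB API):

```swift
import UIKit
import GRDB

// Hypothetical sketch: drop the DatabasePool when the app enters the
// background, rebuild it on return to the foreground.
final class BackgroundAwareDatabase {
    private(set) var pool: DatabasePool?
    private let makePool: () throws -> DatabasePool

    init(makePool: @escaping () throws -> DatabasePool) throws {
        self.makePool = makePool
        self.pool = try makePool()

        let center = NotificationCenter.default
        center.addObserver(
            forName: UIApplication.didEnterBackgroundNotification,
            object: nil, queue: .main
        ) { [weak self] _ in
            // Releasing the pool tears down its SQLite connections,
            // so nothing holds a database lock while backgrounded.
            self?.pool = nil
        }
        center.addObserver(
            forName: UIApplication.willEnterForegroundNotification,
            object: nil, queue: .main
        ) { [weak self] _ in
            guard let self = self else { return }
            self.pool = try? self.makePool()
        }
    }
}
```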
This is confirmed by some of the crash logs:
Exception Type: EXC_CRASH (SIGKILL)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Exception Note: EXC_CORPSE_NOTIFY
Termination Reason: FRONTBOARD 2343432205
<RBSTerminateContext| domain:10 code:0x8BADF00D explanation:scene-update watchdog transgression: application<com.monday.monday>:5492 exhausted real (wall clock) time allowance of 10.00 seconds
ProcessVisibility: Background
ProcessState: Running
WatchdogEvent: scene-update
WatchdogVisibility: Background
WatchdogCPUStatistics: (
"Elapsed total CPU time (seconds): 44.010 (user 28.430, system 15.580), 67% CPU",
"Elapsed application CPU time (seconds): 10.220, 15% CPU"
) reportType:CrashLog maxTerminationResistance:Interactive>
As you can see, the code is 0x8BADF00D ("ate bad food").
Looking at the main thread in the symbolicated crash log, I saw two GRDB frames:
Thread 0 name: Dispatch queue: com.apple.main-thread
Thread 0 Crashed:
0 libsystem_kernel.dylib 0x1b803166c __psynch_mutexwait + 8
1 libsystem_pthread.dylib 0x1f1c1b3cc _pthread_mutex_firstfit_lock_wait + 84
2 libsystem_pthread.dylib 0x1f1c2250c _pthread_mutex_firstfit_lock_slow + 240
3 Foundation 0x18284d904 -[NSRecursiveLock lock] + 28
4 GRDB 0x1051fa2cc 0x105050000 + 1745612
5 GRDB 0x1050b5790 0x105050000 + 415632
6 libdispatch.dylib 0x180d01914 _dispatch_call_block_and_release + 32
7 libdispatch.dylib 0x180d03660 _dispatch_client_callout + 20
8 libdispatch.dylib 0x180d11b60 _dispatch_main_queue_callback_4CF + 944
9 CoreFoundation 0x181049cd4 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 16
10 CoreFoundation 0x181003eac __CFRunLoopRun + 2540
11 CoreFoundation 0x1810173b8 CFRunLoopRunSpecific + 600
12 GraphicsServices 0x19c9a738c GSEventRunModal + 164
13 UIKitCore 0x1839b76a8 -[UIApplication _run] + 1100
14 UIKitCore 0x1837367f4 UIApplicationMain + 2092
15 monday 0x103518460 0x102428000 + 17761376
16 dyld 0x104bbda24 start + 520
I was able to symbolicate it using atos, and it seems to be related to the value observer:
So I'm assuming this means:
The weird thing is that the crash log doesn't point to any of my code, only GRDB-specific code here.
Does this make sense to you? Did you ever bump into this? Any thoughts on what I could do to mitigate the issue, aside from closing and re-opening the database pool on background/foreground?
Thank you so much!
P.S. As you can see, this isn't the same crash as the original, but it seems similar (at least from the EXC_CRASH (SIGKILL) / EXC_CORPSE_NOTIFY exception angle). If I have any more thoughts on the original crash, I'll share those as well.
P.S.2 We also sometimes note these in the logs around the crash:
Message from debugger: Terminated due to signal 9
Background task still not ended after expiration handlers were called
Thanks for the update, @freak4pc
So it looks like a deadlock. Which GRDB version are you using?
Thanks, we're on 5.15
OK. ValueObserver.notifyChanges does indeed use an NSRecursiveLock, and this may be the one that is deadlocked. It may also be a lock owned by your app, but let's suppose it's the GRDB one.
I hope your crash log contains the stack trace of all of your application threads. Can you notice other threads that are busy running some GRDB method?
Thanks for your comment :)
I'm seeing a few more operations on Thread 18 (Dispatch queue: com.monday.database.writer):
Thread 18 name: Dispatch queue: com.monday.database.writer
Thread 18:
0 libswiftCore.dylib 0x185db180c tryCast(swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetMetadata<swift::InProcess> const*&, swift::TargetMetadata<swift::InProcess> const*&, bool, bool) + 2516
1 libswiftCore.dylib 0x185db180c tryCast(swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetMetadata<swift::InProcess> const*&, swift::TargetMetadata<swift::InProcess> const*&, bool, bool) + 2516
2 libswiftCore.dylib 0x185db1454 tryCast(swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetMetadata<swift::InProcess> const*&, swift::TargetMetadata<swift::InProcess> const*&, bool, bool) + 1564
3 libswiftCore.dylib 0x185db1454 tryCast(swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetMetadata<swift::InProcess> const*&, swift::TargetMetadata<swift::InProcess> const*&, bool, bool) + 1564
4 libswiftCore.dylib 0x185db0d2c swift_dynamicCastImpl(swift::OpaqueValue*, swift::OpaqueValue*, swift::TargetMetadata<swift::InProcess> const*, swift::TargetMetadata<swift::InProcess> const*, swift::DynamicCastFlags) + 72
5 libswiftFoundation.dylib 0x1856f2004 __JSONDecoder.container<A>(keyedBy:) + 540
6 libswiftFoundation.dylib 0x1856e62bc protocol witness for Decoder.container<A>(keyedBy:) in conformance __JSONDecoder + 24
7 libswiftCore.dylib 0x185d7c248 dispatch thunk of Decoder.container<A>(keyedBy:) + 32
8 monday 0x10254f2d4 0x102428000 + 1209044
9 monday 0x10254f7d8 0x102428000 + 1210328
10 libswiftCore.dylib 0x185d7c124 dispatch thunk of Decodable.init(from:) + 32
11 libswiftFoundation.dylib 0x18570b87c _JSONUnkeyedDecodingContainer.decode<A>(_:) + 3308
12 libswiftFoundation.dylib 0x18570cdf4 protocol witness for UnkeyedDecodingContainer.decode<A>(_:) in conformance _JSONUnkeyedDecodingContainer + 16
13 libswiftCore.dylib 0x185aa3984 Array<A>.init(from:) + 580
14 libswiftCore.dylib 0x185aa3a48 protocol witness for Decodable.init(from:) in conformance <A> [A] + 32
15 libswiftCore.dylib 0x185d7c124 dispatch thunk of Decodable.init(from:) + 32
16 libswiftFoundation.dylib 0x1856e8394 __JSONDecoder.unbox_(_:as:) + 1268
17 libswiftFoundation.dylib 0x1856d3fc4 JSONDecoder.decode<A>(_:from:) + 1548
18 libswiftFoundation.dylib 0x18570bfc4 dispatch thunk of JSONDecoder.decode<A>(_:from:) + 56
19 MondayPersistence 0x105c2eb10 0x105c24000 + 43792
20 MondayPersistence 0x105c2e644 0x105c24000 + 42564
21 monday 0x103aab214 0x102428000 + 23605780
22 monday 0x103aab5dc 0x102428000 + 23606748
23 MondayPersistence 0x105c2ee9c 0x105c24000 + 44700
24 monday 0x103aab6a8 0x102428000 + 23606952
25 GRDB 0x1050ead60 0x105050000 + 634208
26 GRDB 0x1050ec458 0x105050000 + 640088
27 GRDB 0x10506dcfc 0x105050000 + 122108
28 GRDB 0x1050eb2d0 0x105050000 + 635600
29 GRDB 0x1050eb634 0x105050000 + 636468
30 GRDB 0x1050ec158 0x105050000 + 639320
31 monday 0x102e36820 0x102428000 + 10545184
32 monday 0x102e368fc 0x102428000 + 10545404
33 monday 0x102e36944 0x102428000 + 10545476
34 monday 0x102e369e4 0x102428000 + 10545636
35 GRDB 0x1050ea884 0x105050000 + 632964
36 GRDB 0x1050ea920 0x105050000 + 633120
37 GRDB 0x1051faf0c 0x105050000 + 1748748
38 GRDB 0x1051f9810 0x105050000 + 1742864
39 GRDB 0x1051fae70 0x105050000 + 1748592
40 GRDB 0x105077dc8 0x105050000 + 163272
41 GRDB 0x1051fa04c 0x105050000 + 1744972
42 GRDB 0x1051f9994 0x105050000 + 1743252
43 GRDB 0x1051fae4c 0x105050000 + 1748556
44 GRDB 0x1051f0f00 0x105050000 + 1707776
45 GRDB 0x105077a08 0x105050000 + 162312
46 GRDB 0x10507b118 0x105050000 + 176408
47 GRDB 0x105081388 0x105050000 + 201608
48 GRDB 0x10507a57c 0x105050000 + 173436
49 GRDB 0x10507b304 0x105050000 + 176900
50 GRDB 0x10507afdc 0x105050000 + 176092
51 GRDB 0x1051f9038 0x105050000 + 1740856
52 GRDB 0x1050b8e00 0x105050000 + 429568
53 GRDB 0x10516cb54 0x105050000 + 1166164
54 GRDB 0x1050b5790 0x105050000 + 415632
55 libdispatch.dylib 0x180d01914 _dispatch_call_block_and_release + 32
56 libdispatch.dylib 0x180d03660 _dispatch_client_callout + 20
57 libdispatch.dylib 0x180d0ade4 _dispatch_lane_serial_drain + 672
58 libdispatch.dylib 0x180d0b958 _dispatch_lane_invoke + 392
59 libdispatch.dylib 0x180d161a8 _dispatch_workloop_worker_thread + 656
60 libsystem_pthread.dylib 0x1f1c1a0f4 _pthread_wqthread + 288
61 libsystem_pthread.dylib 0x1f1c19e94 start_wqthread + 8
I'll try symbolicating these and add it to this thread.
Here's the full symbolicated log:
2022-01-2022-29-24.6190+0200-4788e7062cae210b2b52bc1e4f77d4dc2dafcec8.txt
I see that some of our code is interleaved with the GRDB-related code, which makes me think it's some conflict we're not aware of?
UsersProvider.users(sql:) looks like this (database.rx.observe is a simple RxSwift wrapper that uses ValueObservation):
func users(sql: String? = nil) -> Observable<[UserModel]> {
    database.rx.observe { db in
        var query = UserModel
            .filter(UserModel.Column.isDisabled == false)

        if let sql = sql, !sql.isEmpty {
            query = query.filter(sql: sql)
        }

        return try query
            .order(UserModel.Column.name.asc)
            .fetchAll(db)
    }
}
Sorry for another separate reply here; I just wanted to share another crash report where GRDB isn't on the main thread, but it's still in that same "writer queue" with monday frames in between some GRDB operations.
The similarity is there, and I'm wondering if it's somehow related. Perhaps I'm doing something I'm not supposed to, even though I'm always performing work inside transactions, etc.
Here it is: crash2.txt
Focusing on the first crash log, I don't understand why the main thread appears locked as it tries to notify a database change to your app. The second thread that involves GRDB is starting a new observation (and is not waiting).
May I ask how your application is using value observations? How frequently are they started? What is your app doing when fresh database values are notified? Does it happen that observations feed themselves? For example, are observations started from change notifications? Is the database modified as a result of change notifications? Anything that looks "fancy" in any way (please have a low threshold for "fancy")?
We are using them pretty frequently in some places in the app - about 13 instances throughout the app.
Since the observations are wrapped in an Observable, it might be a bit tricky to track down which consumers might be writing to the database, but those are (or should be) still isolated.
Two big things that I think could be related: asyncWrite, and a different area observing them. Still, I don't think this is too fancy... Is there anything invalid about doing a separate write transaction inside a value observation? I don't think we have this case, but you did mention it as an option.
Nothing really rings any bell to me.
Is there anything invalid about doing a separate write transaction inside a value observation?
No, it is not invalid. I was looking for anything that could have been overlooked.
Is there any additional information I could try to gather to debug this? I'm happy to try anything... I could share all 13 ValueObservations in the app, but I'm not sure it'd actually help :)
I really don't know, @freak4pc... You're in the best position for reducing and reproducing, I'm afraid. I'd love to help, but I'm no wizard.
Sure, I agree, but the crash in the original message does seem to revolve around locking inside GRDB, so I thought you might have some thoughts (of course I didn't think you were a wizard ;-) ). I'll keep poking at this. Thanks.
Sure. I'll leave the issue open for a while.
Two updates on this:
We've started filtering emissions from ValueObservations while the app is in the background - this seems to help with the crashes (we'll monitor).
We also noticed the app complains about a UIBackgroundTask not being ended within the expiration threshold (Background task still not ended after expiration handlers were called). The relevant background task is most likely here. Could you share a bit about what releaseMemory does? Is there anything that could cause it to halt? It might help me debug further on our end:
https://github.com/groue/GRDB.swift/blob/master/GRDB/Core/DatabasePool.swift#L214-L224
Thanks!
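For reference, the background filtering mentioned above could look roughly like this - a sketch only, not necessarily what we ship; `database.rx.observe` is our wrapper from earlier, and checking `applicationState` in a filter is just one way to do it:

```swift
import UIKit
import RxSwift

// Sketch: suppress value observation emissions while the app is
// backgrounded. `database.rx.observe` is the app's own RxSwift
// wrapper around GRDB's ValueObservation, not GRDB API.
func users(sql: String? = nil) -> Observable<[UserModel]> {
    database.rx.observe { db in
        try UserModel.fetchAll(db)
    }
    .observe(on: MainScheduler.instance)
    .filter { _ in
        // Drop values delivered while the app is in the background.
        UIApplication.shared.applicationState != .background
    }
}
```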
Could you share a bit what releaseMemory does?
Sure:
Is there anything that could cause it to halt? Might help me further debug on our end?
DatabasePool.releaseMemory calls Pool.barrier, and waits until all current reads are completed (so that it can close all reader connections).
More precisely, it waits until no one is waiting for a read access. It then closes all reader connections.
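That barrier behavior can be illustrated with a plain Dispatch sketch (this is not GRDB's actual implementation, just the same semantics expressed with a concurrent DispatchQueue and a barrier block):

```swift
import Dispatch
import Foundation

// Illustration: on a concurrent queue, a barrier block waits until all
// previously submitted blocks ("reads") complete, then runs alone.
let queue = DispatchQueue(label: "pool", attributes: .concurrent)
let lock = NSLock()
var events: [String] = []

func record(_ e: String) {
    lock.lock(); events.append(e); lock.unlock()
}

let group = DispatchGroup()
for i in 1...3 {
    queue.async(group: group) { record("read \(i)") }
}
// The barrier runs only once the three reads above have finished,
// which is analogous to releaseMemory waiting out current reads.
queue.async(group: group, flags: .barrier) {
    record("releaseMemory")
}
group.wait()
print(events.last!)
```

If a "read" never returns (or new reads keep the queue busy), the barrier never runs, which is the kind of halt that could keep a background task from ending.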
I'm not sure it is useful to keep this issue open, since no progress is being made. Let's wait for more actionable input!
What did you do?
We noticed a few unexplained crashes on App Store Connect that seem to happen while the app is in the background. The crashing thread isn't pointing at GRDB specifically, but GRDB-related code can be seen in Thread 2 of the crash log (attached), and since we added it relatively recently, it makes some sense to suspect it.
crash_report.txt (Some of the symbols have been starred-out for obfuscation)
The part with upsertEntity in the stack trace has code that looks similar to the following. And upsertEntity itself looks like this:
What did you expect to happen?
No crashes while the user is in the background.
What happened instead?
A background crash that isn't necessarily directly related to GRDB, but does have it "near the surface" and close to the crashing thread. I've also specifically wanted to share some of the code above to see if there's anything "fishy" that pops to your eyes that could cause it.
I did suspect Data Protection, but even if that was the case, I'd expect it to just throw an error (caught by the asyncWrite) and not crash the app.
Environment
GRDB flavor(s): GRDB
GRDB version: 5.10
Installation method: CocoaPods
Xcode version: 13.0
Swift version: 5.5
Platform(s) running GRDB: iOS
macOS version running Xcode: 11.6
Demo Project
Unfortunately, I can't share an actual project and also can't directly repro this. I've shared the crash log and relevant code portions above, which I hope can help a bit with figuring this out.