Closed nkbelov closed 5 months ago
Hello @nkbelov,
meaning that multi-step statements will be allowed to re-acquire the file lock even after the DB became suspended.
Yes indeed. I thought it was enough to make the check in the first step. If it were called on each step, that would probably have an impact on performance (I did not perform any benchmark, though).
If the app is executing a multi-step statement, it should have the time to finish it before the app transitions from the background to the suspended state. If it does not, I'm not sure GRDB should be held responsible. Maybe such apps should perform iterations in batches, for example.
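To illustrate the batching idea, here is a minimal sketch in plain Swift. Nothing in it is GRDB API: `processInBatches`, `isSuspended`, and `writeBatch` are hypothetical names, and `isSuspended` stands in for whatever signal the app uses to learn that suspension is imminent. The point is only that the loop can stop between batches, so that no single write runs for long.

```swift
// Hypothetical helper, not part of GRDB.
// Processes `items` in chunks of `batchSize` and returns how many items
// were handed to `writeBatch` before the loop finished or was stopped.
func processInBatches<T>(
    _ items: [T],
    batchSize: Int,
    isSuspended: () -> Bool,
    writeBatch: (ArraySlice<T>) throws -> Void
) rethrows -> Int {
    precondition(batchSize > 0, "batchSize must be positive")
    var processed = 0
    while processed < items.count {
        // Stop between batches, never in the middle of one.
        if isSuspended() { break }
        let end = min(processed + batchSize, items.count)
        try writeBatch(items[processed..<end])
        processed = end
    }
    return processed
}
```

In a real app, each `writeBatch` call would presumably wrap one short write transaction, and the caller would resume from the returned offset once the app is active again.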
[...] so it still looks like the DB is executing a write after a suspension.
The write lock was still held when the app turned suspended, yes. This might mean that a write is happening. But maybe a write transaction is still open. Maybe a statement was not properly released.
I'm sorry but I do not live in your app, so I can't tell what's happening and I won't make uneducated guesses. Sharing SQLite databases on iOS is hard. I recommend that you audit each and every one of the writes performed by the app and build a clear understanding of how each could relate to suspension.
Also maybe check that the suspension notification is sent when the app is entering the background. This is a UIApplication concept, pre-SwiftUI. Suspension and resuming is some code I would expect to see in a UIApplicationDelegate. You can put it where you want, but make sure you aren't playing games with the OS: it bites.
@nkbelov, some ideas came to me reading your stack trace:
I recommend that you audit each and every one of the writes performed by the app
Well, the guilty write is just in front of our eyes 😅
At first I assumed that the database was indeed suspended. I was surprised that DatabasePool.writeInTransaction did not throw an error.
Then I realized it won't do so by default. I thus recommend that you add this line in your application, before you open the DatabasePool:
+configuration.defaultTransactionKind = .immediate
let dbPool = try DatabasePool(path: path, configuration: configuration)
It was recently added to the recommendations for shared databases, because it has other benefits (see #1485). The next GRDB7 version will perform this configuration automatically.
Later on I saw that the INSERT statement performed by upsert was not caught by checkForSuspensionViolation(from:). I'm thus pretty sure that the database was not suspended when the crash occurred.
You should be able to test my advice and assumptions: suspend, and check what happens:
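func test_database_suspension_prevents_some_operation() {
    let xxxxx = XXXXX()
    // Simulate suspension by posting the notification that GRDB observes.
    NotificationCenter.default.post(name: Database.suspendNotification, object: self)
    // XCTAssertThrowsError is the XCTest assertion for throwing expressions.
    XCTAssertThrowsError(try xxxxx.setUpDatabaseStream())
}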
Note to myself: posting notifications makes it impossible to run tests in parallel. GRDB should publicly expose the suspend() and resume() methods, so that one can write tests of database suspension without notifications.
Thanks @groue. By all means, the error is very likely in our user code and I've been suspecting that we're incorrectly sending the suspend notifications; I've just noticed this lack of checkForSuspensionViolation for multistep operations while looking through the GRDB source to see what guarantees this flag exactly gives, esp. w.r.t. already enqueued statements.
I'm not intricately familiar with the inner machinery of SQLite, but it might actually turn out that this sole check is enough, since if a multistep statement is being tracked as per here, then sqlite3_interrupt will interrupt it as well, and all statements that were issued before the DB was suspended will correctly cancel.
If not, however, I wonder if this is an opportunity to enhance this a bit (regardless of the particularities of my specific case), since this situation looks to me akin to a Task never checking for cancellation (except for the one time at the beginning), and it seems that the majority of 0xdead10cc triggers arise when code fails to shut itself down on time and continues to (re-)claim resources. Perhaps using an atomic boolean for the isSuspended flag could address the potential performance impact, too.
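To make the Task analogy concrete, here is a small sketch of what a per-step check could look like. It is not GRDB's implementation: `SuspensionFlag`, `SuspendedError`, and `runSteps` are hypothetical names, and an `NSLock`-protected boolean stands in for the atomic boolean, since plain Foundation has no portable atomic type.

```swift
import Foundation

// Hypothetical thread-safe flag; a real atomic boolean would be cheaper.
final class SuspensionFlag {
    private let lock = NSLock()
    private var suspended = false

    var isSuspended: Bool {
        lock.lock(); defer { lock.unlock() }
        return suspended
    }

    func set(_ value: Bool) {
        lock.lock(); defer { lock.unlock() }
        suspended = value
    }
}

struct SuspendedError: Error {}

// Runs `stepCount` steps and checks the flag before every step,
// much like a Task that checks for cancellation on each iteration.
func runSteps(_ stepCount: Int, flag: SuspensionFlag, step: (Int) -> Void) throws {
    for index in 0..<stepCount {
        if flag.isSuspended { throw SuspendedError() }
        step(index)
    }
}
```

Whether the cost of such a check on every step is acceptable has not been measured in this thread, so this only shows the shape of the idea.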
I will be trying out the recommendations, thanks a lot. The code in question was indeed written before the note on .immediate was published.
I've just noticed this lack of checkForSuspensionViolation for multistep operations while looking through the GRDB source to see what guarantees this flag exactly gives, [...]
It is indeed called once before the first step (screenshot).
We can say that the 0xdead10cc prevention mechanism of GRDB makes optimistic assumptions today.
Last time I checked, the system was giving a lot of time between the notification of the expiration of the background state and the actual suspension. I wanted to use this delay to let the app successfully commit a short write. That's why we wait for the app to start a "forbidden" statement before throwing an error. Currently those are statements that acquire the write lock, not read-only statements or statements that release the lock.
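The rule described above can be modelled in a few lines. This is a simplified, hypothetical model rather than GRDB's actual types: `StatementKind`, `SuspensionViolation`, and `checkBeforeFirstStep` are names invented for the illustration, and the example SQL statements in the comments are only typical cases.

```swift
// Hypothetical classification of statements with respect to the write lock.
enum StatementKind {
    case readOnly           // e.g. SELECT
    case releasesLock       // e.g. COMMIT, ROLLBACK
    case acquiresWriteLock  // e.g. INSERT, UPDATE, BEGIN IMMEDIATE
}

struct SuspensionViolation: Error {}

// Called once, before the first step of a statement.
// Only statements that would acquire the write lock are rejected,
// so that a short write already in progress can still be committed.
func checkBeforeFirstStep(_ kind: StatementKind, isSuspended: Bool) throws {
    guard isSuspended else { return }
    switch kind {
    case .readOnly, .releasesLock:
        return
    case .acquiresWriteLock:
        throw SuspensionViolation()
    }
}
```

With this model, a COMMIT issued after suspension still goes through, while a fresh INSERT throws.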
The comparison with a Task that does not perform all possible checks for cancellation is probably accurate. For example, here is a bad scenario that is not detected:
[...] checkForSuspensionViolation, so the app keeps on running.
If the db had been suspended during the long iteration, the fetch would have failed thanks to sqlite3_interrupt.
So probably checkForSuspensionViolation could be improved so that the first step of a read-only statement is interrupted when we know for sure that the write lock has been acquired. There is no C function to query this state AFAIK, but the first successful step of a statement that returns false from sqlite3_stmt_readonly should be enough to detect the transition 🤞.
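A rough sketch of that detection, as pure bookkeeping and without touching SQLite: `WriteLockTracker` and its members are hypothetical names, and in real code `isReadOnly` would be derived from sqlite3_stmt_readonly. The tracker is deliberately conservative: it keeps assuming the lock is held until the caller reports that it was released.

```swift
// Hypothetical tracker, not GRDB code.
struct WriteLockTracker {
    // True once a non-read-only statement has stepped successfully.
    private(set) var mayHoldWriteLock = false

    // Call after each step of any statement.
    mutating func didStep(isReadOnly: Bool, succeeded: Bool) {
        if succeeded && !isReadOnly {
            mayHoldWriteLock = true
        }
    }

    // Call when the caller knows the lock is gone,
    // e.g. after a commit or a rollback.
    mutating func didReleaseLock() {
        mayHoldWriteLock = false
    }

    // A read-only statement would be interrupted only when the database
    // is suspended and the write lock may still be held.
    func shouldInterruptRead(isSuspended: Bool) -> Bool {
        isSuspended && mayHoldWriteLock
    }
}
```

This only covers the transition described above; other ways of acquiring or releasing the lock would need their own bookkeeping.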
If you can think of other nasty scenarios, I'm curious.
I'm closing this issue, because I'm pretty sure the app did not suspend the database, leading to the 0xdead10cc exception. Database suspension remains the task of the app, because despite all my attempts at doing it automatically, we eventually face the difficulty of determining when to resume (suspending is easy, it's resuming that's hard).
I'll remember to revisit the topic, though, as discussed above.
What did you do?
TestFlight version of an app crashed with 0xdead10cc; stacktrace:
What did you expect to happen?
The DB is configured with observesSuspensionNotifications, and the app is set up to send a notification in a SwiftUI handler:
Expected the statement to either abort through sqlite3_interrupt or not get executed (checkForSuspensionViolation).
Environment
GRDB flavor(s): Stock
GRDB version: 6.18.0
Installation method: SPM
Xcode version: 15.3
Swift version: 5.10
Platform(s) running GRDB: iOS
macOS version running Xcode: Sonoma 14.1.1 (23B81)
Upon inspection of the call path, checkForSuspensionViolation is only called upon the first step into a statement (if sqlite3_stmt_busy(sqliteStatement) == 0), meaning that multi-step statements will be allowed to re-acquire the file lock even after the DB became suspended. While it seems unlikely that upsert is a multi-step statement, we don't use FTS5, so it still looks like the DB is executing a write after a suspension.
The DB is shared between the main app and a notification service extension, which also sends Database.suspendNotification as appropriate.
Could you please have a look and see if e.g. checkForSuspensionViolation shouldn't be called more often?