groue / GRDB.swift

A toolkit for SQLite databases, with a focus on application development
MIT License

Long write operation locks reads in ValueObservation (depending on maximumReaderCount) #1647

Closed theMishania closed 1 month ago

theMishania commented 1 month ago

What did you do?

Set up a DatabasePool with a configuration whose maximumReaderCount is 2 (any number works).

Started a long write operation on the DatabasePool. Created two ValueObservations for reading. Cancelled them (or not) and started a third ValueObservation.

What did you expect to happen?

The third ValueObservation performs its read and publishes notifications about changes in the tracked region/records, because the pool's semaphore count eventually drops (as the other concurrent reads complete) and does not block reads forever.

What happened instead?

The third ValueObservation is blocked, because Pool.get() is locked by a semaphore (itemsSemaphore). This read will never execute while the long write is still in progress. Even if the previous ValueObservations were cancelled, the third ValueObservation never starts publishing anything. The stack trace shows that the thread is blocked in Pool.get(). (Screenshot: 2024-10-01 at 18:05:42.)

Environment

GRDB flavor(s): (GRDB, SQLCipher, Custom SQLite build?)
GRDB version: 6.16.0
Installation method: manual
Xcode version: 15.4
Swift version: 5.7
Platform(s) running GRDB: (iOS, macOS, watchOS?)
macOS version running Xcode: 14.6.1

Demo Project

    func test_WriteLocksReads() throws {
        var configuration = Configuration() // `var`: the configuration is mutated below
        configuration.maximumReaderCount = 2
        let databasePool = try DatabasePool(
            path: path,
            configuration: configuration
        )

        var writeSubscriptions: Set<AnyCancellable> = []
        var readSubscriptions: Set<AnyCancellable> = []

        // Long write: holds the writer connection for a very long time.
        databasePool.writePublisher { _ in
            sleep(100_000)
        }
        .sink(receiveCompletion: { _ in }, receiveValue: { _ in })
        .store(in: &writeSubscriptions)

        // First read: acquires one of the two reader connections.
        let firstReadExpectation = XCTestExpectation()
        ValueObservation.tracking { database in
            defer { firstReadExpectation.fulfill() }
            return try SomeRecord.fetchOne(database, key: "record id")
        }
        .publisher(in: databasePool)
        .sink(receiveCompletion: { _ in }, receiveValue: { _ in })
        .store(in: &readSubscriptions)

        wait(for: [firstReadExpectation], timeout: 1)

        // Second read: acquires the second (and last) reader connection.
        let secondReadExpectation = XCTestExpectation()
        ValueObservation.tracking { database in
            defer { secondReadExpectation.fulfill() }
            return try SomeRecord.fetchOne(database, key: "record id")
        }
        .publisher(in: databasePool)
        .sink(receiveCompletion: { _ in }, receiveValue: { _ in })
        .store(in: &readSubscriptions)

        wait(for: [secondReadExpectation], timeout: 1)

        readSubscriptions.removeAll() // can comment this line, doesn't change the problem

        // Third read: blocks in Pool.get(), because both readers are still held.
        let thirdReadExpectation = XCTestExpectation()
        ValueObservation.tracking { database in
            defer { thirdReadExpectation.fulfill() }
            return try SomeRecord.fetchOne(database, key: "record id")
        }
        .publisher(in: databasePool)
        .sink(receiveCompletion: { _ in }, receiveValue: { _ in })
        .store(in: &readSubscriptions)

        wait(for: [thirdReadExpectation], timeout: .infinity)
        XCTAssertTrue(true) // not reached until the write's sleep(100_000) is done
    }
groue commented 1 month ago

Hello @theMishania,

Thanks for the detailed report. Here is the explanation for the observed behavior:

When an observation is started from a DatabasePool, a reader connection is acquired from the reader pool, a read transaction (RT) is started, and the initial value is fetched and notified. But RT is not closed, and the reader connection is not put back in the pool yet. Instead, the observation waits for access to the writer connection, in order to check whether any modifications were performed in the writer since RT was started. If so, a new value is fetched from the writer and notified. Finally, the real observation can start from the writer. This behavior was introduced in #1350, in order to solve #937.
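A rough sketch of this startup sequence, in pseudocode (names such as `readerPool`, `writer`, and `notify` are illustrative, not GRDB's actual internals):

```swift
// Illustrative pseudocode only; the real implementation lives in GRDB's
// ValueObservation machinery and its internal Pool type.
func startObservation() throws {
    let reader = try readerPool.get()   // blocks when all readers are in use
    reader.beginReadTransaction()       // RT: the snapshot starts here
    notify(try fetchValue(reader))      // initial value

    // The reader is NOT put back yet. The observation first waits for the
    // writer connection, to detect writes committed after RT was started:
    writer.asyncAccess { writerDb in
        if writerDb.hasChanges(since: reader.snapshot) {
            notify(try! fetchValue(writerDb))  // refetch and notify
        }
        readerPool.put(reader)          // only now is the reader released
        startTracking(writerDb)         // the real observation begins
    }
}
```

With a long write holding the writer connection, every newly started observation holds a reader until that write completes, which is how the pool gets exhausted.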

In your test: the long write holds the writer connection. The first and second observations each acquire one of the two reader connections, fetch and notify their initial value, and then wait for the writer before releasing their reader. Since both readers are held, the third observation blocks in Pool.get() until the write completes.

theMishania commented 1 month ago

Hello @groue, thanks for the fast answer.

Oh, I see. It sounds like a cancelled subscription not putting its reader connection back in the pool is not expected behaviour, right? Or do you have plans for ValueObservation to put the reader connection back in the pool?

groue commented 1 month ago

All readers are put back once a write access could be established and observation started.

I don't think cancelling the subscription cancels this phase where the observation waits for the writer before putting back the reader. This could be an interesting optimization to add. Does your app depend on this?

theMishania commented 1 month ago

> Does your app depend on this?

If you're asking about cancelling the subscription: not much. I just don't find it clear enough that cancelling the subscription does not put the reader back. The subscription is cancelled, so there is no reason to wait for a write, imho.

If you're asking about the whole problem: very much yes. This behaviour makes it really hard to use long writes, ValueObservations, and other reads (especially synchronous ones that block the UI while a write is in progress) at the same time.

theMishania commented 1 month ago

As a workaround I set maximumReaderCount to .max, but that does not seem very safe to me.
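For reference, the workaround as a configuration fragment; the caveat is that an unbounded maximumReaderCount lets GRDB open an unbounded number of reader connections:

```swift
var configuration = Configuration()
// Workaround: make the reader pool effectively unbounded, so Pool.get()
// never blocks. The cost is a potentially unbounded number of SQLite
// connections, which is why this feels unsafe.
configuration.maximumReaderCount = .max
let databasePool = try DatabasePool(path: path, configuration: configuration)
```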

groue commented 1 month ago

> Does your app depend on this?
>
> If you're asking about cancelling the subscription: not much. I just don't find it clear enough that cancelling the subscription does not put the reader back. The subscription is cancelled, so there is no reason to wait for a write, imho.

It is not put back right on cancellation, to be precise. It is eventually put back, once the write access is acquired.

Putting the reader back right on cancellation would be an optimization; it does not happen unless it is explicitly decided on and implemented.

> If you're asking about the whole problem: very much yes. This behaviour makes it really hard to use long writes, ValueObservations, and other reads (especially synchronous ones that block the UI while a write is in progress) at the same time.

The behavior is only painful for apps that start a lot of observations at the same moment as a long write is performed.

This typically happens in apps that create an unbounded number of observations, such as one per item in a list. This is not performant, and can't be. Instead, the app should observe all items in the list in a single observation.
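The recommended pattern, sketched with a hypothetical `Item` record type and a hypothetical `updateList` function:

```swift
// One observation for the whole list, instead of one per item.
// `Item` and `updateList(with:)` are placeholders for illustration.
let observation = ValueObservation.tracking { db in
    try Item.fetchAll(db)
}
let cancellable = observation.start(
    in: databasePool,
    onError: { error in print("observation failed: \(error)") },
    onChange: { items in
        updateList(with: items) // refresh the whole list from one notification
    }
)
```

This way a single reader connection is used at observation startup, however many items the list contains.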

The default size of the pool is five. I have never met an app that needs to raise this number.

groue commented 1 month ago

No news is good news: I guess it was possible to avoid launching an unbounded number of observations. We can close this issue.