Interrupted connections finish async sequences, addresses discussion #122

jakaplan commented 1 year ago

For a given XPCClient all active async sequences are terminated with an error if the connection with the server is interrupted. This is the correct behavior because no more responses can possibly be received for this given sequence (although a new requests could be made and might succeed if the server is successfully started once again).

jakaplan commented 1 year ago

@vincent-le-normand I'd appreciate if you could try out this branch and let me know if it addresses your feedback.

jakaplan commented 1 year ago

@vincent-le-normand For more context, I can't commit this as-is because it fails in exactly the way I was worried it would. But before I go a lot deeper and see if I can find a full solution, I wanted to understand if this is addressing your feedback.

vincent-le-normand commented 1 year ago

Hello,

Yes, this is exactly what I'd need. I test it a little bit more next week, because I got a crash once - but it may be my fault.

Thanks!

Le 18 nov. 2022 à 13:19, Josh Kaplan @.***> a écrit :

@vincent-le-normand For more context, I can't commit this as-is because it fails in exactly the way I was worried it would. But before I go a lot deeper and see if I can find a full solution, I wanted to understand if this is addressing your feedback.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.

jakaplan commented 1 year ago

@vincent-le-normand I was able to resolve the race conditions. The resulting API changes are a breaking change, so it may be a bit until I make a new official release with this. However, before any of that is relevant I'll await your further testing this upcoming week.

vincent-le-normand commented 1 year ago

It seems to work, except for one thing. When creating multiple anonymous connection (something we do in unit tests), we are receiving disconnection inappropriately. Here is a small test that is throwing (but shouldn't I think).

    func testInterruptions() async throws {
        for maxValue in 100 ..< 1000 {
            let route = XPCRoute.named("sequentialRoute").withMessageType(Int.self).withSequentialReplyType(Int.self)
            let server = XPCServer.makeAnonymous()
            let endpoint = server.endpoint
            let client = XPCClient.forEndpoint(endpoint)

            server.registerRoute(route) { maxValue, resultProvider in
                do {
                    for value in 0 ..< maxValue {
                        try await Task.sleep(nanoseconds: 1_000)
                        try await resultProvider.success(value: value)
                    }
                }
                catch {
                    try? await resultProvider.failure(error: error)
                }
            }

            server.start()

            let asyncStream = client.sendMessage(maxValue, to: route)
            _ = try await asyncStream.first { _ in false }
        }
    }

jakaplan commented 1 year ago

@vincent-le-normand I looked at this briefly and can replicate this. You may be correct that there's a subtle race condition here which I'll need to investigate more. However, the code you've written maximizes the chance of this due to never finishing a sequence. If I add: try await resultProvider.finished() after:

for value in 0 ..< maxValue {
    try await Task.sleep(nanoseconds: 1_000)
    try await resultProvider.success(value: value)
}

then the test always passes for me. This is not to say a bug does not exist within the SecureXPC package.

I don't expect to have more time to investigate this week as I'm on holiday in New Caledonia, but should have time next week to look into this further.

vincent-le-normand commented 1 year ago

Thanks for the reply.

You're right, it's much better when adding .finished after the for loop.

According to what I understand of the code, the sequence should be automatically finished when the resultProvider is dented. So the race condition only happens in this case - at least for this particular example.

No rush here. It's really appreciated that you take a look to this feature.

I'll probably have a couple of other feature i would like to add - ill make PR for them.

One is related to testing this interrupted server case. It would be nice to be able to programatically interrupt a XPCServer.

Another one is not directly related, but it would be useful to bu able to be notified globally when interruption handler is called. For instance, you can adapt the user interface if a daemon crashes or is unloaded by new ventura settings.

Enjoy your vacations.

Vincent

Le 23 nov. 2022 à 13:19, Josh Kaplan @.***> a écrit :

@vincent-le-normand I looked at this briefly and can replicate this. You may be correct that there's a subtle race condition here which I'll need to investigate more. However, the code you've written maximizes the chance of this due to never finishing a sequence. If I add: try await resultProvider.finished() after:

for value in 0 ..< maxValue { try await Task.sleep(nanoseconds: 1_000) try await resultProvider.success(value: value) }

then the test always passes for me. This is not to say a bug does not exist within the SecureXPC package.

I don't expect to have more time to investigate this week as I'm on holiday in New Caledonia, but should have time next week to look into this further.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.

jakaplan commented 1 year ago

According to what I understand of the code, the sequence should be automatically finished when the resultProvider is dented. So the race condition only happens in this case - at least for this particular example.

Ah, thank you for this comment - this made me realize the issue.

These are not equivalent operations. When the result provider is deinitialized a message is sent from the server to the client to finish the sequence, but no ordering is guaranteed server side as nothing is awaited. Whereas calling try await resultProvider.finished() means that the server code is waiting for client confirmation before proceeding.

Essentially what's happening is that most of the time the sequence is being finished via deinit before the connection terminates, but not always - as neither ARC nor XPC make any guarantees in regards to timing. When the sequence isn't "automatically" finished prior to the connection terminating (from the perspective of the client) then the sequence is interrupted. It always was being interrupted, but without this PR the sequence would not be notified of this error.

I'm back from my holiday now and should have sometime in the next few days to look into this more and hopefully figure out a workable API design. My initial thoughts right now are that it will involve removing the deinit behavior and requiring explicit finishing by the API user.

jakaplan commented 1 year ago

@vincent-le-normand I believe I've found and resolved the issue. With the latest changes, I can run your unit test without issue. Let me know if anything continues to not behave as expected.

(What I described above was correct in describing the problem, but it turns out there was actually a much more straightforward solution than I had anticipated - there was actually a bug in terms of how deinit's finishing of a sequence behaved.)

vincent-le-normand commented 1 year ago

Everything seems to work as I would expect so far. Thanks a lot!

trilemma-dev / SecureXPC

Interrupted connections finish async sequences, addresses discussion #122 #123