steamclock / netable

A Swift library for encapsulating network APIs using Codable in a type-oriented way.
MIT License
99 stars 3 forks source link

Rework networking to use async/await #99

Closed brendanlensink closed 1 year ago

brendanlensink commented 2 years ago

For #77

There's still some tinkering I'd like to do, but this seemed like a good spot to solicit some feedback before going any further. I think there's still a fair amount of improving to do making sure things are thread safe (especially returning to the main thread), but that's a WIP. Also, documentation and examples could use a pass over them, whether it's in a separate PR or not.

This PR switches the fundamental networking in Netable to use async/await. I've left in support for networking with Combine or callbacks, but they're now wrappers on the async/await functionality instead of the other way around.

In doing this, I've had to rewrite or change some things like retries and cancellation, and would be most interested in having another pair of eyes look over those.

nbrooke commented 1 year ago

The changes look good, but the logOnMainThread called in cancelAllTasks got me thinking, and I think there might be a higher level thread safety issue here.

Originally I was wondering cancellAllTasks should actually be async as well and/or if it was thread safe with regards to being run concurrently with request. I think it probably SHOULD be async (though not sure it matters TOO much), but I think it is actually thread safe (assuming URLSession is fully thread safe, which it sounds like it is).

However, I think there is a subtle thread safety problem with the various configuration properties. I was originally thinking that request couldn't have any thread safety problems because it wasn't modifying any state (other than URLSession, which should be thread safe itself), but it is READING a bunch of state (baseURL, headers, etc.) that although it's easy to think of that as just being immutable, since it often is from a logical standpoint, those are actually vars and could be changed. I think that any code like the following would be unsafe:

Task {
    await netable.request(...)
}

netable.baseURL = newURL

That, to be clear, is not GOOD code, even if this isn't unsafe, it's always going to be ambiguous if the request will use the old or new URL, but I think it is actually unsafe, and could theoretically end up with the request using a mangled / partially updated configuration element (or at least could be for some types, not sure about URL specifically), rather than just using either the old or new URL.

I think to fix this, we need to EITHER make all of the configuration immutable (and ensure that all the configuration types themselves are thread safe as well, which I'm pretty sure they just are since are all primitives or simple structs), or we need to make Netable an actor, so that sets of the properties and running request will be on the same actor and thus sad to trigger concurrently.

Making them immutable would ALSO remove the possibility of writing bad code like the above (whereas wth an actor you still could, though would have to work a little harder to do it), and I think keeps the implementation of the callback and Combine request functions simpler (whereas on an actor, those would need to be nonisolated functions, which I believe would complicate their implementations considerably, or maybe not even work at all. So making that immutable seems like the way to go if we can make it work, we just need to evaluate how much disruption that's actually going to cause.

It also might be worth putting a bit of effort into figuring out WHY we don't seem to be getting any warnings about this. This seems like the sort of thing that the concurrency checking should have caught, and it doesn't look like we've done anything to claim that a code is thread safe when it isn't (like adding Sendable conformance to types that are not actually thread safe). I know that not everything is turned on there, and there are various compatibility hacks, so maybe we could be enabling some compiler flags that we aren't or something to start checking concurrency in a stricter way.

brendanlensink commented 1 year ago

@nbrooke I made the following changes:

I'm going to spend some more time investigating the concurrency checking as well, I'm also overdue for updating to Xcode 14 so some of that might come with that update.

nbrooke commented 1 year ago

I think requestFailureDelegate WOULD have the same problem I was imagining, even though it doesn't have any visible shared state, even swapping in a new instance both will cause a race condition, and could crash under the right circumstances, i.e.

netable.requestFailureDelegate = a

Task {
    // some netable function that calls 
    // requestFailureDelegate.requestDidFail(/*...*/)
}

netable.requestFailureDelegate = b

is both a race condition on whether the call will use a or b, and, under the right circumstances (but, I now think, not THESE exact circumstances) could crash.

So I think it does make sense to make it immutable as well for an async world.

Rest of this comment probably not true, see below

HOWEVER, I did a bit more reading on Sendable and Task isolation, and I'm now back to thinking that the original version of things was maybe actually in practice okay from a thread safety standpoint, and I think I understand why we weren't getting any compiler errors

The key thing I was misunderstanding is that Netable is in fact NOT thread safe, but it doesn't matter, because the way we are using it in the sample (and probably in our apps as well) doesn't require it to be thread safe because everything is limited to a single execution context. I misunderstood what Task does, it creates a new async execution context, but crucially, does NOT actually create it in a concurrent way, all of the code in the previous examples I gave is still executing on the same thread because the task inherits all context from it's parent. I sort of knew that, but hadn't totally got that in this case it meant we were ALWAYS in the same execution context.

If we wrote the following (which does create a real concurrent context):

let netable = Netable(/*...*/)

Task.detached {
    await netable.request(...) // likely compile error here
}

we would be creating thread unsafe code, but here the compiler checking would kick in and (I believe would) get an error on the indicated line because Netable doesn't conform to "Sendable" so can't actually be sent to a genuinely concurrent context created by Task.detached. And if we did try and conform it to Sendable, we would get an bunch of warnings about any mutable state that was not protected.

It might eventually be useful to make Netable genuinely thread safe and Sendable, so we can potentially use it in parallel from different actors and so on, but it's probably better not to worry about that for now. I still think making everything immutable is a good idea, because it prevents the race condition in any code like the above.

The realization has two interesting follow ons:

To fix the second problem, I think we need to actually make Request conform to Sendable (to guarantee that the decode and finale functions can be called on a background thread), and probably run the decode from a Task.detached call, or maybe using async let. I'm having quite a bit of trouble figuring out how to actually structure that code to force it to run on a background thread, but suspend the main task while it is working.

nbrooke commented 1 year ago

I think the incantation for forcing the decode to get done on the background thread is something along the lines of replacing:

let decoded = try await request.decode(data, defaultDecodingStrategy: self.config.jsonDecodingStrategy)
let finalizedResult = try await request.finalize(raw: decoded)

with

let strategy = self.config.jsonDecodingStrategy // don't want to try and capture self, or it needs to be Sendable
let finalizedResult = Task.detached {
    let decoded = try await request.decode(data, defaultDecodingStrategy: strategy)
    return try await request.finalize(raw: decoded)
}.value

Like I said, I believe request will need to be Sendable for that to work.

Getting a bit worried there is still something I'm not understanding here though, because in trying to build a quick playground to verify the above, I also tried checking if code like the detached version I had before WOULD produce a compiler error, and I couldn't make it do that, so I feel like there still some wrinkle I'm missing.

brendanlensink commented 1 year ago

I feel like I'm definitely missing something here...

"I think this means that the @MainActor annotations in the current code are unnecessary. "

I don't think this is true, if I run the app with the following change it complains about UI updates happening outside the main thread:

    func getPosts() {
        Task { // @MainActor in OMITTED HERE //
            do {
                let posts = try await self.netable.request(GetPostsRequest())
                self.posts.send(posts)
            } catch {
                print(error)
            }
        }
    }
nbrooke commented 1 year ago

Yeah, that for sure points to something we are still missing here. The documentation seems to vaguely support the idea that should work, in particular the documentation for the task constructor being called there (https://developer.apple.com/documentation/swift/task/init(priority:operation:)-5ltye) specifically says inherits the priority and actor context of the caller which definitely sounds to me like those tasks maybe should be created on the main actor if the calling code is on the main thread.

Maybe when you are on the main thread but in a synchronous context the actor context is "nothing" rather than being "main actor". so those top level tasks that the app is creating ARE actually independent (If that's the case though, then it seems like were back to "this shouldn't be thread safe"), or maybe there is no actor context here because Netable isn't an actor.

Very confusing.

nbrooke commented 1 year ago

Okay, getting closer to understanding this, this thread here: https://developer.apple.com/forums/thread/710646 is really useful, someone else grappling with similar issues and a solid response from Apple developer relations.

My takeaway from that thread:

nbrooke commented 1 year ago

Semi-related, I'd missed before, but noticed while digging around in Netable trying some stuff out for this, but the global request failure delegate and subject are only updated when using the legacy functions, we should probably either make that work in the async version, or mark those as deprecated as well. If we do make them work, they might need some tweaks (delegate should probably be async, and need to think about that thread things are dispatching on for the publisher.

brendanlensink commented 1 year ago

WRT global request failure delegate/subjects, I think I'm a little more inclined to mark them as deprecated or just remove them. I think with our current architectures they make less sense than when we put them in, and atleast personally I haven't needed them any time recently.

nbrooke commented 1 year ago

Some kind of global failure channel seems like a pretty good fit for the ErrorService stuff in the mvvm architecture / utility belt library, so I don't think our newer approaches to architecture necessarily invalidate this.

Those global channels have always struct me as more potentially useful from a logging standpoint than from a behavioural standpoint (with the possible exception of automatic logout on any 401 errors) so I think it'd be tough for us to "need" them to make some architecture work. Question is, have we not been using them because it just didn't occur to us, and it would actually be useful, but not so useful to naturally force them into action, or is it actually NOT suitable for implementing the sort of things we thought it might be (global error logging, global error reporting to the user via popups/toasts/whatever, universal logout on 401s).

brendanlensink commented 1 year ago

Updated a bunch of things, including just trying out moving Netable to an actor? Seems like there was enough things cropping up that it seems worth trying, and maybe a little easier than expected?

brendanlensink commented 1 year ago

@nbrooke I took another pass through things with the Xcode 14 concurrency checking turned on and handled all but one of the warnings. As we suspected it's mostly just being explicit about what is/isn't sendable all over the place. Let me know what you think.

brendanlensink commented 1 year ago

image