kabiroberai / node-swift

Create Node modules in Swift
MIT License
416 stars 11 forks source link

Improve concurrency support #7

Closed kabiroberai closed 2 years ago

kabiroberai commented 2 years ago

All NodeSwift APIs are currently thread-unsafe by default, except for NodeAsyncQueue. This means that NodeSwift doesn't operate very well with Swift 5.5 Concurrency (notably, NodeError is incorrectly marked @unchecked Sendable for now since otherwise it fails to compile; see #2).

We should utilize actors and potentially custom executors (once those are available) to make NodeSwift work well with Swift Concurrency. One potential solution could be to turn all NodeValue classes into actors, though we'd have to refactor the NodeObject inheritance chain for this, and also we'd have to figure out how to make the actor hop onto the required NodeAsyncQueue, which is where custom executors might come into play. Alternatively, all APIs could be made async, and internally hop onto a NodeAsyncQueue if needed.

kabiroberai commented 2 years ago

Something else I just realized: if we make all of our methods async, it'll be challenging to declare synchronous NodeFunctions, constructors, etc, since we'd need a way to guarantee that we won't hop threads midway. There are two solutions I can think of:

  1. Have sync and async overloads for every API; Swift's overload resolution will always pick the async variant in an async context and the sync variant in a non-async context. The async variants would work by hopping onto a task-local NodeAsyncQueue, making it safe to call them from any thread. HOWEVER, the sync variants would not have this affordance (since they are synchronous, after all), so it would be illegal to call them inside an async method, potentially creating problems:
func someFunc(obj: NodeObject) async throws -> NodeValueConvertible {
  await Task.sleep(nanoseconds: 1)
  await obj.blah() // okay, dispatches onto task's NodeAsyncQueue

  await Task.sleep(nanoseconds: 1)
  { obj.blah() }() // bad, can't hop onto queue!

  await Task.sleep(nanoseconds: 1)
  Task.detached { await obj.blah() } // also bad, no task-local queue!
}
  1. Make all APIs async, and rely on optimizations to avoid switching threads whenever possible. Internally, this might look something like
@Atomic let res: NodeValueConvertible?
let task = Task { 
  let value = await userMethod()
  res = value
  return value
}
if let value = res {
  // the task was inlined and completed synchronously, return its result directly
  return value
} else {
  // the task hasn't completed yet; we have to return a Promise
  return NodePromise { future in Task { future(await task) } }
}

However, this also faces issues because sometimes we need to guarantee that a method will do its work synchronously, for example in a setter:

class Foo: NodeClass {
  static let properties: NodeClassPropertyList = [
    "x": NodeComputedProperty(get: getX, set: setX)
  ]
  private var _x = 0
  func getX() async throws -> NodeValueConvertible {
    await Task.sleep(nanoseconds: 1)
    return _x
  }
  func setX(_ args: NodeArguments) async throws {
    await Task.sleep(nanoseconds: 1) // we'd be fine without this, since setX could run in-line
    _x = try await args[0].as(Int.self)!
  }
}

Since JS setters can't be asynchronous, this example would create plenty of strange issues:

let foo = new Foo();
foo.x = 5; // can't await here
await foo.x; // ...would this be 0 or 5?
kabiroberai commented 2 years ago

There's also another option:

  1. Annotate all Node methods with a pseudo-global actor so that we don't need to await if we're in that actor's context.

This would allow the user to create "synchronous" methods that can still call Node APIs:

@NodeActor func setX(_ args: NodeArguments) throws {
  // await Task.sleep(nanoseconds: 1) // won't compile; we're in a sync method
  _x = try args[0].as(Int.self)! // no need to await since we're on @NodeActor
}

While also allowing async methods:

func someFunc(obj: NodeObject) async throws -> NodeValueConvertible {
  await Task.sleep(nanoseconds: 1)
  await obj.blah() // have to await since we're not on @NodeActor
}

I say "pseudo"-global since there might actually be more than one Node thread/worker, so there's no such thing as "the one" global Node actor. This means that running code on NodeActor would still have to look up the task's local NodeAsyncQueue and schedule the code on it, so something like Task.detached would still fail to work. However, as long as the user maintains the task hierarchy, this would work fine. It would also still be possible to hop onto a different Node worker, by switching the task-local NodeAsyncQueue before doing so.

This is currently my favourite option, however I'm not sure if this is possible yet, because implementing @NodeActor might require custom executors.

kabiroberai commented 2 years ago

Managed to get option 3 working, see 11b81fd.