swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0

Distributed actors: Remote call crash with a function having generic protocol with associated type #74769

Open akbashev opened 1 week ago

akbashev commented 1 week ago

Description

This is a copy of an issue from the swift-distributed-actors repo: https://github.com/apple/swift-distributed-actors/issues/1156

I've just realised that this could be a Swift runtime issue.

This is very specific, but still a bit surprising. When a distributed function is generic over a protocol with an associated type and also takes that associated type as a parameter (see the reproduction for an example), the remote call crashes the cluster with the error:

Thread 1: Fatal error: Error raised at top level: DistributedCluster.GenericRemoteCallError(message: "Remote call error of [ExecuteDistributedTargetError] type occurred")
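
The "Error raised at top level" wording means the call threw and nothing caught the error before it left top-level code. As a minimal sketch, the error could be caught and inspected instead. `StandInRemoteCallError`, `failingRemoteCall`, and `describeOutcome` are hypothetical stand-ins for illustration and are not part of DistributedCluster:

```swift
// Hypothetical stand-in for an error thrown by a remote call;
// this is not the real DistributedCluster.GenericRemoteCallError.
struct StandInRemoteCallError: Error {
  let message: String
}

// Hypothetical stand-in for a remote call that always fails.
func failingRemoteCall() async throws -> String {
  throw StandInRemoteCallError(message: "Remote call error occurred")
}

// Returns the call's result, or a description of the thrown error,
// so the failure never propagates out of top-level code.
func describeOutcome() async -> String {
  do {
    return try await failingRemoteCall()
  } catch let error as StandInRemoteCallError {
    return "caught: \(error.message)"
  } catch {
    return "caught unexpected error: \(error)"
  }
}

print(await describeOutcome())
```

Catching the error this way only makes the failure easier to inspect; it does not address why the remote call fails.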

Reproduction

You need the ClusterSystem package (swift-distributed-actors).

import Distributed
import DistributedCluster

typealias DefaultDistributedActorSystem = ClusterSystem

protocol SomeProtocol: Codable {
  associatedtype Message: Codable
}

struct Implementation: SomeProtocol {
  typealias Message = String
}

distributed actor WorkerActor {

  static var key: DistributedReception.Key<WorkerActor> { "worker_actor" }

  distributed func getSome<S: SomeProtocol>(_ protocol: S, message: S.Message) -> String {
    "help"
  }

  init(actorSystem: ClusterSystem) async {
    self.actorSystem = actorSystem
    await self.actorSystem.receptionist.checkIn(self, with: WorkerActor.key)
  }
}

distributed actor ReceptionistActor {

  var actors: [WorkerActor] = []

  func listen() {
    Task {
      for await actor in await actorSystem.receptionist.listing(of: WorkerActor.key) {
        self.actors.append(actor)
      }
    }
  }

  distributed func getLatest() -> WorkerActor? {
    self.actors.last
  }

  init(actorSystem: ActorSystem) async {
    self.actorSystem = actorSystem
    self.listen()
  }
}

let receptionist = await ClusterSystem("receptionist")
let someActorsNode = await ClusterSystem("some_actors") { settings in
  settings.bindPort = 1111
}

receptionist.cluster.join(endpoint: someActorsNode.settings.endpoint)
try await receptionist.cluster.joined(node: someActorsNode.cluster.node, within: .seconds(10))

let receptionistActor = await ReceptionistActor(actorSystem: receptionist)
let worker = await WorkerActor(actorSystem: someActorsNode)
// Declared before first use so that both calls below can share it
let impl = Implementation()
// The local actor call works
try await print(worker.getSome(impl, message: "hello"))
// Wait a bit for the receptionist
try await Task.sleep(for: .seconds(3))
let remoteActor = try await receptionistActor.getLatest()
// The remote actor call crashes
try await print(remoteActor?.getSome(impl, message: "hello") ?? "")
print("done")

Stack dump

* thread #1, queue = 'com.apple.main-thread', stop reason = Fatal error: Error raised at top level: DistributedCluster.GenericRemoteCallError(message: "Remote call error of [ExecuteDistributedTargetError] type occurred")
  * frame #0: 0x00000001b02b1890 libswiftCore.dylib`_swift_runtime_on_report
    frame #1: 0x00000001b0370b10 libswiftCore.dylib`_swift_stdlib_reportFatalErrorInFile + 208
    frame #2: 0x00000001aff42d94 libswiftCore.dylib`closure #1 (Swift.UnsafeBufferPointer<Swift.UInt8>) -> () in closure #1 (Swift.UnsafeBufferPointer<Swift.UInt8>) -> () in Swift._assertionFailure(_: Swift.StaticString, _: Swift.String, file: Swift.StaticString, line: Swift.UInt, flags: Swift.UInt32) -> Swift.Never + 104
    frame #3: 0x00000001aff41ed8 libswiftCore.dylib`Swift._assertionFailure(_: Swift.StaticString, _: Swift.String, file: Swift.StaticString, line: Swift.UInt, flags: Swift.UInt32) -> Swift.Never + 260
    frame #4: 0x00000001affe3628 libswiftCore.dylib`swift_errorInMain + 636
    frame #5: 0x000000010000cd38 WorkingPoolTest`async_MainTY19_ at <compiler-generated>:0
    frame #6: 0x000000010000cdc0 WorkingPoolTest`thunk for @escaping @convention(thin) @async () -> () at <compiler-generated>:0
    frame #7: 0x000000010000cee8 WorkingPoolTest`partial apply for thunk for @escaping @convention(thin) @async () -> () at <compiler-generated>:0

Expected behavior

A remote actor call whose signature uses a protocol's associated type should behave the same as the local call.

Environment

swift-driver version: 1.90.11.1 Apple Swift version 5.10 (swiftlang-5.10.0.13 clang-1500.3.9.4) Target: arm64-apple-macosx14.0

Additional information

Note: everything works again if you change the associated type into a separate protocol, like:

protocol SomeProtocol: Codable {}
protocol Message: Codable {}
extension String: Message {}

and update the function accordingly:

distributed func getSome<S: SomeProtocol, M: Message>(_ protocol: S, message: M) -> String {
  "help"
}
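
To make the difference between the two signatures concrete, here is a plain (non-distributed) sketch using only the standard library. All names in it (`WithAssociatedType`, `PlainProtocol`, `StandaloneMessage`, `getSomeDependent`, `getSomeIndependent`) are hypothetical stand-ins for the types in the reproduction. It only shows how the signatures differ and says nothing about where the remote call actually fails:

```swift
// Original shape: the message type is a dependent member type, S.Message.
protocol WithAssociatedType: Codable {
  associatedtype Message: Codable
}

struct Implementation: WithAssociatedType {
  typealias Message = String
}

func getSomeDependent<S: WithAssociatedType>(_ value: S, message: S.Message) -> String {
  // Reports the concrete types the generic signature resolved to.
  "\(S.self) / \(S.Message.self)"
}

// Workaround shape: two independent generic parameters.
protocol PlainProtocol: Codable {}
protocol StandaloneMessage: Codable {}
extension String: StandaloneMessage {}
extension Int: StandaloneMessage {}

struct PlainImplementation: PlainProtocol {}

func getSomeIndependent<S: PlainProtocol, M: StandaloneMessage>(_ value: S, message: M) -> String {
  "\(S.self) / \(M.self)"
}

// The dependent form only accepts a String message for Implementation.
print(getSomeDependent(Implementation(), message: "hello"))

// The independent form accepts any conforming message type.
print(getSomeIndependent(PlainImplementation(), message: "hello"))
print(getSomeIndependent(PlainImplementation(), message: 42))
```

One trade-off of the workaround: the compiler no longer ties the message type to the implementation, so any conforming message type is accepted for any `S`.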

ktoso commented 1 week ago

Thank you for the report! It is definitely possible we don't handle this well.