apple / swift-distributed-actors

Peer-to-peer cluster implementation for Swift Distributed Actors
https://apple.github.io/swift-distributed-actors/
Apache License 2.0

EXC_BAD_ACCESS when running on macOS 13 and Xcode 15b #1136

Open akbashev opened 11 months ago

akbashev commented 11 months ago

Description

Getting EXC_BAD_ACCESS when running on macOS 13 and Xcode 15b6 (15A5219j) specifically.

Runs perfectly on:
- macOS 13 + Xcode 14
- macOS 13 + Xcode 14 on Swift 5.9 DEV SNAPSHOT toolchain
- macOS 14 + Xcode 15b6 (15A5219j)

Steps to reproduce

Run any simple project, e.g. https://github.com/akbashev/WorkerPoolTest

Environment

macOS 13.4.1, Xcode 15.0 Beta 6 (15A5219j)

Backtrace

* thread #14, stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
    frame #0: 0x00000001003b90e4 WorkingPoolTest`_ActorRef.asAddressable.getter(self=DistributedCluster._ActorRef<τ_0_0> @ 0x000000017050da48) at Refs.swift:0
  * frame #1: 0x00000001002f5a68 WorkingPoolTest`DeathWatchImpl.isWatching(id=$s18DistributedCluster0B6SystemC7ActorIDVD @ 0x000000017050dbf0, self=DistributedCluster.DeathWatchImpl<DistributedCluster.ClusterShell.Message> @ 0x000000017050edf8) at _BehaviorDeathWatch.swift:204:112
    frame #2: 0x00000001002f51fc WorkingPoolTest`DeathWatchImpl.watch<τ_0_0>(watchee=DistributedCluster._ActorRef<DistributedCluster.Cluster.Event> @ 0x000000017050fba0, terminationMessage=nil, watcher=0x0000000102607740, file="/Users/jaleel/Library/Developer/Xcode/DerivedData/WorkerPoolTest-blwqflcbayodcthckaqqfuadcfvt/SourcePackages/checkouts/swift-distributed-actors/Sources/DistributedCluster/Cluster/ClusterShell.swift", line=393, self=DistributedCluster.DeathWatchImpl<DistributedCluster.ClusterShell.Message> @ 0x0000000102705e90) at _BehaviorDeathWatch.swift:161:17
    frame #3: 0x0000000100441d28 WorkingPoolTest`_ActorShell.watch<τ_0_0>(watchee=DistributedCluster._ActorRef<DistributedCluster.Cluster.Event> @ 0x000000017050fba0, terminationMessage=nil, file="/Users/jaleel/Library/Developer/Xcode/DerivedData/WorkerPoolTest-blwqflcbayodcthckaqqfuadcfvt/SourcePackages/checkouts/swift-distributed-actors/Sources/DistributedCluster/Cluster/ClusterShell.swift", line=393, self=0x0000000102607740) at _ActorShell.swift:654:25
    frame #4: 0x00000001000f2188 WorkingPoolTest`closure #1 in ClusterShell.bind(context=0x0000000102607740, self=0x000000010301a400) at ClusterShell.swift:393:25
    frame #5: 0x0000000100124e34 WorkingPoolTest`partial apply for closure #1 in ClusterShell.bind() at <compiler-generated>:0
    frame #6: 0x0000000100072414 WorkingPoolTest`start0 #1 <τ_0_0>(behavior=DistributedCluster._Behavior<DistributedCluster.ClusterShell.Message> @ 0x0000000170512388, depth=0, failAtDepth=128, context=0x0000000102607740) in _Behavior.start(context:) at Behaviors.swift:905:55
    frame #7: 0x000000010006f438 WorkingPoolTest`_Behavior.start(context=0x0000000102607740, self=DistributedCluster._Behavior<DistributedCluster.ClusterShell.Message> @ 0x00000001705129b8) at Behaviors.swift:923:20
    frame #8: 0x000000010041d774 WorkingPoolTest`Supervisor.interpretSupervised0(target=DistributedCluster._Behavior<DistributedCluster.ClusterShell.Message> @ 0x0000000170513bc8, context=0x0000000102607740, processingAction=start, nFoldFailureDepth=1, self=0x0000600000202d40) at Supervision.swift:467:35
    frame #9: 0x000000010041c224 WorkingPoolTest`Supervisor.interpretSupervised0(target=DistributedCluster._Behavior<DistributedCluster.ClusterShell.Message> @ 0x0000000170513c78, context=0x0000000102607740, processingAction=start, self=0x0000600000202d40) at Supervision.swift:451:18
    frame #10: 0x000000010041d48c WorkingPoolTest`Supervisor.startSupervised(target=DistributedCluster._Behavior<DistributedCluster.ClusterShell.Message> @ 0x0000000170514380, context=0x0000000102607740, self=0x0000600000202d40) at Supervision.swift:442:25
    frame #11: 0x000000010043a99c WorkingPoolTest`_ActorShell.interpretStart(self=0x0000000102607740) at _ActorShell.swift:499:43
    frame #12: 0x000000010043a074 WorkingPoolTest`_ActorShell.interpretSystemMessage(message=start, self=0x0000000102607740) at _ActorShell.swift:303:22
    frame #13: 0x00000001004582d4 WorkingPoolTest`_Mailbox.mailboxRun(shell=0x0000000102607740, self=0x000060000390d110) at _Mailbox.swift:378:43
    frame #14: 0x0000000100453ad8 WorkingPoolTest`_Mailbox.run(self=0x000060000390d110) at _Mailbox.swift:319:37
    frame #15: 0x00000001004566f4 WorkingPoolTest`implicit closure #4 in implicit closure #3 in _Mailbox.sendSystemMessage(self=0x000060000390d110) at _Mailbox.swift:215:44
    frame #16: 0x00000001000982c0 WorkingPoolTest`thunk for @escaping @callee_guaranteed () -> () at <compiler-generated>:0
    frame #17: 0x0000000100098c88 WorkingPoolTest`thunk for @escaping @callee_guaranteed () -> (@out ()) at <compiler-generated>:0
    frame #18: 0x00000001002b13b8 WorkingPoolTest`closure #1 in _FixedThreadPool.init(self=0x0000600001701a80, worker=0x0000600000c031b0) at _FixedThreadPool.swift:78:25
    frame #19: 0x00000001002b2510 WorkingPoolTest`closure #1 in _Thread.init(lock=0x000060000210cbe0, isRunning=(_storage = Swift.Bool.AtomicRepresentation @ 0x0000600000235310), f=0x00000001002b1920 WorkingPoolTest`partial apply forwarder for closure #1 () -> () in DistributedCluster._FixedThreadPool.init(Swift.Int) throws -> DistributedCluster._FixedThreadPool at <compiler-generated>) at _Thread.swift:54:13
    frame #20: 0x00000001002b31c4 WorkingPoolTest`closure #1 in static _Thread.runnerCallback.getter(arg=0x600000235320) at _Thread.swift:137:45
    frame #21: 0x00000001002b3230 WorkingPoolTest`@objc closure #1 in static _Thread.runnerCallback.getter at <compiler-generated>:0
    frame #22: 0x000000010207555c libsystem_pthread.dylib`_pthread_start + 148

akbashev commented 11 months ago

Note that I'm not getting any crashes on Xcode 14 + the latest Swift 5.9 dev snapshot. So it could be a problem on the Swift side that has already been fixed. 🤔

ktoso commented 11 months ago

Thanks, I'll see if I can track that down. We fixed a missing lock around there, so maybe I missed something.

akbashev commented 10 months ago

Just FYI, I checked the latest Xcode 15b7 (15A5229h) and the error is still there.

akbashev commented 8 months ago

Hm, not sure how to test it now, as all of my devices have been updated to macOS 14 🥲 I'll keep it open for a while, and then it can be closed, I think...

yaglo commented 7 months ago

Getting the same error when running from the command line but not when running from Xcode.

yaglo commented 7 months ago

After some digging, I've found that the <Never> generic argument, which is expected to be Codable in _ActorRef<Never> and _ResolveContext<Never>, is what's causing the issue.

If you add extension Never: Codable {} anywhere in DistributedCluster, the problem goes away on macOS 13 (13.6.2 in my case).

My guess is that even though SE-0396 (which conforms Never to Codable) was implemented in Swift 5.9, the runtime on macOS 13 doesn't have that conformance, so it leads to undefined behaviour and the program crashing.
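
One way to probe that guess is to ask the running Swift runtime whether it knows about the conformance. The sketch below is only a hypothetical diagnostic: `conformsToCodable` is a made-up helper, not part of swift-distributed-actors. It assumes that a dynamic cast through `Any.Type` is resolved by a runtime conformance lookup rather than answered by the compiler.

```swift
// Hypothetical diagnostic helper (not part of DistributedCluster).
// Taking the type as `Any.Type` hides its static identity, so the `is`
// checks below are answered by a runtime conformance lookup.
func conformsToCodable(_ type: Any.Type) -> Bool {
    return type is Encodable.Type && type is Decodable.Type
}

// If the guess above is right, the first line should report a different
// result on macOS 13 than on macOS 14; Int is included as a sanity check.
print("Never is Codable at runtime:", conformsToCodable(Never.self))
print("Int is Codable at runtime:", conformsToCodable(Int.self))
```

Comparing the output on a macOS 13 machine and a macOS 14 machine, from both Xcode and the command line, could help confirm whether the missing runtime conformance is really what differs between the crashing and working setups.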