urbanairship / ios-library

Urban Airship iOS SDK
http://urbanairship.com
Apache License 2.0
478 stars 265 forks source link

💥 Crash in AnyPublisher.receive<A> #418

Open vgritsenko opened 2 days ago

vgritsenko commented 2 days ago

Preliminary Info

What Airship dependencies are you using?

Airship version 18.7.2 with Core and Automation.

pod 'Airship', '18.7.2', :subspecs => ['Core', 'Automation'], :linkage => :static

What are the versions of any relevant development tools you are using?

Xcode 15, CocaPods 1.15.2 with two plugins.

plugin 'cocoapods-pod-linkage'
plugin 'cocoapods-wholemodule'

Report

What unexpected behavior are you seeing?

Seeing crashes originating from airshipWatchActivities(activityBlock:) function of LiveActivity.swift file, on iPhone and iPad devices, running various iOS 17 and iOS 18 OS versions.

What is the expected behavior?

There should be no crashes.

What are the steps to reproduce the unexpected behavior?

Unknown.

Do you have logging for the issue?

Stack trace of the crashed thread:

Thread 3 Crashed:
0   Combine                         0x00000001a7c3568c AnyPublisher.receive<A>(subscriber:) + 40 (AnyPublisher.swift:80)
1   Combine                         0x00000001a7c834b8 Publisher.sink(receiveCompletion:receiveValue:) + 320 (Sink.swift:39)
2   SessionFoundation               0x00000001c2926828 BufferedAsyncSequence.Iterator.next() + 348 (BufferedAsyncSequence.swift:123)
3   SessionFoundation               0x00000001c2928f99 dispatch thunk of BufferedAsyncSequence.Iterator.next() + 1
4   ActivityKit                     0x000000025a3aa551 Activity.PushTokenUpdates.Iterator.next() + 1 (Activity.swift:1093)
5   [Airship-Linked-Lib]            0x000000010841ccc5 closure #2 in static Activity.airshipWatchActivities(activityBlock:) + 1 (LiveActivity.swift:221)
6   [Airship-Linked-Lib]            0x000000010841e7d1 partial apply for closure #2 in static Activity.airshipWatchActivities(activityBlock:) + 1
7   [Airship-Linked-Lib]            0x000000010809f78d specialized thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) + 1 (<compiler-generated>:0)
8   [Airship-Linked-Lib]            0x000000010809fd8d partial apply for specialized thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) + 1
9   libswift_Concurrency.dylib      0x00000001aa005e19 completeTaskWithClosure(swift::AsyncContext*, swift::SwiftError*) + 1 (Task.cpp:497)
rlepinski commented 2 days ago

Are you using airshipWatchActivities block? Do you have the full crash report including the reason/error code? Any metrics on how often you are seeing this crash? (percent, count, total audience)

vgritsenko commented 1 day ago

Hi @rlepinski, thanks for taking a look. Yes, we are using that block, from a function within the app:

    Activity<T>.airshipWatchActivities { activity in
        let name = activity.attributes.name
        Airship.channel.trackLiveActivity(activity, name: name)
        updateActivities(named: name)
    }

updateActivities(named:) function there starts a Task that enumerates activities and calls end(nil, dismissalPolicy: .immediate) on duplicate activities, if any. We do not have any live activities running now, though, which means that this crash happens even if there aren't actually any activities in the app.

I have multiple reports, and here is the top portion snippet out of one of the reports:

Incident Identifier: E01D04B3-F53C-4522-98D6-620E83DBFB77
Distributor ID:      com.apple.AppStore
Hardware Model:      iPhone17,3
Process:             [...process...] [738]
Path:                /private/var/containers/Bundle/Application/[...path...]
Identifier:          [...bundle id...]
Version:             [... app version...]
AppStoreTools:       16A242d
AppVariant:          1:iPhone17,3:18
Code Type:           ARM-64 (Native)
Role:                unknown
Parent Process:      launchd [1]
Coalition:           [...bundle id...] [967]

Date/Time:           2024-10-06 12:36:51.7921 +0100
Launch Time:         2024-10-05 23:35:55.2449 +0100
OS Version:          iPhone OS 18.0.1 (22A3370)
Release Type:        User
Baseband Version:    1.00.00
Report Version:      104

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000812eb4a91050 -> 0x0000012eb4a91050 (possible pointer authentication failure)
Exception Codes: 0x0000000000000001, 0x0000812eb4a91050
VM Region Info: 0x12eb4a91050 is not in any region.  Bytes after previous region: 819074764881  
      REGION TYPE                 START - END      [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      commpage (reserved)     1000000000-7000000000 [384.0G] ---/--- SM=NUL  reserved VM address space (unallocated)
--->  
      UNUSED SPACE AT END
Termination Reason: SIGNAL 11 Segmentation fault: 11
Terminating Process: exc handler [738]

Triggered by Thread:  3

As for the metrics, Firebase reports that during last 30 days in this issue is in the list of top 5 issues, which is high enough on the list to warrant attention. I see that this crash has happened about 100 times on the audience of about 2m users and about 50m sessions.

rlepinski commented 1 day ago

As for the metrics, Firebase reports that during last 30 days in this issue is in the list of top 5 issues, which is high enough on the list to warrant attention. I see that this crash has happened about 100 times on the audience of about 2m users and about 50m sessions.

Absolutely, it just helps us determine if we have any hope of reproducing or the type of issue. That is pretty low crash rate so probably don't have any chance or reproducing.

Usually with these type of issues with low crash rates its some sort of memory retention issue when the app is tearing down. Here is the extension, its not retaining anything but the block:

@available(iOS 16.1, *)
extension Activity where Attributes : ActivityAttributes {

    /// Calls `checkActivity` on every active activity on the first call and on each `pushToStartTokenUpdates` update.
    /// - Parameters:
    ///     - activityBlock: Block that is called with the activity
    public class func airshipWatchActivities(activityBlock: @escaping @Sendable (Activity<Attributes>) -> Void) {
        let checkHelper: @Sendable () -> Void = {
            self.activities.filter { activity in
                if #available(iOS 16.2, *) {
                    return activity.activityState == .active || activity.activityState == .stale
                } else {
                    return activity.activityState == .active
                }
            }.forEach { activity in
                activityBlock(activity)
            }
        }

        Task {
            checkHelper()
            if #available(iOS 17.2, *) {
                for await _ in Activity<Attributes>.pushToStartTokenUpdates {
                    checkHelper()
                }
            }
        }
    }
}

self here is the class, so probably not the issue. I can rewrite it to not use the internal block, it would make the stack traces easier to read:

@available(iOS 16.1, *)
extension Activity where Attributes : ActivityAttributes {

    fileprivate class func _airshipCheckActivities(activityBlock: @escaping @Sendable (Activity<Attributes>) -> Void) {
        self.activities.filter { activity in
            if #available(iOS 16.2, *) {
                return activity.activityState == .active || activity.activityState == .stale
            } else {
                return activity.activityState == .active
            }
        }.forEach { activity in
            activityBlock(activity)
        }
    }

    /// Calls `checkActivity` on every active activity on the first call and on each `pushToStartTokenUpdates` update.
    /// - Parameters:
    ///     - activityBlock: Block that is called with the activity
    public class func airshipWatchActivities(activityBlock: @escaping @Sendable (Activity<Attributes>) -> Void) {
        Task {
            _airshipCheckActivities(activityBlock: activityBlock)
            if #available(iOS 17.2, *) {
                for await _ in Activity<Attributes>.pushToStartTokenUpdates {
                    _airshipCheckActivities(activityBlock: activityBlock)
                }
            }
        }
    }
}

I wonder if its a retention to what I assume might be your app delegate in your block:

   Activity<T>.airshipWatchActivities { activity in
        let name = activity.attributes.name
        Airship.channel.trackLiveActivity(activity, name: name)
        updateActivities(named: name)
    }

Could you try rewriting that to:

   Activity<T>.airshipWatchActivities { [weak self] activity in
        let name = activity.attributes.name
        Airship.channel.trackLiveActivity(activity, name: name)
        self?.updateActivities(named: name)
    }

How about we try both and see if the crash goes away?

vgritsenko commented 1 day ago

Could you try rewriting that to [weak self]

Sorry... I snipped too much, I guess, but this is within a context of static function declared on a struct:

public struct AnyActivityType<T: [MyActivityAttributes]>: AnyActivity {
    public static func track() {
        Activity<T>.airshipWatchActivities { activity in
            let name = activity.attributes.name
            Airship.channel.trackLiveActivity(activity, name: name)
            updateActivities(named: name)
        }
    }

    private static func updateActivities(named name: String) {
        Task { @MainActor in
            ... logic ...
        }
    }
}

So it was a good call, but [weak self] wouldn't apply for this context.

How about we try both and see if the crash goes away?

Sure, let's try that. Thank you.

rlepinski commented 1 day ago

If everything is static/class then I don't see how we have a memory retention issue. Ill do the block rewrite in the next patch and we can see if that changes anything