segmentio / analytics-swift

The hassle-free way to add Segment analytics to your Swift app (iOS/tvOS/watchOS/macOS/Linux).
MIT License

Crash in Analytics.configuration.getter #286

Closed haugli closed 7 months ago

haugli commented 10 months ago

Describe the bug
After integrating the Segment SDK in our iOS app, we've started seeing a number of crashes in an internal access of Analytics.configuration. This is currently our app's top crasher, although only a small percentage of users have experienced it.

To Reproduce
We haven't found a way to reproduce this crash; it seems to happen only under specific timing that is hard to replicate.

The crash seems to always occur a few seconds after launching the app, immediately after a successful /projects/{key}/settings fetch.

Platform:

Stack Traces
There seem to be two common forms of the stack trace for this crash. Below is an example of each.

Example 1:

Exception Type: EXC_BREAKPOINT (SIGTRAP)
Crashed Thread: 0

Application Specific Information:
Exception 6, Code 1, Subcode 4355499808

Thread 0 Crashed:
0   MyApp                           0x1039ba720         [inlined] value
1   MyApp                           0x1039ba720         [inlined] Analytics.configuration.getter (Analytics.swift:17)
2   MyApp                           0x1039ba720         IntervalBasedFlushPolicy.configure (IntervalBasedFlushPolicy.swift:42)
3   MyApp                           0x1039c2950         [inlined] thunk for closure
4   MyApp                           0x1039c2950         thunk for closure
5   MyApp                           0x1039c697c         [inlined] thunk for closure
6   MyApp                           0x1039c697c         thunk for closure
7   MyApp                           0x1039c6a24         [inlined] Store.notify<T> (Store.swift:250)
8   MyApp                           0x1039c6a24         Store.notify<T>
9   MyApp                           0x1039c4d90         thunk for closure
10  libdispatch.dylib               0x34d4606a4         _dispatch_call_block_and_release
11  libdispatch.dylib               0x34d4622fc         _dispatch_client_callout
12  libdispatch.dylib               0x34d470994         _dispatch_main_queue_drain
13  libdispatch.dylib               0x34d4705ac         _dispatch_main_queue_callback_4CF
14  CoreFoundation                  0x33d523208         __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__
15  CoreFoundation                  0x33d51ff14         __CFRunLoopRun
16  CoreFoundation                  0x33d51f664         CFRunLoopRunSpecific
17  GraphicsServices                0x3c3ce55e8         GSEventRunModal
18  UIKitCore                       0x341b602b0         -[UIApplication _run]
19  UIKitCore                       0x341b5f8ec         UIApplicationMain
20  SwiftUI                         0x34602e0f8         OUTLINED_FUNCTION_31
21  SwiftUI                         0x34602df3c         OUTLINED_FUNCTION_31
22  SwiftUI                         0x345c9f864         OUTLINED_FUNCTION_26
23  MyApp                           0x200d1f11c         [inlined] AppLauncher.main (MyApp.swift:51)
24  MyApp                           0x200d1f11c         [inlined] AppLauncher.$main (<compiler-generated>:41)
25  MyApp                           0x200d1f11c         main
26  <unknown>                       0x1d826edcc         <redacted>

Example 2:

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: SEGV_NOOP at 0x000047dab5b44360
Crashed Thread: 0

Application Specific Information:
Exception 1, Code 1, Subcode 79004676932448 >
KERN_INVALID_ADDRESS at 0x47dab5b44360.

Thread 0 Crashed:
0   libswiftCore.dylib              0x351b1611c         _swift_release_dealloc
1   libswiftCore.dylib              0x351b0d4e0         [inlined] swift_dynamicCast
2   libswiftCore.dylib              0x351b0d4e0         swift_dynamicCast
3   MyApp                           0x1073808d8         Store.currentState<T> (Store.swift:185)
4   MyApp                           0x10737656c         [inlined] Analytics.configuration.getter (Analytics.swift:17)
5   MyApp                           0x10737656c         IntervalBasedFlushPolicy.configure (IntervalBasedFlushPolicy.swift:42)
6   MyApp                           0x10737e950         [inlined] thunk for closure
7   MyApp                           0x10737e950         thunk for closure
8   MyApp                           0x10738297c         [inlined] thunk for closure
9   MyApp                           0x10738297c         thunk for closure
10  MyApp                           0x107382a24         [inlined] Store.notify<T> (Store.swift:250)
11  MyApp                           0x107382a24         Store.notify<T>
12  MyApp                           0x107380d90         thunk for closure
13  libdispatch.dylib               0x36c7dc45c         _dispatch_call_block_and_release
14  libdispatch.dylib               0x36c7ddf84         _dispatch_client_callout
15  libdispatch.dylib               0x36c7ec7f0         _dispatch_main_queue_drain
16  libdispatch.dylib               0x36c7ec440         _dispatch_main_queue_callback_4CF
17  CoreFoundation                  0x35dc386c4         __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__
18  CoreFoundation                  0x35dc1a028         __CFRunLoopRun
19  CoreFoundation                  0x35dc1eeac         CFRunLoopRunSpecific
20  GraphicsServices                0x3d2089364         GSEventRunModal
21  UIKitCore                       0x3622e9664         -[UIApplication _run]
22  UIKitCore                       0x3622e92c8         UIApplicationMain
23  SwiftUI                         0x3650fe240         OUTLINED_FUNCTION_895
24  SwiftUI                         0x36505f274         block_copy_helper.1
25  SwiftUI                         0x365048628         OUTLINED_FUNCTION_901
26  MyApp                           0x204cff11c         [inlined] AppLauncher.main (MyApp.swift:51)
27  MyApp                           0x204cff11c         [inlined] AppLauncher.$main (<compiler-generated>:41)
28  MyApp                           0x204cff11c         main
29  <unknown>                       0x1f55e0960         <redacted>

bsneed commented 10 months ago

Hi @haugli, thanks for the report! Can you share your analytics setup code with me (sans write key)?

haugli commented 10 months ago

> Hi @haugli, thanks for the report! Can you share your analytics setup code with me (sans write key)?

Thanks for taking a look! Here's our setup code:

func setup() {
    // Build the configuration; the API and CDN hosts are both set to Self.apiHost.
    let configuration = Configuration(writeKey: Self.writeKey)
        .trackApplicationLifecycleEvents(true)
        .apiHost(Self.apiHost)
        .cdnHost(Self.apiHost)
        .errorHandler { error in
            Log.segment.error("Segment encountered an error", error: error)
        }

    // Create the Analytics instance, associate the group, and retain the instance.
    let analytics = Segment.Analytics(configuration: configuration)
    analytics.group(groupId: Self.groupId)
    self.analytics = analytics
}

We've also tried putting a lock around any access to self.analytics in case it's related to concurrency, but the crash still occurs with the lock in place.

bsneed commented 10 months ago

No problem, thanks for sharing! Do you know if you saw this prior to 1.5.x, or is it new? It would be cool to isolate when it started so I can comb through the changelogs, since neither of us has a repro scenario.

haugli commented 10 months ago

We've seen this crash since we first integrated Segment (which was using v1.4.7), but it may also have been there in previous releases.

In some of the stack traces, I see that Store is being accessed from multiple threads at the same time, which could possibly be a clue:

Exception Type: EXC_BREAKPOINT (SIGTRAP)
Crashed Thread: 0

Application Specific Information:
Exception 6, Code 1, Subcode 4408043296

Thread 0 Crashed:
0   MyApp                           0x106bd6720         [inlined] value
1   MyApp                           0x106bd6720         [inlined] Analytics.configuration.getter (Analytics.swift:17)
2   MyApp                           0x106bd6720         IntervalBasedFlushPolicy.configure (IntervalBasedFlushPolicy.swift:42)
3   MyApp                           0x106bde950         [inlined] thunk for closure
4   MyApp                           0x106bde950         thunk for closure
5   MyApp                           0x106be297c         [inlined] thunk for closure
6   MyApp                           0x106be297c         thunk for closure
7   MyApp                           0x106be2a24         [inlined] Store.notify<T> (Store.swift:250)
8   MyApp                           0x106be2a24         Store.notify<T>
9   MyApp                           0x106be0d90         thunk for closure
10  libdispatch.dylib               0x32b65a31c         _dispatch_call_block_and_release
11  libdispatch.dylib               0x32b65bea8         _dispatch_client_callout
12  libdispatch.dylib               0x32b66a6a0         _dispatch_main_queue_drain
13  libdispatch.dylib               0x32b66a2f0         _dispatch_main_queue_callback_4CF
14  CoreFoundation                  0x31cd56c24         __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__
15  CoreFoundation                  0x31cd3855c         __CFRunLoopRun
16  CoreFoundation                  0x31cd3d3e8         CFRunLoopRunSpecific
17  GraphicsServices                0x3937e7358         GSEventRunModal
18  UIKitCore                       0x321138f54         -[UIApplication _run]
19  UIKitCore                       0x321138bb8         UIApplicationMain
20  SwiftUI                         0x323fa2c4c         OUTLINED_FUNCTION_895
21  SwiftUI                         0x323f091e8         block_copy_helper.1
22  SwiftUI                         0x323ef3290         OUTLINED_FUNCTION_901
23  MyApp                           0x2044d711c         [inlined] AppLauncher.main (MyApp.swift:51)
24  MyApp                           0x2044d711c         [inlined] AppLauncher.$main (<compiler-generated>:41)
25  MyApp                           0x2044d711c         main
26  <unknown>                       0x1b5248dec         <redacted>

Thread 8
0   libswiftCore.dylib              0x310ad361c         tryCast
1   libswiftCore.dylib              0x310ad3420         [inlined] swift_dynamicCast
2   libswiftCore.dylib              0x310ad3420         swift_dynamicCast
3   MyApp                           0x106bded84         Store.notify<T> (Store.swift:245)
4   MyApp                           0x106bdfbb0         Store.dispatch<T> (Store.swift:137)
5   MyApp                           0x106ba7264         [inlined] Analytics.updateType (Settings.swift:147)
6   MyApp                           0x106ba7264         Analytics.update (Settings.swift:114)
7   MyApp                           0x106b8f020         [inlined] Timeline.apply (Timeline.swift:94)
8   MyApp                           0x106b8f020         [inlined] Sequence.forEach
9   MyApp                           0x106b8f020         [inlined] Timeline.apply (Timeline.swift:93)
10  MyApp                           0x106b8f020         DestinationPlugin.apply (Plugins.swift:113)
11  MyApp                           0x106b90400         [inlined] Timeline.apply (Timeline.swift:96)
12  MyApp                           0x106b90400         [inlined] Sequence.forEach
13  MyApp                           0x106b90400         [inlined] Timeline.apply (Timeline.swift:93)
14  MyApp                           0x106b90400         Analytics.apply (<compiler-generated>:178)
15  MyApp                           0x106ba78c8         [inlined] Analytics.apply
16  MyApp                           0x106ba78c8         [inlined] Analytics.update (Settings.swift:113)
17  MyApp                           0x106ba78c8         Analytics.checkSettings (Settings.swift:183)
18  MyApp                           0x106bc7eb4         HTTPClient.settingsFor (HTTPClient.swift:122)
19  MyApp                           0x106bc77e4         thunk for closure
20  CFNetwork                       0x31ed02098         CFURLRequestSetMainDocumentURL
21  CFNetwork                       0x31ed11bd4         _CFNetworkErrorCopyLocalizedDescriptionWithHostname
22  libdispatch.dylib               0x32b65a31c         _dispatch_call_block_and_release
23  libdispatch.dylib               0x32b65bea8         _dispatch_client_callout
24  libdispatch.dylib               0x32b663530         _dispatch_lane_serial_drain
25  libdispatch.dylib               0x32b6640d4         _dispatch_lane_invoke
26  libdispatch.dylib               0x32b66ecd8         _dispatch_workloop_worker_thread
27  libsystem_pthread.dylib         0x3dc932dd8         _pthread_wqthread
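
The two threads above suggest a read of the store's state on the main thread overlapping with a write coming from a background queue. Since no repro exists, one way to hunt for this kind of overlap is a small stress harness run under Thread Sanitizer. The following is a minimal sketch under stated assumptions: `LockedBox` and `hammer` are hypothetical names used for illustration and are not part of Segment or Sovran. With the lock in place the harness runs cleanly; removing the lock would be expected to make Thread Sanitizer report the concurrent access.

```swift
import Foundation

// Hypothetical stand-in for shared store state (not Sovran's API).
// The lock serializes every read and write of `value`.
final class LockedBox<Value>: @unchecked Sendable {
    private let lock = NSLock()
    private var value: Value

    init(_ value: Value) { self.value = value }

    func read() -> Value {
        lock.lock()
        defer { lock.unlock() }
        return value
    }

    func write(_ transform: (inout Value) -> Void) {
        lock.lock()
        defer { lock.unlock() }
        transform(&value)
    }
}

// Runs `iterations` bodies in parallel: even indices write, odd indices read.
func hammer(_ box: LockedBox<[String: Int]>, iterations: Int) {
    DispatchQueue.concurrentPerform(iterations: iterations) { index in
        if index % 2 == 0 {
            box.write { $0["count", default: 0] += 1 }
        } else {
            _ = box.read()["count"]
        }
    }
}

let box = LockedBox<[String: Int]>([:])
hammer(box, iterations: 10_000)
print(box.read()["count"] ?? 0) // one increment per even index: 5000
```

The same harness shape can be pointed at any type suspected of unsynchronized access by swapping the read and write closures.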
tristan-warner-smith commented 10 months ago

Could be related to the issue we're seeing.

alanjcharles commented 9 months ago

Hi @haugli, thanks for your patience. We're still looking into this, and I'll let you know as soon as we have more to share. In the meantime, please let me know if you get any additional information or insight on your end. Thanks, talk soon!

alanjcharles commented 9 months ago

@haugli we just released a fix for this in IntervalBasedFlushPolicy that retrieves the configuration from the System state instead of the analytics instance. Let us know if you're still running into issues!

haugli commented 9 months ago

Amazing, thank you @alanjcharles! We'll upgrade to the latest version and let you know if there are any issues.

erichoracek commented 8 months ago

Hi @alanjcharles, we've released an app update that integrates v1.5.2 and this issue unfortunately still appears to be present (albeit happening somewhat less frequently). Here's an example stack trace:

OS Version: iOS 17.2.1 (21C66)
Report Version: 104

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: SEGV_NOOP at 0x0000328f49f1f7a0
Crashed Thread: 0

Application Specific Information:
Exception 1, Code 1, Subcode 55591002306464 >
KERN_INVALID_ADDRESS at 0x328f49f1f7a0.

Thread 0 Crashed:
0   libswiftCore.dylib              0x3264708fc         _swift_release_dealloc
1   libswiftCore.dylib              0x326471fac         [inlined] swift::RefCounts<T>::doDecrementSlow<T>
2   libswiftCore.dylib              0x326471fac         swift::RefCounts<T>::doDecrementSlow<T>
3   libswiftCore.dylib              0x3264654d4         swift_dynamicCast
4   MyApp                           0x103de6e44         Store.currentState<T> (Store.swift:185)
5   MyApp                           0x103ddcdc0         IntervalBasedFlushPolicy.configure (IntervalBasedFlushPolicy.swift:42)
6   MyApp                           0x103de4ebc         [inlined] thunk for closure
7   MyApp                           0x103de4ebc         thunk for closure
8   MyApp                           0x103de8ee8         [inlined] thunk for closure
9   MyApp                           0x103de8ee8         thunk for closure
10  MyApp                           0x103de8f90         [inlined] Store.notify<T> (Store.swift:250)
11  MyApp                           0x103de8f90         Store.notify<T>
12  MyApp                           0x103de72fc         thunk for closure
13  libdispatch.dylib               0x3439246a4         _dispatch_call_block_and_release
14  libdispatch.dylib               0x3439262fc         _dispatch_client_callout
15  libdispatch.dylib               0x343934994         _dispatch_main_queue_drain
16  libdispatch.dylib               0x3439345ac         _dispatch_main_queue_callback_4CF
17  CoreFoundation                  0x333973018         __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__
18  CoreFoundation                  0x33396fd24         __CFRunLoopRun
19  CoreFoundation                  0x33396f474         CFRunLoopRunSpecific
20  GraphicsServices                0x3ba4a14f4         GSEventRunModal
21  UIKitCore                       0x337fbe628         -[UIApplication _run]
22  UIKitCore                       0x337fbdc64         UIApplicationMain
23  SwiftUI                         0x33c4a44b4         OUTLINED_FUNCTION_31
24  SwiftUI                         0x33c4a42f8         OUTLINED_FUNCTION_31
25  SwiftUI                         0x33c114e8c         OUTLINED_FUNCTION_26
26  MyApp                           0x20273b188         [inlined] AppLauncher.main (<redacted>)
27  MyApp                           0x20273b188         [inlined] AppLauncher.$main (<compiler-generated>:42)
28  MyApp                           0x20273b188         main
29  <unknown>                       0x1ce90edcc         <redacted>

alanjcharles commented 8 months ago

Hi @erichoracek, thanks for letting us know. I'll see what I can find. Talk soon!

alanjcharles commented 8 months ago

Hi @erichoracek - I hope you're well. I have a few followup questions:

1. Are you certain the crash reports you're seeing are from users who have updated to the latest version of your app?

It can take quite a while for your user base to fully update, and since you are seeing a slight decrease since implementing the fix, I just want to make sure we're accounting for users who might still be on an older version. On that note, are you still seeing a decrease or has it leveled off?

2. Is there anything particularly unique or bespoke about your Analytics implementation that might give us a clue?

We haven't seen this reported by anyone else and are having just as hard of a time as you had trying to replicate it at all, let alone consistently. Any additional details about your implementation you can provide might also be helpful.

We will likely have more questions based on your responses to these, but I think this is a good place to start to keep things organized. Please just let me know and we can go from there, thanks!

erichoracek commented 8 months ago

> Hi @erichoracek - I hope you're well. I have a few followup questions:

Thanks for following up @alanjcharles!

> 1. Are you certain the crash reports you're seeing are from users who have updated to the latest version of your app?

> It can take quite a while for your user base to fully update, and since you are seeing a slight decrease since implementing the fix, I just want to make sure we're accounting for users who might still be on an older version.

Yes, we are only looking at crashes that occur in the most recent version of our app. Because of the changes made in v1.5.2 of the Segment SDK, our crash reporting framework (Sentry) has categorized this as a new crash; if it were the same crash as before, it would be grouped with the previous crashes. When we look at which versions this new crash appears in, it is only present in the most recently released version of our app, which I've double-checked contains the commit that updates Segment to v1.5.2.

> On that note, are you still seeing a decrease or has it leveled off?

It has not continued to decrease; it is holding steady in the new app version.

> 2. Is there anything particularly unique or bespoke about your Analytics implementation that might give us a clue?

> We haven't seen this reported by anyone else and are having just as hard of a time as you had trying to replicate it at all, let alone consistently. Any additional details about your implementation you can provide might also be helpful.

> We will likely have more questions based on your responses to these, but I think this is a good place to start to keep things organized. Please just let me know and we can go from there, thanks!

There is nothing too special. At app launch we create a configuration with a custom API/CDN host, lifecycle event tracking, and a custom user agent, and then instantiate a Segment.Analytics instance using that configuration. Whenever an analytics event occurs, we send it via the track(…) method; whenever the user info changes, we identify the user using the identify(…) method; and if the user info is cleared out, we call the reset() method.

erichoracek commented 8 months ago

@alanjcharles I took a quick look at the implementation and I see a few areas for investigation from these lines, which are the source of the crash:

guard let a = self.analytics else { return }
guard let system: System = a.store.currentState() else { return }

In case it helps: enabling strict concurrency checking, adding Sendable conformances, and investigating the resulting warnings has been very useful for us when diagnosing these types of issues in the past. If you add a Sendable conformance to System, Analytics, and Store, the compiler will help enforce that they are thread-safe.
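
To make the Sendable suggestion concrete, here is a minimal sketch of what the compiler checks. The types `SettingsSnapshot`, `UnsafeHolder`, and `GuardedHolder` are hypothetical and only stand in for the SDK's real types. Building with strict concurrency checking enabled (for example the `-strict-concurrency=complete` compiler flag) surfaces the diagnostic described in the comment.

```swift
import Foundation

// Hypothetical value type standing in for a piece of shared state.
// Every stored property is an immutable Sendable value, so the compiler
// can verify this conformance by itself.
struct SettingsSnapshot: Sendable {
    let flushInterval: TimeInterval
    let enabled: Bool
}

// A class with unsynchronized mutable state is diagnosed as soon as it
// claims Sendable, which is how the compiler points at risky types:
//
// final class UnsafeHolder: Sendable {
//     var snapshot: SettingsSnapshot   // flagged: mutable stored property
// }

// With a lock around the mutable state, the author vouches for thread
// safety explicitly via `@unchecked Sendable`.
final class GuardedHolder: @unchecked Sendable {
    private let lock = NSLock()
    private var snapshot: SettingsSnapshot

    init(_ snapshot: SettingsSnapshot) { self.snapshot = snapshot }

    func current() -> SettingsSnapshot {
        lock.lock()
        defer { lock.unlock() }
        return snapshot
    }

    func update(_ newValue: SettingsSnapshot) {
        lock.lock()
        defer { lock.unlock() }
        snapshot = newValue
    }
}

// Usage: write from a background queue, then read from the calling thread.
let holder = GuardedHolder(SettingsSnapshot(flushInterval: 30, enabled: true))
let group = DispatchGroup()
DispatchQueue.global().async(group: group) {
    holder.update(SettingsSnapshot(flushInterval: 10, enabled: false))
}
group.wait()
print(holder.current().flushInterval) // 10.0
```

Note that `@unchecked Sendable` turns the compiler check off for that type, so it is only as safe as the locking discipline behind it.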

erichoracek commented 8 months ago

Here is another crash we've seen a number of times that appears to share the same root cause (it occurs while accessing System from the Sovran Store):

Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: SEGV_NOOP at 0x0000000000000009
Crashed Thread: 9

Application Specific Information:
Exception 1, Code 1, Subcode 9 >
KERN_INVALID_ADDRESS at 0x9.

Thread 9 Crashed:
0   libswiftCore.dylib              0x32c301d50         swift_retain
1   Redacted                        0x1033e40e4         System
2   libswiftCore.dylib              0x32c2f97c8         tryCast
3   libswiftCore.dylib              0x32c2f9d74         tryCast
4   libswiftCore.dylib              0x32c2f947c         swift_dynamicCast
5   Redacted                        0x10341850c         Store.currentState<T> (Store.swift:185)
6   Redacted                        0x1033d7e64         [inlined] Analytics.enabled.getter (Analytics.swift:122)
7   Redacted                        0x1033d7e64         SegmentDestination.flush (SegmentDestination.swift:125)
8   Redacted                        0x1033d8f28         SegmentDestination
9   Redacted                        0x1033a997c         Analytics.flush (Analytics.swift:220)
10  Redacted                        0x1033ab5bc         OperatingMode.run (Analytics.swift:446)
11  Redacted                        0x1033a9e6c         thunk for closure
12  libdispatch.dylib               0x3496b86a4         _dispatch_call_block_and_release
13  libdispatch.dylib               0x3496ba2fc         _dispatch_client_callout
14  libdispatch.dylib               0x3496c1890         _dispatch_lane_serial_drain
15  libdispatch.dylib               0x3496c23c0         _dispatch_lane_invoke
16  libdispatch.dylib               0x3496cd000         _dispatch_root_queue_drain_deferred_wlh
17  libdispatch.dylib               0x3496cc874         _dispatch_workloop_worker_thread
18  libsystem_pthread.dylib         0x40d4c9960         _pthread_wqthread

erichoracek commented 7 months ago

Hi @alanjcharles @bsneed, I spent some time looking into this and it appears to be a thread safety issue with the Sovran-Swift library. I've put together a fix for it here: https://github.com/segmentio/Sovran-Swift/pull/10
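
For readers unfamiliar with this class of bug, the general technique for making a store thread-safe can be sketched as follows. This is an illustration only and not the code from the pull request: `TinyStore` and `demo` are hypothetical names, and the sketch assumes a serial queue guards both the state and the subscriber list, with subscribers receiving a snapshot rather than reading shared state back.

```swift
import Foundation

// Hypothetical miniature store; a sketch of the general technique only.
final class TinyStore<State>: @unchecked Sendable {
    private let syncQueue = DispatchQueue(label: "tinystore.sync")
    private var state: State
    private var subscribers: [(State) -> Void] = []

    init(initial: State) { self.state = initial }

    // Reads go through the same serial queue as writes.
    func currentState() -> State {
        syncQueue.sync { state }
    }

    func subscribe(_ handler: @escaping (State) -> Void) {
        syncQueue.sync { subscribers.append(handler) }
    }

    // Applies the reducer inside the critical section, then notifies
    // subscribers outside it with the snapshot captured at write time.
    func dispatch(_ reducer: (State) -> State) {
        let (snapshot, handlers) = syncQueue.sync { () -> (State, [(State) -> Void]) in
            state = reducer(state)
            return (state, subscribers)
        }
        handlers.forEach { $0(snapshot) }
    }
}

// Usage: many concurrent writes, then one observed write on the calling thread.
func demo() -> (total: Int, lastSeen: Int?) {
    let store = TinyStore(initial: 0)
    DispatchQueue.concurrentPerform(iterations: 1_000) { _ in
        store.dispatch { $0 + 1 }
    }
    let total = store.currentState()

    var lastSeen: Int?
    store.subscribe { lastSeen = $0 }
    store.dispatch { $0 + 1 }
    return (total, lastSeen)
}

let result = demo()
print(result.total, result.lastSeen ?? -1) // 1000 1001
```

Calling handlers outside the critical section avoids a deadlock if a handler reads `currentState()`, at the cost that handlers themselves must tolerate being called from any thread.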

alanjcharles commented 7 months ago

Hey, thanks @erichoracek! I'll take a look today.

erichoracek commented 7 months ago

Thanks @alanjcharles, we've since deployed that fix and have confirmed that it resolves the crash.

alanjcharles commented 7 months ago

Amazing. @haugli, @erichoracek's fix is now available in Analytics-Swift 1.5.4 and Sovran-Swift 1.1.1. Please feel free to follow up if you have any other issues or the crash persists. Thanks!