Closed haugli closed 7 months ago
Hi @haugli, thanks for the report! Can you share your analytics setup code with me (sans write key)?
Thanks for taking a look! Here's our setup code:
func setup() {
    let configuration = Configuration(writeKey: Self.writeKey)
        .trackApplicationLifecycleEvents(true)
        .apiHost(Self.apiHost)
        .cdnHost(Self.apiHost)
        .errorHandler { error in
            Log.segment.error("Segment encountered an error", error: error)
        }
    let analytics = Segment.Analytics(configuration: configuration)
    analytics.group(groupId: Self.groupId)
    self.analytics = analytics
}
We've also tried putting a lock around any access to self.analytics in case it's related to concurrency, but the crash still occurs with the lock in place.
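For readers following along, here is a minimal sketch of the kind of lock described above, and of why it may not be enough. FakeClient and Holder are hypothetical names invented for this illustration and are not part of the Segment SDK. The lock serializes reads and writes of the stored reference only; any work the object schedules on its own queues runs outside that lock.

```swift
import Foundation

// Hypothetical stand-in for an SDK object that does work on its own queue.
final class FakeClient {
    private let internalQueue = DispatchQueue(label: "fake.client.internal")

    // Work submitted here runs later on internalQueue, not on the caller's thread.
    func doWork(_ body: @escaping () -> Void) {
        internalQueue.async(execute: body)
    }
}

// Guards the reference to the client with a lock.
final class Holder {
    private let lock = NSLock()
    private var client: FakeClient?

    func set(_ newClient: FakeClient) {
        lock.lock(); defer { lock.unlock() }
        client = newClient
    }

    func get() -> FakeClient? {
        lock.lock(); defer { lock.unlock() }
        return client
    }
}
```

In this sketch, the lock is already released by the time the closure passed to doWork runs, so it cannot protect state that the client touches internally from several threads.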
No problem, thanks for sharing! Did you see this prior to 1.5.x or is it new, do you know? Would be cool to try to isolate when it started so I can comb through the changelogs since neither of us have a repro scenario.
We've seen this crash since we first integrated Segment (which was using v1.4.7), but it may also have been there in previous releases.
In some of the stack traces, I see that Store is being accessed from multiple threads at the same time, which could possibly be a clue:
Exception Type: EXC_BREAKPOINT (SIGTRAP)
Crashed Thread: 0
Application Specific Information:
Exception 6, Code 1, Subcode 4408043296
Thread 0 Crashed:
0 MyApp 0x106bd6720 [inlined] value
1 MyApp 0x106bd6720 [inlined] Analytics.configuration.getter (Analytics.swift:17)
2 MyApp 0x106bd6720 IntervalBasedFlushPolicy.configure (IntervalBasedFlushPolicy.swift:42)
3 MyApp 0x106bde950 [inlined] thunk for closure
4 MyApp 0x106bde950 thunk for closure
5 MyApp 0x106be297c [inlined] thunk for closure
6 MyApp 0x106be297c thunk for closure
7 MyApp 0x106be2a24 [inlined] Store.notify<T> (Store.swift:250)
8 MyApp 0x106be2a24 Store.notify<T>
9 MyApp 0x106be0d90 thunk for closure
10 libdispatch.dylib 0x32b65a31c _dispatch_call_block_and_release
11 libdispatch.dylib 0x32b65bea8 _dispatch_client_callout
12 libdispatch.dylib 0x32b66a6a0 _dispatch_main_queue_drain
13 libdispatch.dylib 0x32b66a2f0 _dispatch_main_queue_callback_4CF
14 CoreFoundation 0x31cd56c24 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__
15 CoreFoundation 0x31cd3855c __CFRunLoopRun
16 CoreFoundation 0x31cd3d3e8 CFRunLoopRunSpecific
17 GraphicsServices 0x3937e7358 GSEventRunModal
18 UIKitCore 0x321138f54 -[UIApplication _run]
19 UIKitCore 0x321138bb8 UIApplicationMain
20 SwiftUI 0x323fa2c4c OUTLINED_FUNCTION_895
21 SwiftUI 0x323f091e8 block_copy_helper.1
22 SwiftUI 0x323ef3290 OUTLINED_FUNCTION_901
23 MyApp 0x2044d711c [inlined] AppLauncher.main (MyApp.swift:51)
24 MyApp 0x2044d711c [inlined] AppLauncher.$main (<compiler-generated>:41)
25 MyApp 0x2044d711c main
26 <unknown> 0x1b5248dec <redacted>
Thread 8
0 libswiftCore.dylib 0x310ad361c tryCast
1 libswiftCore.dylib 0x310ad3420 [inlined] swift_dynamicCast
2 libswiftCore.dylib 0x310ad3420 swift_dynamicCast
3 MyApp 0x106bded84 Store.notify<T> (Store.swift:245)
4 MyApp 0x106bdfbb0 Store.dispatch<T> (Store.swift:137)
5 MyApp 0x106ba7264 [inlined] Analytics.updateType (Settings.swift:147)
6 MyApp 0x106ba7264 Analytics.update (Settings.swift:114)
7 MyApp 0x106b8f020 [inlined] Timeline.apply (Timeline.swift:94)
8 MyApp 0x106b8f020 [inlined] Sequence.forEach
9 MyApp 0x106b8f020 [inlined] Timeline.apply (Timeline.swift:93)
10 MyApp 0x106b8f020 DestinationPlugin.apply (Plugins.swift:113)
11 MyApp 0x106b90400 [inlined] Timeline.apply (Timeline.swift:96)
12 MyApp 0x106b90400 [inlined] Sequence.forEach
13 MyApp 0x106b90400 [inlined] Timeline.apply (Timeline.swift:93)
14 MyApp 0x106b90400 Analytics.apply (<compiler-generated>:178)
15 MyApp 0x106ba78c8 [inlined] Analytics.apply
16 MyApp 0x106ba78c8 [inlined] Analytics.update (Settings.swift:113)
17 MyApp 0x106ba78c8 Analytics.checkSettings (Settings.swift:183)
18 MyApp 0x106bc7eb4 HTTPClient.settingsFor (HTTPClient.swift:122)
19 MyApp 0x106bc77e4 thunk for closure
20 CFNetwork 0x31ed02098 CFURLRequestSetMainDocumentURL
21 CFNetwork 0x31ed11bd4 _CFNetworkErrorCopyLocalizedDescriptionWithHostname
22 libdispatch.dylib 0x32b65a31c _dispatch_call_block_and_release
23 libdispatch.dylib 0x32b65bea8 _dispatch_client_callout
24 libdispatch.dylib 0x32b663530 _dispatch_lane_serial_drain
25 libdispatch.dylib 0x32b6640d4 _dispatch_lane_invoke
26 libdispatch.dylib 0x32b66ecd8 _dispatch_workloop_worker_thread
27 libsystem_pthread.dylib 0x3dc932dd8 _pthread_wqthread
Could be related to the issue we're seeing.
Hi @haugli- thanks for your patience, we're still looking into this. I'll let you know as soon as we have more to share. In the meantime, please let me know if you get any additional information/insight on your end. Thanks, talk soon!
@haugli we just released a fix for this in the IntervalPolicy that retrieves the configuration from the systemState instead of the analytics instance. Let us know if you're still running into issues!
Amazing, thank you @alanjcharles! We'll upgrade to the latest version and let you know if there are any issues.
Hi @alanjcharles, we've released an app update that integrates v1.5.2 and this issue unfortunately still appears to be present (albeit happening somewhat less frequently). Here's an example stack trace:
OS Version: iOS 17.2.1 (21C66)
Report Version: 104
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: SEGV_NOOP at 0x0000328f49f1f7a0
Crashed Thread: 0
Application Specific Information:
Exception 1, Code 1, Subcode 55591002306464 >
KERN_INVALID_ADDRESS at 0x328f49f1f7a0.
Thread 0 Crashed:
0 libswiftCore.dylib 0x3264708fc _swift_release_dealloc
1 libswiftCore.dylib 0x326471fac [inlined] swift::RefCounts<T>::doDecrementSlow<T>
2 libswiftCore.dylib 0x326471fac swift::RefCounts<T>::doDecrementSlow<T>
3 libswiftCore.dylib 0x3264654d4 swift_dynamicCast
4 MyApp 0x103de6e44 Store.currentState<T> (Store.swift:185)
5 MyApp 0x103ddcdc0 IntervalBasedFlushPolicy.configure (IntervalBasedFlushPolicy.swift:42)
6 MyApp 0x103de4ebc [inlined] thunk for closure
7 MyApp 0x103de4ebc thunk for closure
8 MyApp 0x103de8ee8 [inlined] thunk for closure
9 MyApp 0x103de8ee8 thunk for closure
10 MyApp 0x103de8f90 [inlined] Store.notify<T> (Store.swift:250)
11 MyApp 0x103de8f90 Store.notify<T>
12 MyApp 0x103de72fc thunk for closure
13 libdispatch.dylib 0x3439246a4 _dispatch_call_block_and_release
14 libdispatch.dylib 0x3439262fc _dispatch_client_callout
15 libdispatch.dylib 0x343934994 _dispatch_main_queue_drain
16 libdispatch.dylib 0x3439345ac _dispatch_main_queue_callback_4CF
17 CoreFoundation 0x333973018 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__
18 CoreFoundation 0x33396fd24 __CFRunLoopRun
19 CoreFoundation 0x33396f474 CFRunLoopRunSpecific
20 GraphicsServices 0x3ba4a14f4 GSEventRunModal
21 UIKitCore 0x337fbe628 -[UIApplication _run]
22 UIKitCore 0x337fbdc64 UIApplicationMain
23 SwiftUI 0x33c4a44b4 OUTLINED_FUNCTION_31
24 SwiftUI 0x33c4a42f8 OUTLINED_FUNCTION_31
25 SwiftUI 0x33c114e8c OUTLINED_FUNCTION_26
26 MyApp 0x20273b188 [inlined] AppLauncher.main (<redacted>)
27 MyApp 0x20273b188 [inlined] AppLauncher.$main (<compiler-generated>:42)
28 MyApp 0x20273b188 main
29 <unknown> 0x1ce90edcc <redacted>
Hi @erichoracek thanks for letting us know- I'll see what I can find. Talk soon!
Hi @erichoracek - I hope you're well. I have a few followup questions:
1. Are you certain the crash reports you're seeing are from users who have updated to the latest version of your app?
It can take quite awhile for your user base to fully update and since you are seeing a slight decrease since implementing the fix I just want to make sure we're accounting for users who might still be on an older version. On that note- are you still seeing a decrease or has it leveled off?
2. Is there anything particularly unique or bespoke about your Analytics implementation that might give us a clue?
We haven't seen this reported by anyone else and are having just as hard of a time as you had trying to replicate it at all, let alone consistently. Any additional details about your implementation you can provide might also be helpful.
We will likely have more questions based on your responses to these, but I think this is a good place to start to keep things organized. Please just let me know and we can go from there, thanks!
> Hi @erichoracek - I hope you're well. I have a few followup questions:

Thanks for following up @alanjcharles!

> 1. Are you certain the crash reports you're seeing are from users who have updated to the latest version of your app?
> It can take quite awhile for your user base to fully update and since you are seeing a slight decrease since implementing the fix I just want to make sure we're accounting for users who might still be on an older version.

Yes, we are only looking at crashes that occur in the most recent version of our app. Due to the changes made in v1.5.2 of the Segment SDK, our crash reporting framework (Sentry) has categorized this as a new crash; if it were the same crash as before, it would be grouped in with the previous crashes. When we look at which versions this new crash appears in, it is only present in the most recently released version of our app, which I've double-checked contains the commit that updates Segment to v1.5.2.

> On that note- are you still seeing a decrease or has it leveled off?

It has not continued to decrease; it is steady in the new app version.

> 2. Is there anything particularly unique or bespoke about your Analytics implementation that might give us a clue?
> We haven't seen this reported by anyone else and are having just as hard of a time as you had trying to replicate it at all, let alone consistently. Any additional details about your implementation you can provide might also be helpful.
> We will likely have more questions based on your responses to these, but I think this is a good place to start to keep things organized. Please just let me know and we can go from there, thanks!
There is nothing too special: at app launch we create a configuration with a custom API/CDN host, lifecycle event tracking, and a custom user agent, and then instantiate a Segment.Analytics instance using that configuration. Whenever an analytics event occurs, we send it via the track(…) method; whenever the user info changes, we identify the user using the identify(…) method; and if the user info is cleared out, we call the reset() method.
@alanjcharles I took a quick look at the implementation and I see a few areas for investigation from these lines, which are the source of the crash:
guard let a = self.analytics else { return }
guard let system: System = a.store.currentState() else { return }
- Analytics.store is a mutable property. I'm not sure if it is ever mutated, but that has the potential for concurrent mutation during access (which could cause this crash), and mutable class properties are not thread safe.
- System is not a full value type; it has some mutable classes contained within it. Notably, Configuration is a class with a mutable values property. I noticed that Sovran.Store mentions "Behavior when applied to classes is currently undefined and will likely result in errors." This crash could be caused by that as well.

If it's helpful, enabling strict concurrency checking, adding Sendable conformances, and investigating the resulting warnings has been very helpful for us when diagnosing these types of issues in the past. If you add a Sendable conformance to System, Analytics, and Store, the compiler will help enforce that they are thread-safe.
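As a rough illustration of the Sendable suggestion above, the sketch below uses hypothetical stand-in types (SystemSnapshot, MutableConfig, LockedBox); none of them are the SDK's real System, Configuration, or Store. It shows the general shape: an immutable value type can conform to Sendable directly, a class with mutable stored properties is diagnosed under strict concurrency checking, and a lock-guarded wrapper is one possible way to share mutable state.

```swift
import Foundation

// Hypothetical immutable snapshot; a struct whose fields are all Sendable
// can conform to Sendable without any extra work.
struct SystemSnapshot: Sendable {
    let writeKey: String
    let enabled: Bool
}

// A final class with a mutable stored property cannot conform to plain
// Sendable; with strict concurrency checking the compiler flags it:
//
// final class MutableConfig: Sendable {
//     var writeKey = ""   // diagnosed: mutable stored property
// }

// One possible fix (an assumption, not the SDK's approach): guard the
// mutable state with a lock and mark the class @unchecked Sendable,
// taking responsibility for thread safety manually.
final class LockedBox<Value: Sendable>: @unchecked Sendable {
    private let lock = NSLock()
    private var value: Value

    init(_ value: Value) {
        self.value = value
    }

    func get() -> Value {
        lock.lock(); defer { lock.unlock() }
        return value
    }

    func set(_ newValue: Value) {
        lock.lock(); defer { lock.unlock() }
        value = newValue
    }
}
```

Note that @unchecked Sendable silences the compiler rather than proving safety, so it only makes sense once every access to the mutable state really does go through the lock.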
Here is another crash we've received a number of occurrences of that appears to be the same root cause (while accessing the System from the Sovran.store):
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: SEGV_NOOP at 0x0000000000000009
Crashed Thread: 9
Application Specific Information:
Exception 1, Code 1, Subcode 9 >
KERN_INVALID_ADDRESS at 0x9.
Thread 9 Crashed:
0 libswiftCore.dylib 0x32c301d50 swift_retain
1 Redacted 0x1033e40e4 System
2 libswiftCore.dylib 0x32c2f97c8 tryCast
3 libswiftCore.dylib 0x32c2f9d74 tryCast
4 libswiftCore.dylib 0x32c2f947c swift_dynamicCast
5 Redacted 0x10341850c Store.currentState<T> (Store.swift:185)
6 Redacted 0x1033d7e64 [inlined] Analytics.enabled.getter (Analytics.swift:122)
7 Redacted 0x1033d7e64 SegmentDestination.flush (SegmentDestination.swift:125)
8 Redacted 0x1033d8f28 SegmentDestination
9 Redacted 0x1033a997c Analytics.flush (Analytics.swift:220)
10 Redacted 0x1033ab5bc OperatingMode.run (Analytics.swift:446)
11 Redacted 0x1033a9e6c thunk for closure
12 libdispatch.dylib 0x3496b86a4 _dispatch_call_block_and_release
13 libdispatch.dylib 0x3496ba2fc _dispatch_client_callout
14 libdispatch.dylib 0x3496c1890 _dispatch_lane_serial_drain
15 libdispatch.dylib 0x3496c23c0 _dispatch_lane_invoke
16 libdispatch.dylib 0x3496cd000 _dispatch_root_queue_drain_deferred_wlh
17 libdispatch.dylib 0x3496cc874 _dispatch_workloop_worker_thread
18 libsystem_pthread.dylib 0x40d4c9960 _pthread_wqthread
Hi @alanjcharles @bsneed , I spent some time looking into this and it appears to be a thread safety issue with the Sovran-Swift library. I've put together a fix for it here https://github.com/segmentio/Sovran-Swift/pull/10
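For context, a common way to address this class of thread-safety problem is to funnel every read and write of the shared state through one serial queue. The sketch below is a simplified, hypothetical SketchStore written for illustration; it is not the contents of the linked pull request, and the real Sovran.Store API differs.

```swift
import Foundation

// Hypothetical, simplified store keyed by state type.
final class SketchStore {
    // All access to `states` goes through this serial queue.
    private let syncQueue = DispatchQueue(label: "sketch.store.sync")
    private var states: [Any] = []

    // Adds the state, or replaces an existing state of the same type.
    func provide<T>(state: T) {
        syncQueue.sync {
            if let index = states.firstIndex(where: { $0 is T }) {
                states[index] = state
            } else {
                states.append(state)
            }
        }
    }

    // Returns the stored state of type T, if any. The dynamic cast happens
    // inside the queue, so it never observes a half-updated array.
    func currentState<T>() -> T? {
        return syncQueue.sync { () -> T? in
            for state in states {
                if let typed = state as? T {
                    return typed
                }
            }
            return nil
        }
    }
}
```

One design caveat with this approach: calling provide or currentState from code that is already running on syncQueue would deadlock, so the queue has to stay private to the store.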
hey thanks @erichoracek! I'll take a look today
Thanks @alanjcharles, we've since deployed that fix and have confirmed that it resolves the crash.
amazing. @haugli @erichoracek's fix is now available in Analytics-Swift 1.5.4 and Sovran-Swift 1.1.1. Please feel free to follow up if you have any other issues or the crash persists. Thanks!
Describe the bug
After integrating the Segment SDK in our iOS app, we've started seeing a number of crashes in an internal access of Analytics.configuration. This is currently our app's top crasher, although only a small percentage of users have experienced it.

To Reproduce
We haven't found a way to reproduce this crash, as it seems like it happens under specific timing that is hard to replicate.
The crash seems to always occur a few seconds after launching the app, immediately after a successful /projects/{key}/settings fetch.

Platform:

Stack Traces
There seem to be two common forms of the stack traces for this crash, below is an example of each.
Example 1:
Example 2: