amplitude / Amplitude-Swift

Native iOS/tvOS/macOS/watchOS SDK
MIT License
32 stars 22 forks source link

Flushing events on the main queue causing an app hang #154

Closed Elih96 closed 5 months ago

Elih96 commented 7 months ago

We were experiencing the HTTP 400 bug due to having a user id whose length < 5 which was causing OOMs consistently. Changing the minIdLength property to 1 seemed to fix that issue.

However the subsequent issue is, from what I can see, flushing the local events that were previously failing to send. the issue seems to be due to flushing happening on the main thread and causing a hang during getEventsAsString from PersistentStorage

Here's a stack trace from one of the hangs: App Hang: The app was terminated while unresponsive

0 CoreFoundation +0x6b88 _CF_IS_OBJC 1 +0x703e8001adcf5a5c 0x703e8001adcf5a5c 2 CoreFoundation +0x4b90 _CFStringCompareWithOptionsAndLocale 3 Foundation +0x255ddc specialized UnsafeMutableBufferPointer.stableSortImpl(by:) 4 Foundation +0x255ad4 specialized MutableCollection<>.sort(by:) 5 Foundation +0x43d70 JSONWriter.serializeObject(:depth:) 6 Foundation +0x454d4 JSONWriter.serializeArray(:depth:) 7 Foundation +0x43b44 JSONWriter.serializeJSON(:depth:) 8 Foundation +0x43170 JSONEncoder.encode(:value:) 9 Foundation +0x42f08 JSONEncoder.encode(:) 10 Foundation +0x42ec0 dispatch thunk of JSONEncoder.encode(:) 11 relevant: PersistentStorage.eventsToJSONString(events:) 12 relevant: PersistentStorage.readV2File(content:) 13 relevant: PersistentStorage.getEventsString(eventBlock:) 14 relevant: protocol witness for Storage.getEventsString(eventBlock:) in conformance PersistentStorage 15 relevant: closure #1 in EventPipeline.flush(completion:) 16 relevant: thunk for @callee_guaranteed () -> () 17 relevant: thunk for @escaping @callee_guaranteed () -> () 18 libdispatch.dylib +0x42fc dispatch_client_callout 19 libdispatch.dylib +0x136b0 dispatch_lane_barrier_sync_invoke_and_complete 20 relevant: EventPipeline.flush(completion:) 21 relevant: closure #1 in EventPipeline.init(amplitude:) 22 relevant: closure #1 in QueueTimer.init(interval:once:queue:handler:) 23 relevant: thunk for @escaping @callee_guaranteed () -> ()

Possible Solution

The QueueTimer's initializer takes in a queue that's used to create the DispatchSourceTimer, it's currently defaulting to main. Moving it off the main queue might make sense, didn't really dive more into the code.

Steps to Reproduce

I think this is what's happening

  • no min id length -> events failing to send and being saved locally
  • events stacking up locally
  • min id length set -> events now sending -> app hang due to above

Environment

  • SDK Version: 1.4.3
crleona commented 5 months ago

This should be fixed in 1.5.0.