We were experiencing the HTTP 400 bug due to having a user id whose length < 5 which was causing OOMs consistently.
Changing the minIdLength property to 1 seemed to fix that issue.
However the subsequent issue is, from what I can see, flushing the local events that were previously failing to send.
the issue seems to be due to flushing happening on the main thread and causing a hang during getEventsAsString from PersistentStorage
Here's a stack trace from one of the hangs:
App Hang: The app was terminated while unresponsive
0 CoreFoundation +0x6b88 _CF_IS_OBJC
1 +0x703e8001adcf5a5c 0x703e8001adcf5a5c
2 CoreFoundation +0x4b90 _CFStringCompareWithOptionsAndLocale
3 Foundation +0x255ddc specialized UnsafeMutableBufferPointer.stableSortImpl(by:)
4 Foundation +0x255ad4 specialized MutableCollection<>.sort(by:)
5 Foundation +0x43d70 JSONWriter.serializeObject(:depth:)
6 Foundation +0x454d4 JSONWriter.serializeArray(:depth:)
7 Foundation +0x43b44 JSONWriter.serializeJSON(:depth:)
8 Foundation +0x43170 JSONEncoder.encode(:value:)
9 Foundation +0x42f08 JSONEncoder.encode(:)
10 Foundation +0x42ec0 dispatch thunk of JSONEncoder.encode(:)
11 relevant: PersistentStorage.eventsToJSONString(events:)
12 relevant: PersistentStorage.readV2File(content:)
13 relevant: PersistentStorage.getEventsString(eventBlock:)
14 relevant: protocol witness for Storage.getEventsString(eventBlock:) in conformance PersistentStorage
15 relevant: closure #1 in EventPipeline.flush(completion:)
16 relevant: thunk for @callee_guaranteed () -> ()
17 relevant: thunk for @escaping @callee_guaranteed () -> ()
18 libdispatch.dylib +0x42fc dispatch_client_callout
19 libdispatch.dylib +0x136b0 dispatch_lane_barrier_sync_invoke_and_complete
20 relevant: EventPipeline.flush(completion:)
21 relevant: closure #1 in EventPipeline.init(amplitude:)
22 relevant: closure #1 in QueueTimer.init(interval:once:queue:handler:)
23 relevant: thunk for @escaping @callee_guaranteed () -> ()
Possible Solution
The QueueTimer's initializer takes in a queue that's used to create the DispatchSourceTimer, it's currently defaulting to main. Moving it off the main queue might make sense, didn't really dive more into the code.
Steps to Reproduce
I think this is what's happening
no min id length -> events failing to send and being saved locally
events stacking up locally
min id length set -> events now sending -> app hang due to above
We were experiencing the HTTP 400 bug due to having a user id whose length < 5 which was causing OOMs consistently. Changing the minIdLength property to 1 seemed to fix that issue.
However the subsequent issue is, from what I can see, flushing the local events that were previously failing to send. the issue seems to be due to flushing happening on the main thread and causing a hang during getEventsAsString from PersistentStorage
Here's a stack trace from one of the hangs: App Hang: The app was terminated while unresponsive
0 CoreFoundation +0x6b88 _CF_IS_OBJC 1 +0x703e8001adcf5a5c 0x703e8001adcf5a5c
2 CoreFoundation +0x4b90 _CFStringCompareWithOptionsAndLocale
3 Foundation +0x255ddc specialized UnsafeMutableBufferPointer.stableSortImpl(by:)
4 Foundation +0x255ad4 specialized MutableCollection<>.sort(by:)
5 Foundation +0x43d70 JSONWriter.serializeObject(:depth:)
6 Foundation +0x454d4 JSONWriter.serializeArray(:depth:)
7 Foundation +0x43b44 JSONWriter.serializeJSON(:depth:)
8 Foundation +0x43170 JSONEncoder.encode(:value:)
9 Foundation +0x42f08 JSONEncoder.encode(:)
10 Foundation +0x42ec0 dispatch thunk of JSONEncoder.encode(:)
11 relevant: PersistentStorage.eventsToJSONString(events:)
12 relevant: PersistentStorage.readV2File(content:)
13 relevant: PersistentStorage.getEventsString(eventBlock:)
14 relevant: protocol witness for Storage.getEventsString(eventBlock:) in conformance PersistentStorage
15 relevant: closure #1 in EventPipeline.flush(completion:)
16 relevant: thunk for @callee_guaranteed () -> ()
17 relevant: thunk for @escaping @callee_guaranteed () -> ()
18 libdispatch.dylib +0x42fc dispatch_client_callout
19 libdispatch.dylib +0x136b0 dispatch_lane_barrier_sync_invoke_and_complete
20 relevant: EventPipeline.flush(completion:)
21 relevant: closure #1 in EventPipeline.init(amplitude:)
22 relevant: closure #1 in QueueTimer.init(interval:once:queue:handler:)
23 relevant: thunk for @escaping @callee_guaranteed () -> ()
Possible Solution
The QueueTimer's initializer takes in a queue that's used to create the DispatchSourceTimer, it's currently defaulting to main. Moving it off the main queue might make sense, didn't really dive more into the code.
Steps to Reproduce
I think this is what's happening
Environment