aws-amplify / aws-sdk-ios

AWS SDK for iOS. For more information, see our web site:
https://aws-amplify.github.io/docs

Crash in AWSIoTMQTTClient.m line 635 -[AWSIoTMQTTClient openStreams:] #4404

Closed gadget-man closed 8 months ago

gadget-man commented 1 year ago

Describe the bug: I'm seeing a lot of crash reports in Crashlytics which all point to the same line: AWSIoTMQTTClient.m line 635 -[AWSIoTMQTTClient openStreams:].

We are currently using AWS v2.27.13

Steps to reproduce the behavior: Very difficult to reproduce - only seen intermittently on device when resuming from background.

Environment: SDK Version: [2.27.13], Dependency Manager: CocoaPods

Device Information

Stack Trace Crashed: Thread #1 EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x000000000000012e

Crashed: Thread
0  libobjc.A.dylib                0x6cf0 objc_opt_respondsToSelector + 52
1  CoreFoundation                 0x13b14c _inputStreamCallbackFunc + 48
2  CoreFoundation                 0x105fb4 _signalEventSync + 216
3  CoreFoundation                 0x117e84 _cfstream_solo_signalEventSync + 224
4  CoreFoundation                 0x10e42c _CFStreamSignalEvent + 304
5  CFNetwork                      0x147eb8 _CFNetworkErrorGetLocalizedDescription + 180236
6  CFNetwork                      0x1533e4 _CFNetworkErrorGetLocalizedDescription + 226616
7  CoreFoundation                 0x11a41c __CFSocketPerformV0 + 648
8  CoreFoundation                 0xd5f54 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28
9  CoreFoundation                 0xe232c __CFRunLoopDoSource0 + 176
10 CoreFoundation                 0x66210 __CFRunLoopDoSources0 + 244
11 CoreFoundation                 0x7bba8 __CFRunLoopRun + 836
12 CoreFoundation                 0x80ed4 CFRunLoopRunSpecific + 612
13 Foundation                     0x42334 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212
14 AWSIoT                         0x5a728 -[AWSIoTMQTTClient openStreams:] + 759 (AWSIoTMQTTClient.m:759)
15 Foundation                     0x5b808 __NSThread__start__ + 716
16 libsystem_pthread.dylib        0x16cc _pthread_start + 148
17 libsystem_pthread.dylib        0xba4 thread_start + 8

I see a very similar issue was previously reported at https://github.com/aws-amplify/aws-sdk-ios/issues/1209; however, it was reported as resolved in a prior build, and we are still seeing it in 2.27.13.

lostyoung commented 1 year ago

We have also hit this issue with SDK versions 2.27.7 and 2.28.5.

Banner2404 commented 1 year ago

The same issue in SDK 2.29.1

Banner2404 commented 1 year ago

I was able to reliably reproduce this issue (SDK 2.30.1) with the following code:

    // Repro loop: start a connection, tear it down 0.2 s later, then reconnect.
    func connect(_ iteration: Int) {
        print(iteration, "Connect")
        dataManager.connect(withClientId: UUID().uuidString, cleanSession: true, certificateId: "certificate") { status in
            print(iteration, "Connection callback", status.string)
        }
        // Disconnect shortly afterwards, possibly before the connection is established.
        DispatchQueue.main.asyncAfter(deadline: .now() + 0.2) {
            print(iteration, "Disconnect")
            self.dataManager.disconnect()
            print(iteration, "Reconnect")
            self.connect(iteration + 1)
        }
    }

Here I start a connection to the MQTT broker, then after a short delay terminate it and start another one. In practice this can happen when a user doesn't wait for the connection to be established and switches to another IoT device.

The crash always occurs on a background thread, but the stack trace varies; it seems to crash in different parts of the AWSIoTMQTTClient code. The crash may also not occur immediately, but only after several reconnection loops.
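
For apps stuck on an affected SDK version, one app-side way to avoid the rapid connect/disconnect/reconnect pattern described above is to coalesce device switches so that only one connection attempt is in flight at a time. The following is a minimal sketch under assumptions, not something proposed or verified in this thread, and it does not fix the SDK crash itself. `MQTTTransport` and `ConnectionSwitcher` are hypothetical names; a real adapter would wrap the data manager's connect and disconnect calls and invoke `onSettled` once the status callback reports a terminal state (connected or an error).

```swift
import Foundation

/// Hypothetical abstraction over the MQTT client; not part of the AWS SDK.
protocol MQTTTransport: AnyObject {
    /// `onSettled` must be called once the attempt reaches a terminal state.
    func connect(clientId: String, onSettled: @escaping () -> Void)
    func disconnect()
}

/// Serializes connection switches. While an attempt is in flight, only the
/// most recently requested client ID is remembered and started afterwards.
/// Call from a single queue (for example the main queue); it is not thread-safe.
final class ConnectionSwitcher {
    private let transport: MQTTTransport
    private var inFlight = false
    private var pendingClientId: String?

    init(transport: MQTTTransport) {
        self.transport = transport
    }

    func switchTo(clientId: String) {
        if inFlight {
            // Coalesce: remember only the latest request instead of
            // disconnecting and reconnecting immediately.
            pendingClientId = clientId
            return
        }
        start(clientId)
    }

    private func start(_ clientId: String) {
        inFlight = true
        transport.connect(clientId: clientId) { [weak self] in
            guard let self = self else { return }
            self.inFlight = false
            guard let next = self.pendingClientId else { return }
            self.pendingClientId = nil
            // The previous attempt has settled, so tear it down and move on.
            self.transport.disconnect()
            self.start(next)
        }
    }
}
```

With this shape, three quick calls to `switchTo(clientId:)` result in one connection attempt for the first ID, followed by a single disconnect and one attempt for the last ID once the first attempt settles; the intermediate request is dropped.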

atierian commented 1 year ago

@Banner2404 I was able to reproduce this issue on 2.27.7, but not on 2.30.1. We also recently fixed a threading issue in IoT that was released in 2.30.3. Can you try that version to see if it resolves your issue? Thanks!

atierian commented 1 year ago

> I was able to reproduce this issue on 2.27.7, but not on 2.30.1.

Actually, I was able to reproduce this on 2.30.1 and 2.30.2. However, it appears that the fix in 2.30.3 I mentioned above fixes this. I've run your repro test case 20 times each with at least 250 iterations of connect + disconnect + reconnect and haven't encountered the crash. Please upgrade to 2.30.3 to test this yourself and confirm whether this resolves your issue. Thanks for your patience!

Banner2404 commented 1 year ago

I tested with the same code snippet as above and the issue seems to be resolved in 2.30.4. We will update the production app to see whether it resolves the crashes for customers. Thank you, @atierian!

atierian commented 1 year ago

Great, thanks for letting us know!

atierian commented 1 year ago

Closing this issue as resolved. If you encounter it on >= 2.30.3, please comment on this issue (feel free to ping me) and I'll reopen, or you can open a new issue. Thanks for your patience!

Banner2404 commented 1 year ago

@atierian Unfortunately, it seems the issue is still present in SDK 2.30.4. It no longer reproduces as reliably with my code snippet above, but if you run the snippet, put the app into the background, and then return to the foreground, the app sometimes crashes. See the attached video for an example. I also see that many user crashes happen when the app enters or exits the background, so it may be related to the reconnection logic, since iOS starts closing all sockets when the app enters the background. Tested using Xcode 14.3 and the iOS 16.4 simulator.

https://user-images.githubusercontent.com/17294609/230130219-d21850f2-3fcf-4d5f-91ad-50c12f1eeb08.mp4
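
Since the crashes appear to correlate with background/foreground transitions, an app could disconnect explicitly before iOS closes the sockets and reconnect on return. The sketch below is an assumption-laden illustration, not a mitigation confirmed in this thread. `LifecycleConnectionObserver` is a hypothetical helper; the notification names are passed in as parameters so the sketch compiles with Foundation alone.

```swift
import Foundation

/// Hypothetical helper: runs one closure when the app goes to the background
/// and another when it returns to the foreground.
final class LifecycleConnectionObserver {
    private let center: NotificationCenter
    private var tokens: [NSObjectProtocol] = []

    init(center: NotificationCenter = .default,
         backgroundName: Notification.Name,
         foregroundName: Notification.Name,
         onBackground: @escaping () -> Void,
         onForeground: @escaping () -> Void) {
        self.center = center
        // queue: nil delivers the block synchronously on the posting thread.
        tokens.append(center.addObserver(forName: backgroundName, object: nil, queue: nil) { _ in
            onBackground()
        })
        tokens.append(center.addObserver(forName: foregroundName, object: nil, queue: nil) { _ in
            onForeground()
        })
    }

    deinit {
        // Block-based observers must be removed explicitly.
        tokens.forEach { center.removeObserver($0) }
    }
}
```

In an app, the observer would be created with `UIApplication.didEnterBackgroundNotification` and `UIApplication.willEnterForegroundNotification`, calling `dataManager.disconnect()` in `onBackground` and the app's own connect routine in `onForeground`, and it must be retained for as long as the behavior is wanted.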

atierian commented 1 year ago

Thanks for bringing this to our attention. I'm reopening the issue and we'll investigate. Can you please paste the full backtrace shown in the debug navigator on a crash? That'll help us confirm that we've reproduced the same issue you're experiencing. Thanks!

Banner2404 commented 1 year ago

Sure, here are some crash logs from end users:
2023-04-0507-31-08.1485-0400-8af3ebb440c3d3a5d246ad48206cefbacf7bbf79.txt (most often)
2023-03-2807-40-54.0939-0700-eb8d1adb9770a04a3101d9750ff31a9d100c7035.txt
2023-04-0308-07-23.4956-0700-6e15f6cc62eb6eb2c21d025ce3a0b37fda004620.txt
2023-03-2718-36-01.1343-0400-4b140026dd4b073d7a20a46a01c25569fffbea7b.txt
2023-04-0406-26-47.6522-0700-fac7fde5541cd99f2c860e6a3bf0b4f02637684c.txt
Let me know if you need more.

atierian commented 1 year ago

Perfect, thanks! We'll follow up here with any updates.

r-rebacz commented 11 months ago

@atierian any news on this? I'm also able to reproduce this issue with AWS SDK v2.33.4. The crash log is very similar to the ones attached by @Banner2404.

xaviervautier commented 11 months ago

We are also experiencing that issue. @atierian, please let us know if you have a timeline for resolution.

DanielHay-Biobeat commented 10 months ago

@atierian Any news on the issue? I'm also able to reproduce this issue with AWS SDK V2.33.4.

naresh-kumar-ios commented 9 months ago

@atierian any update on this issue? I am also facing the same issue with v2.33.4 while syncing data in the background.

atierian commented 9 months ago

We're prioritizing looking into this issue and related IoT issues -- we'll provide an update when we have one. Thanks for all of your patience here.

jonduenas commented 9 months ago

Just chiming in to say we're encountering this crash as well on v2.30.4. Stack traces from our crash reports on Crashlytics:

Crashed: Thread
0  libobjc.A.dylib                0x20dc class_rw_t::ro() const + 44
1  libobjc.A.dylib                0xaba0 realizeClassMaybeSwiftMaybeRelock(objc_class*, locker_mixin<lockdebug::lock_mixin<objc_lock_base_t> >&, bool) + 104
2  libobjc.A.dylib                0x9bb0 lookUpImpOrForward + 880
3  libobjc.A.dylib                0x4cc4 _objc_msgSend_uncached + 68
4  CoreFoundation                 0x126174 _inputStreamCallbackFunc + 48
5  CoreFoundation                 0xb8778 _signalEventSync + 216
6  CoreFoundation                 0xb8628 _cfstream_shared_signalEventSync + 392
7  CoreFoundation                 0x3731c __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28
8  CoreFoundation                 0x36598 __CFRunLoopDoSource0 + 176
9  CoreFoundation                 0x34d4c __CFRunLoopDoSources0 + 244
10 CoreFoundation                 0x33a88 __CFRunLoopRun + 828
11 CoreFoundation                 0x33668 CFRunLoopRunSpecific + 608
12 Foundation                     0x2c54c -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212
13 AWSIoT                         0x84dc -[AWSIoTMQTTClient openStreams:] + 759 (AWSIoTMQTTClient.m:759)
14 Foundation                     0xb1184 __NSThread__start__ + 732
15 libsystem_pthread.dylib        0x24d4 _pthread_start + 136
16 libsystem_pthread.dylib        0x1a10 thread_start + 8

Also

Crashed: Thread
0  libobjc.A.dylib                0x3c504 objc_opt_respondsToSelector + 48
1  CoreFoundation                 0x125fa4 <redacted> + 48
2  CoreFoundation                 0xb8570 <redacted> + 216
3  CoreFoundation                 0xb8420 <redacted> + 392
4  CoreFoundation                 0x370ac <redacted> + 28
5  CoreFoundation                 0x36328 <redacted> + 176
6  CoreFoundation                 0x34adc <redacted> + 244
7  CoreFoundation                 0x33818 <redacted> + 828
8  CoreFoundation                 0x333f8 CFRunLoopRunSpecific + 608
9  Foundation                     0x2c3ec <redacted> + 212
10 AWSIoT                         0x84dc -[AWSIoTMQTTClient openStreams:] + 759 (AWSIoTMQTTClient.m:759)
11 Foundation                     0xafd40 <redacted> + 732
12 libsystem_pthread.dylib        0x24d4 _pthread_start + 136
13 libsystem_pthread.dylib        0x1a10 thread_start + 8

Not sure why that second one has "redacted" everywhere. I can look into resolving that if it would be helpful.

lawmicha commented 8 months ago

Hello, we released 2.33.9, which includes an IoT crash fix (https://github.com/aws-amplify/aws-sdk-ios/pull/5185). Please upgrade to it and let us know if you are still experiencing the issue.