NordicSemiconductor / IOS-DFU-Library

OTA DFU Library for Mac and iOS, compatible with nRF5x SoCs
http://www.nordicsemi.com
BSD 3-Clause "New" or "Revised" License
526 stars 215 forks source link

Accumulating crashes #465

Closed DevAndArtist closed 2 years ago

DevAndArtist commented 3 years ago

DFU Bootloader version:

Device information:

Describe the bug:

I personally couldn't reproduce the issue on my device, but we're getting more crash reports of the exact same issue from the App Store.

So the library sets the value to nil but it also expects it to be non-nil at some point.

I also found that on 30. Jul this exact maxSize was changed from being 0 to nil:

https://github.com/NordicSemiconductor/IOS-DFU-Library/commit/cb29e88a06944ba559e2015cef61d4a4e677330c#diff-f6837471832c7cb7d1597a7ce159f2300672e4b727ee83be65b8c55918a74958L290-R290

Logs:

Stack trace from the crashes (always the same):

0  App                            0x645558 closure #1 in SecureDFUPeripheral.selectCommandObject() + 197 (SecureDFUPeripheral.swift:197)
1  App                            0x63b6dc specialized SecureDFUControlPoint.peripheral(_:didUpdateValueFor:error:) + 633 (SecureDFUControlPoint.swift:633)
2  App                            0x639e78 @objc SecureDFUControlPoint.peripheral(_:didUpdateNotificationStateFor:error:) + 4313259640
3  CoreBluetooth                  0x1ec44 -[CBPeripheral handleAttributeEvent:args:attributeSelector:delegateSelector:delegateFlag:] + 228
4  CoreBluetooth                  0x1edb0 -[CBPeripheral handleCharacteristicEvent:characteristicSelector:delegateSelector:delegateFlag:] + 124
5  CoreBluetooth                  0x1b36c -[CBPeripheral handleMsg:args:] + 364
6  CoreBluetooth                  0x10150 -[CBCentralManager handleMsg:args:] + 196
7  CoreBluetooth                  0x37f78 -[CBManager xpcConnectionDidReceiveMsg:args:] + 208
8  CoreBluetooth                  0x2b874 __30-[CBXpcConnection _handleMsg:]_block_invoke + 68
9  libdispatch.dylib              0x2a84 _dispatch_call_block_and_release + 32
10 libdispatch.dylib              0x481c _dispatch_client_callout + 20
11 libdispatch.dylib              0xc004 _dispatch_lane_serial_drain + 620
12 libdispatch.dylib              0xcc34 _dispatch_lane_invoke + 456
13 libdispatch.dylib              0xbecc _dispatch_lane_serial_drain + 308
14 libdispatch.dylib              0xcc00 _dispatch_lane_invoke + 404
15 libdispatch.dylib              0x174bc _dispatch_workloop_worker_thread + 764
16 libsystem_pthread.dylib        0x37a4 _pthread_wqthread + 276
17 libsystem_pthread.dylib        0xa74c start_wqthread + 8
DevAndArtist commented 3 years ago

cc @philips77 any updates on this?

everlof commented 2 years ago

We've seen exactly this crash as well. Here's our stack trace:

* thread #1, queue = 'com.apple.main-thread', stop reason = Fatal error: Unexpectedly found nil while unwrapping an Optional value
    frame #0: 0x00000001a0f73060 libswiftCore.dylib`_swift_runtime_on_report
    frame #1: 0x00000001a0fde6a4 libswiftCore.dylib`_swift_stdlib_reportFatalErrorInFile + 188
    frame #2: 0x00000001a0cb0e90 libswiftCore.dylib`closure #1 (Swift.UnsafeBufferPointer<Swift.UInt8>) -> () in closure #1 (Swift.UnsafeBufferPointer<Swift.UInt8>) -> () in closure #1 (Swift.UnsafeBufferPointer<Swift.UInt8>) -> () in 
Swift._assertionFailure(_: Swift.StaticString, _: Swift.StaticString, file: Swift.StaticString, line: Swift.UInt, flags: Swift.UInt32) -> Swift.Never + 380
    frame #3: 0x00000001a0cb05a0 libswiftCore.dylib`Swift._assertionFailure(_: Swift.StaticString, _: Swift.StaticString, file: Swift.StaticString, line: Swift.UInt, flags: Swift.UInt32) -> Swift.Never + 376
  * frame #4: 0x0000000101473b60 Module`closure #1 in SecureDFUPeripheral.selectCommandObject(response=NordicDFU.SecureDFUResponse @ 0x000000016fa7dca0, self=0x0000000280cf2700) at SecureDFUPeripheral.swift:188:39
    frame #5: 0x0000000101464118 Module `SecureDFUControlPoint.peripheral(peripheral=0x00000002808a0fa0, characteristic=0x0000000281dc1500, error=nil, self=0x0000000281dc36c0) at SecureDFUControlPoint.swift:621:25
    frame #6: 0x0000000101464d98 Module `@objc SecureDFUControlPoint.peripheral(_:didUpdateValueFor:error:) at <compiler-generated>:0
    frame #7: 0x00000001b58d09c0 CoreBluetooth`-[CBPeripheral handleAttributeEvent:args:attributeSelector:delegateSelector:delegateFlag:] + 224
    frame #8: 0x00000001b58d0b04 CoreBluetooth`-[CBPeripheral handleCharacteristicEvent:characteristicSelector:delegateSelector:delegateFlag:] + 120
    frame #9: 0x00000001b58cd290 CoreBluetooth`-[CBPeripheral handleMsg:args:] + 352
    frame #10: 0x00000001b58c2b34 CoreBluetooth`-[CBCentralManager handleMsg:args:] + 184
    frame #11: 0x00000001b58e89e8 CoreBluetooth`-[CBManager xpcConnectionDidReceiveMsg:args:] + 204
    frame #12: 0x00000001b58dcc18 CoreBluetooth`__30-[CBXpcConnection _handleMsg:]_block_invoke + 64
    frame #13: 0x0000000107573ae8 libdispatch.dylib`_dispatch_call_block_and_release + 24
    frame #14: 0x000000010757532c libdispatch.dylib`_dispatch_client_callout + 16
    frame #15: 0x000000010757c38c libdispatch.dylib`_dispatch_lane_serial_drain + 748
    frame #16: 0x000000010757d044 libdispatch.dylib`_dispatch_lane_invoke + 500
    frame #17: 0x00000001075835e4 libdispatch.dylib`_dispatch_main_queue_callback_4CF + 580
    frame #18: 0x000000019d29e2e0 CoreFoundation`__CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 12
    frame #19: 0x000000019d298740 CoreFoundation`__CFRunLoopRun + 2528
    frame #20: 0x000000019d297818 CoreFoundation`CFRunLoopRunSpecific + 572
    frame #21: 0x00000001b399d570 GraphicsServices`GSEventRunModal + 160
    frame #22: 0x000000019fbc30e8 UIKitCore`-[UIApplication _run] + 1052
    frame #23: 0x000000019fbc8664 UIKitCore`UIApplicationMain + 164
    frame #24: 0x0000000100cdc4e0 Module `main at AppDelegate.swift:17:7
    frame #25: 0x000000019cf76140 libdyld.dylib`start + 4

I've got the logs from the library as well, but I don' feel comfortable posting them here, if Nordic Semiconductors are interested I'd be happy to share them.

danielblock commented 2 years ago

Our iPhone 6s test device always gets this crash. For our newer devices it's not as common but we have seen it on an iPhone 13. Both devices are running iOS 15.x.

philips77 commented 2 years ago

Hi All,

@DevAndArtist, thank you for the excellent investigation. Sorry for coming here so late, I was busy with other projects. @everlof and @danielblock, thank you for your input. @danielblock could you share logs from logDelegate from that iPhone 6s? That would be very helpful.

philips77 commented 2 years ago

We're also observing the issue in some of our apps, but unable to crash it locally, so can't get logs. @dinesharjani has made a quick crash fix: https://github.com/NordicSemiconductor/IOS-DFU-Library/pull/480 which tries to ensure, that the response we got is actually for the request we sent. If not, it will abort DFU process.

However, I think that ignoring the incorrect response may be a better solution. Let me explain.

As you said, the crash happens when response is either .selectObject, or .calculateChecksum. The requests are never sent in parallel. When an object is fully sent, the library sends .calculateChecksum, receives a response, then sends .execute command, and only then .selectObject. Each commands awaits for the response. So it is not possible to get .calculateChecksum response for .selectObject command.

But.

The response for .calculateChecksum is IDENTICAL to a Package Receipt Notification. If PRNs are enabled, and the next one is expected JUST AFTER the last packet of an object was sent, we may have a race condition.

When the last data packet was sent, we call complete() here: https://github.com/NordicSemiconductor/IOS-DFU-Library/blob/cb29e88a06944ba559e2015cef61d4a4e677330c/iOSDFULibrary/Classes/Implementation/SecureDFU/Characteristics/SecureDFUPacket.swift#L157-L162 or here: https://github.com/NordicSemiconductor/IOS-DFU-Library/blob/cb29e88a06944ba559e2015cef61d4a4e677330c/iOSDFULibrary/Classes/Implementation/SecureDFU/Characteristics/SecureDFUPacket.swift#L231-243

This is propagated to https://github.com/NordicSemiconductor/IOS-DFU-Library/blob/cb29e88a06944ba559e2015cef61d4a4e677330c/iOSDFULibrary/Classes/Implementation/SecureDFU/Services/SecureDFUService.swift#L401-L411

which calls: https://github.com/NordicSemiconductor/IOS-DFU-Library/blob/cb29e88a06944ba559e2015cef61d4a4e677330c/iOSDFULibrary/Classes/Implementation/SecureDFU/Characteristics/SecureDFUControlPoint.swift#L400-L402

Now, have a look at the code where the notifications are handled: https://github.com/NordicSemiconductor/IOS-DFU-Library/blob/cb29e88a06944ba559e2015cef61d4a4e677330c/iOSDFULibrary/Classes/Implementation/SecureDFU/Characteristics/SecureDFUControlPoint.swift#L611-L634

If the PRN is expected after the last packet, the code awaits it, so there should be no issues. However, otherwise, no PRNs are expected later, when checksum or execute were requested. Bug it there comes one, proceed in line 613 is nil, so the PRN won't even be created line below, and will be propagated as SecureDFUResponse and decoded as a response to .calculateChecksum.

This seems to depend on the file sizes (last object may have different number of packets), the PRN value and the MTU set by the phone (different iPhones support different max MTU). There may be a combination of PRN, MTU, fw version, fw image size producing the issue. @danielblock could you also share the HEX files and ZIP files that you're using for DFU on that phone, apart from logs?

philips77 commented 2 years ago

480 and #483 should fix the crash, but perhaps the DFU process will freeze waiting for the .selectObject response.

My assumption was, that the .calculateChecksum response (or a PRN) came unwanted, but the proper response will follow, which should resume the upload.

If not, we can revert #483 and by keeping #480 the DFU will finish with an error (still better than a crash).

I'm going to make a release today. Please, test.