getsentry / sentry-cocoa

The official Sentry SDK for iOS, tvOS, macOS, watchOS.
https://sentry.io/for/cocoa/
MIT License
797 stars 315 forks source link

Crash during SDK initialization, presumably due to corrupted envelope data from a previous run #4280

Closed MrMage closed 1 week ago

MrMage commented 4 weeks ago

Platform

macOS

Environment

Production

Installed

CocoaPods

Version

8.33.0

Xcode Version

15.2

Did it work on previous versions?

No response

Steps to Reproduce

No repeatable reproduction steps just yet, but I hope that inspecting the code around SentrySerialization.m:225 is sufficient to guard against this issue in the future.

Expected Result

No crash, even when trying to process corrupted envelopes.

Actual Result

I have received a report of my app crashing at launch from a user. No reports of this crash could be found in Sentry, however.

Manual inspection of the crash report revealed the reason for that: The crash appears to be caused by a crash in Sentry itself while it attempts to send cached envelopes, possibly for a previous crash. My suspicion is that the previous envelope is corrupted, and the code responsible for sending the cached envelopes is not prepared for that.

I can't share a full crash report at this time (it wouldn't be very useful without my debug symbols), anyway. However, I have manually desymbolicated the stack trace, and the following call sites are involved:

sentrycrashfu_lastPathEntry (in XXX) (SentryCrashFileUtils.c:190)
-[SentryCrashExceptionApplication reportException:] (in XXX) (SentryCrashExceptionApplication.m:19)
+[SentrySerialization envelopeWithData:] (in XXX) (SentrySerialization.m:225)
+[SentrySerialization envelopeWithData:] (in XXX) (SentrySerialization.m:225)
-[SentryHttpTransport sendAllCachedEnvelopes] (in XXX) (SentryHttpTransport.m:291)
-[SentryHttpTransport initWithOptions:cachedEnvelopeSendDelay:fileManager:requestManager:requestBuilder:rateLimits:envelopeRateLimit:dispatchQueueWrapper:] (in XXX) (SentryHttpTransport.m:97)
+[SentryTransportFactory initTransports:sentryFileManager:currentDateProvider:] (in XXX) (SentryTransportFactory.m:69)
-[SentryClient initWithOptions:fileManager:deleteOldEnvelopeItems:] (in XXX) (SentryClient.m:103)
-[SentryClient initWithOptions:dispatchQueue:deleteOldEnvelopeItems:] (in XXX) (SentryClient.m:93)
-[SentryClient initWithOptions:] (in XXX) (SentryClient.m:75)
+[SentrySDK startWithOptions:] (in XXX) (SentrySDK.m:215)
+[SentrySDK startWithConfigureOptions:] (in XXX) (SentrySDK.m:255)

SentrySerialization.m:225 is NSData *itemBody = [data subdataWithRange:NSMakeRange(i + 1, bodyLength)];. My suspicion is that this line is called with invalid values, causing an out-of-bounds exception. As the SDK is still in its own startup phase, it can not catch the resulting exception nor upload a crash report for itself.

Are you willing to submit a PR?

No, but see my suggestion for a fix below.

MrMage commented 4 weeks ago

Update: I was able to reliably reproduce the crash by e.g. putting the following envelope file in my Sentry SDK's "envelopes" directory:

{"sdk":{"name":"sentry.cocoa","version":"8.33.0","packages":{"name":"cocoapods:getsentry\/sentry.cocoa","version":"8.33.0"}}}
{"type":"session","length":316}

Note how there is no attachment after the {"type":"session","length":316} line, and that is exactly the cause of the crash.

The triggered exception is Thread 1: "*** -[Foundation.__NSSwiftData subdataWithRange:]: range {158, 316} exceeds data length 157".

MrMage commented 4 weeks ago

I have since received a second report from another user of the same crash. This is particularly bad because I can't even tell how many users are affected in total specifically because it happens before the Sentry SDK has been fully initialized, so no crash reports get sent whatsoever.

kahest commented 4 weeks ago

@MrMage thank you for the detailed report, we'll investigate this shortly

MrMage commented 4 weeks ago

Thanks for the quick reply! FYI, I have confirmed that the following two extra if statements can be used to work around the issue:

            if (i > data.length) {
                // The following read would be out of bounds; for a well-formed envelope, this would not happen, but we
                // still guard against it to avoid a crash in case of a corrupted envelope.
                break;
            }
            NSData *itemHeaderData =
                [data subdataWithRange:NSMakeRange(itemHeaderStart, i - itemHeaderStart)];
            if (i + 1 + bodyLength > data.length) {
                // The following read would be out of bounds; for a well-formed envelope, this would not happen, but we
                // still guard against it to avoid a crash in case of a corrupted envelope.
                break;
            }
            NSData *itemBody = [data subdataWithRange:NSMakeRange(i + 1, bodyLength)];

However, these do not contain any SentryLog statements yet, and I have also not written any tests for them. So you could use this as a starting point for a fix, but I would strongly recommend to add tests that confirm the issue in the current state, and also confirm that the fix works under all circumstances (e.g. just 1 byte of attachment length, etc.).

(Also, I just noticed that the code I inserted uses tabs instead of spaces; sorry about that 😅)

philipphofmann commented 4 weeks ago

SentrySerialization.dataWithEnvelope does a couple of calls to NSMakeRange that can easily lead to a crash if not handled carefully.

https://github.com/getsentry/sentry-cocoa/blob/464117d76c03363a8a5699b3edd07c57398c9518/Sources/Sentry/SentrySerialization.m#L95

https://github.com/getsentry/sentry-cocoa/pull/4281 could fix one part of the problem, but we have to investigate further what could cause this problem.

grzegorzkrukowski commented 3 weeks ago

We are getting similar reports from multiple users:

*** Terminating app due to uncaught exception 'NSRangeException', reason: '*** -[Foundation.__NSSwiftData subdataWithRange:]: range {158, 290} exceeds data length 157'
*** First throw call stack:
(
    0   CoreFoundation                      0x00000001977532ec __exceptionPreprocess + 176
    1   libobjc.A.dylib                     0x000000019723a788 objc_exception_throw + 60
    2   Foundation                          0x00000001987ff63c -[NSProgress initWithParent:userInfo:] + 0
kahest commented 3 weeks ago

@grzegorzkrukowski thanks for reporting, can you upgrade to version 8.34.0? Also does this happen only on macOS, or do you also see reports on other platforms/devices?

grzegorzkrukowski commented 3 weeks ago

@grzegorzkrukowski thanks for reporting, can you upgrade to version 8.34.0? Also does this happen only on macOS, or do you also see reports on other platforms/devices?

We are planning to test a new release with 8.34.0 with our affected users today (the crash is from 8.33). Also, we are only on macOS, so we cannot confirm any crashes on any other platforms, sorry.

kahest commented 3 weeks ago

@grzegorzkrukowski thanks for the quick reply, that's helpful - please report back if the problems persist with 8.34.0.

bruno-garcia commented 3 weeks ago

Came up via support too, they seem to be also on 8.33.0 but not sure if issue started after an upgrade.

kahest commented 3 weeks ago

Update: we decided to revert GH-4219 as we suspect it might include the change that can write invalid envelope records. This will be released as part of 8.35.0. We will spend some time to re-visit GH-4219 and make sure we can land the memory footprint optimization without side effects.

dylancom commented 2 weeks ago

I get this crash also on 8.32.0, which didn't include GH-4219.

objc_exception_throw + 60 (objc-exception.mm:356)
[NSData(NSData) subdataWithRange:] + 596 (NSData.m:674)
[SentrySerialization envelopeWithData:] + 1360 (SentrySerialization.m:225)
[SentryHttpTransport sendAllCachedEnvelopes]
[SentryHttpTransport initWithOptions:cachedEnvelopeSendDelay:fileManager:requestManager:requestBuilder:rateLimits:envelopeRateLimit:dispatchQueueWrapper:]
[SentryTransportFactory initTransports:sentryFileManager:currentDateProvider:]
[SentryClient initWithOptions:fileManager:deleteOldEnvelopeItems:]
[SentryClient initWithOptions:dispatchQueue:deleteOldEnvelopeItems:
[SentryClient initWithOptions:] + 76 (SentryClient.m:75)
[SentrySDK startWithOptions:] + 444 (SentrySDK.m:215)
kahest commented 2 weeks ago

@dylancom did you use a newer SDK version before and then downgrade? Can you provide more information on the platforms (iOS, macOS,...) and options you have enabled?

dylancom commented 2 weeks ago

It happened after I upgraded from"@sentry/react-native": "^5.16.0" to "@sentry/react-native": "^5.29.0". Which upgraded Sentry/HybridSDK (= 8.17.1) to Sentry/HybridSDK (8.33.0). I tried to downgrade to "@sentry/react-native": "5.27.0", which relies on Sentry/HybridSDK (8.32.0) but it still caused the same crash.

So now I went back to "@sentry/react-native": "^5.16.0".

Platform: iOS, React Native. App Wrapped in Sentry.Wrap(). With just one option: tracesSampleRate: 0.0.

kahest commented 2 weeks ago

@dylancom thank you for the quick reply and confirming the versions. The most probable cause is that sentry-cocoa 8.33.0 created the invalid envelope, versions of 8.34.0+ include a fix for the crash, and 8.35.0+ include the fix for the envelopes. We will ship a release of @sentry/react-native with both fixes soon

MrMage commented 2 weeks ago

@dylancom thank you for the quick reply and confirming the versions. The most probable cause is that sentry-cocoa 8.33.0 created the invalid envelope, versions of 8.34.0+ include a fix for the crash, and 8.35.0+ include the fix for the envelopes. We will ship a release of @sentry/react-native with both fixes soon

@kahest, did you consider making your guard against the crash more defensive, as I suggested in https://github.com/getsentry/sentry-cocoa/issues/4280#issuecomment-2286657051? Currently, the code is checking for an envelope missing altogether, but not for a partially-written envelope. Would it make sense to also check for partially-written envelopes (i.e. >= instead of ==), just to be even more robust and defensive against issues like this?

kahest commented 2 weeks ago

@mrmage yes, see https://github.com/getsentry/sentry-cocoa/pull/4297. For clarity/context, we didn't see crashes in SDK versions that have the original fix (8.34+), therefore we consider the fix effective, but we're adding another safeguard and tests now.

MrMage commented 2 weeks ago

@MrMage yes, see #4297. For clarity/context, we didn't see crashes in SDK versions that have the original fix (8.34+), therefore we consider the fix effective, but we're adding another safeguard and tests now.

Thank you! I agree that the fix is most likely effective under nearly all circumstances, but given that the impact of the issue when it occurs is fairly catastrophic (a crash without any crash reporting), I appreciate the additional safeguards, just in case.

kahest commented 1 week ago

Closing this now. The crash and root cause are fixed in 8.35.0+. Further work on reducing memory footprint during envelope serialization is tracked in https://github.com/getsentry/sentry-cocoa/issues/3630.