malcommac closed 1 year ago
Definitely something we're looking into. Not something we'll do in the next couple of months though, but sometime after that, this will become higher priority.
Is there something specific you have in mind? As we're in early stages, feedback can really help us drive the direction we take here
We have implemented MetricKit in our app. One disadvantage of MetricKit is that you can't easily add custom events or transactions, because it takes a static string for the transaction name in signPost, and there's no ability to add transactions with a runtime name, which is very important.
Besides that, MetricKit works on iOS 13+ and its diagnostic metrics are only available on iOS 14+.
Thanks for the input, @ialimz. Those are the primary reasons why we didn't get started with it yet.
The docs of MetricKit state
The system delivers metric reports about the previous 24 hours to a registered app at most once per day, and delivers diagnostic reports immediately in iOS 15 and later and macOS 12 and later.
We could look into how to merge the diagnostic reports that get delivered immediately somehow with transactions.
This is roughly how you can subscribe to the data of MetricKit.
import MetricKit

class AppMetrics: NSObject, MXMetricManagerSubscriber {
    // Start receiving payloads by registering as a subscriber.
    func receiveReports() {
        let shared = MXMetricManager.shared
        shared.add(self)
    }

    // Stop receiving payloads.
    func pauseReports() {
        let shared = MXMetricManager.shared
        shared.remove(self)
    }

    // Receive daily metrics.
    func didReceive(_ payloads: [MXMetricPayload]) {
        // Process metrics.
        let enumerator = payloads.first?.applicationLaunchMetrics?.histogrammedTimeToFirstDraw.bucketEnumerator
        while let object = enumerator?.nextObject() {
            // Skip unexpected objects instead of crashing on a forced cast.
            guard let bucket = object as? MXHistogramBucket<UnitDuration> else { continue }
            // ...
        }
    }

    // Receive diagnostics immediately when available.
    func didReceive(_ payloads: [MXDiagnosticPayload]) {
    }
}
#import <MetricKit/MetricKit.h>

// ...

API_AVAILABLE(ios(16.0))
@interface SentryAppMetrics : NSObject <MXMetricManagerSubscriber>
@end

API_AVAILABLE(ios(16.0))
@implementation SentryAppMetrics

- (void)didReceiveDiagnosticPayloads:(NSArray<MXDiagnosticPayload *> *)payloads {
    for (MXDiagnosticPayload *payload in payloads) {
        for (MXAppLaunchDiagnostic *launchDiagnostic in payload.appLaunchDiagnostics) {
            NSString *message = [NSString stringWithFormat:@"Launch duration: %f", launchDiagnostic.launchDuration.doubleValue];
            [SentrySDK captureMessage:message];
        }
    }
}

// ...
if (@available(iOS 16.0, *)) {
    self.appMetrics = [[SentryAppMetrics alloc] init];
}
@end
Plug in a real device and click Simulate MetricKit Payloads in Xcode.
Can we convert stack traces coming from MetricKit into Sentry events?
I'd love it if this is possible.
I wrote up a state of the union of what MetricKit provides and what gaps/holes we currently do not track. Please let me know how we could best proceed 👍
Thanks a lot, @filip-doordash, for sharing your document.
I took a look at the MXCrashDiagnostic. The MXCallStackTree has one field, JSONRepresentation, which contains the information on the stack trace. We would need to parse this JSON and somehow convert it into a SentryEvent. It seems like this should be possible, but I'm not 100% sure yet if we get all the data to symbolicate the stacktrace in Sentry. So I'm pretty sure it's possible, but I'm not sure about the effort.
A significant downside of MetricKit is its delay of 24 to 48 hours. That's why we haven't adopted it yet: that delay could destroy your business if the roof is on fire. Furthermore, we cannot attach as much context as we can when creating the events in real time. One idea just popped into my head: why not send both? We could add an integration for MetricKit that is disabled by default, so you get all issues with less context and keep all the other features of the SentrySDK. Does that make sense, or do you wish to replace Sentry crash reporting with MetricKit MXCrashDiagnostic, @filip-doordash?
Useful resources:
I don't think I'd want to utilize MetricKit for MXCrashDiagnostic. I think our primary targets are:
But your strategy remains the same! We need to symbolicate the stack traces regardless.
Thanks for clarifying that, @filip-doordash. Yes, the items you mentioned totally make sense, and the 24 to 48 hour delay won't be a huge deal for these.
I mainly see this as augmenting what's already available, e.g. supplying breadcrumbs for CPU exceptions/disk errors/hang/app exit/cellular condition events, adding battery/CPU/GPU/memory/IO gauges, improving Sentry performance metrics with app launch times/FPS/animation performance.
@philipphofmann, eventually, I think a mix between MXCrashDiagnostic & Sentry's crash reporting service could be helpful. Primarily for OOM crashes, as you mentioned in Jan 2021, but other exceptions too.
I mean, if Sentry deduplicates crashes anyways, what's the harm? :)
For example, in the case of a memory leak, Sentry could report high-level details by saying, "hey, we're running out of memory on X app version", while MetricKit takes ~24-48h to gather the details of exactly why. It could be a great combo!
Nonetheless, hang rates, CPU exceptions, and disk write exceptions should all be unique call stacks that we could look at first. Let me know if I can help!
Thanks for the update, @filip-doordash. I hope I have the bandwidth to look closely at this in one or two weeks.
At a minimum, our symbolication needs frame.instruction_addr and debug_image.debug_id, and depending on the addr_mode also debugImage.image_addr.
The MetricKit payload provides a stacktrace for MXDiagnostics in the form of a MXCallStackTree.
{
  "callStackTree" : {
    "callStackPerThread" : true,
    "callStacks" : [
      {
        "threadAttributed" : false,
        "callStackRootFrames" : [
          {
            "binaryUUID" : "70B89F27-1634-3580-A695-57CDB41D7743",
            "offsetIntoBinaryTextSegment" : 165304,
            "sampleCount" : 1,
            "binaryName" : "MetricKitTestApp",
            "address" : 7170766264,
            "subFrames" : [
              {
                "binaryUUID" : "77A62F2E-8212-30F3-84C1-E8497440ACF8",
                "offsetIntoBinaryTextSegment" : 6948,
                "sampleCount" : 1,
                "binaryName" : "libdyld.dylib",
                "address" : 7170808612
              }
            ]
          }
        ]
      },
      {
        "threadAttributed" : true,
        "callStackRootFrames" : [
          ...
binaryName and binaryUUID should map to our code_file and debug_id. I assume the address is absolute for the frame. Though offsetIntoBinaryTextSegment is an odd name. We want the load addr of the binary, because the text segment could, in theory, be at an arbitrary offset. So address should map to our frame.instruction_addr, and debugImage.image_addr should be address - offsetIntoBinaryTextSegment. We still need to validate that.
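To make the proposed mapping concrete, here is a minimal sketch that decodes the sample callStackTree JSON above and derives the fields symbolication needs. It assumes the still unvalidated rule image_addr = address - offsetIntoBinaryTextSegment. The types CallStackFrame, CallStack, CallStackTree, DiagnosticJSON and SymbolicationFrame, and the functions flatten and symbolicationFrames(fromJSON:), are hypothetical names for illustration and not part of MetricKit or the Sentry SDK.

```swift
import Foundation

// Shapes follow the sample callStackTree JSON; only the fields used here are declared.
struct CallStackFrame: Decodable {
    let binaryUUID: String
    let binaryName: String
    let address: UInt64
    let offsetIntoBinaryTextSegment: UInt64
    let subFrames: [CallStackFrame]?
}

struct CallStack: Decodable {
    let threadAttributed: Bool
    let callStackRootFrames: [CallStackFrame]
}

struct CallStackTree: Decodable {
    let callStacks: [CallStack]
}

struct DiagnosticJSON: Decodable {
    let callStackTree: CallStackTree
}

// Hypothetical container for the values symbolication needs.
struct SymbolicationFrame: Equatable {
    let instructionAddr: UInt64   // frame.instruction_addr
    let imageAddr: UInt64?        // debug image image_addr, nil if the subtraction underflows
    let debugID: String           // debug_id, taken from binaryUUID
    let codeFile: String          // code_file, taken from binaryName
}

// Maps a frame and, depth-first, all of its subFrames.
func flatten(_ frame: CallStackFrame) -> [SymbolicationFrame] {
    // Assumption from the discussion: image_addr = address - offsetIntoBinaryTextSegment.
    let (imageAddr, overflow) = frame.address
        .subtractingReportingOverflow(frame.offsetIntoBinaryTextSegment)
    let mapped = SymbolicationFrame(
        instructionAddr: frame.address,
        imageAddr: overflow ? nil : imageAddr,
        debugID: frame.binaryUUID,
        codeFile: frame.binaryName
    )
    return [mapped] + (frame.subFrames ?? []).flatMap(flatten)
}

// Returns one list of mapped frames per call stack in the payload.
func symbolicationFrames(fromJSON json: String) throws -> [[SymbolicationFrame]] {
    let decoded = try JSONDecoder().decode(DiagnosticJSON.self, from: Data(json.utf8))
    return decoded.callStackTree.callStacks.map { stack in
        stack.callStackRootFrames.flatMap(flatten)
    }
}

// The first call stack of the sample payload, with the elided second stack left out.
let sampleJSON = """
{
  "callStackTree": {
    "callStackPerThread": true,
    "callStacks": [
      {
        "threadAttributed": false,
        "callStackRootFrames": [
          {
            "binaryUUID": "70B89F27-1634-3580-A695-57CDB41D7743",
            "offsetIntoBinaryTextSegment": 165304,
            "sampleCount": 1,
            "binaryName": "MetricKitTestApp",
            "address": 7170766264,
            "subFrames": [
              {
                "binaryUUID": "77A62F2E-8212-30F3-84C1-E8497440ACF8",
                "offsetIntoBinaryTextSegment": 6948,
                "sampleCount": 1,
                "binaryName": "libdyld.dylib",
                "address": 7170808612
              }
            ]
          }
        ]
      }
    ]
  }
}
"""
```

With the sample values, the derived image addresses are 7170600960 and 7170801664. Both are multiples of 4096, which is at least consistent with them being page-aligned load addresses, but it does not replace validating the mapping against real symbolication.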
My current problem is that receiving a MXDiagnosticPayload via Xcode debug Simulate MetricKit Payloads didn't work for me. Xcode only sends MXMetricPayloads for me.
We could also look into the on-device symbolication of Meter if we can't make it work with the approach above.
Thanks, @Swatinem, for the help with this investigation.
Adding the proper context to the events the Cocoa SDK will generate will be a challenge, as the events will be from the past, and the current context could be outdated. The MXMetaData provides some context, but we might need some type of local cache for context data that changes.
As receiving an MXDiagnosticPayload via Xcode debug Simulate MetricKit Payloads didn't work, I opened a PR to collect MetricKit payloads with our iOS-Swift sample app with https://github.com/getsentry/sentry-cocoa/pull/2316.
Important note
Apple can’t provide reports if users don’t share the data and statistics with app developers. If a user reports a crash and you don’t have a corresponding crash report, ask the user to share the crash data with app developers. Crash and energy data are automatically sent if you distribute an app using TestFlight but not if you distribute an app through the App Store. Metrics data is only shared for apps distributed through the App Store.
With https://github.com/getsentry/sentry-cocoa/pull/2316 we started to collect payloads from MetricKit via TestFlight. So far we have only received crash data. So the statement below seems to be true.
Crash and energy data are automatically sent if you distribute an app using TestFlight but not if you distribute an app through the App Store. Metrics data is only shared for apps distributed through the App Store.
Adding the SentryContext and scope to these payloads is going to be a challenge, indeed. The timeStampBegin and timeStampEnd seem to be the only way to map our context to that payload. The only idea I have right now to make this work is to keep a versioned cache of the context. We would need to store all changes of the context to disk. Or, in the first iteration, we go ahead with only attaching app and device context, etc., which doesn't change frequently. Breadcrumbs and attachments won't work for MetricKit events, but we could make it work for user, tags, environment and dist.
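A rough sketch of what such a versioned context cache could look like, assuming each context change is recorded with a timestamp and later looked up by a date taken from the payload. ContextSnapshot and VersionedContextCache are hypothetical names, not SDK types. The sketch keeps snapshots in memory only; persisting them to disk, and deciding whether to look up by timeStampBegin or timeStampEnd, is left open.

```swift
import Foundation

// Hypothetical snapshot of the context values that can change over time.
// Codable so that a real implementation could persist it to disk.
struct ContextSnapshot: Codable, Equatable {
    let timestamp: Date
    let user: String?
    let tags: [String: String]
    let environment: String?
    let dist: String?
}

// Keeps snapshots sorted by timestamp and answers "what was the context at date X?".
struct VersionedContextCache {
    private(set) var snapshots: [ContextSnapshot] = []

    // Inserts the snapshot so the array stays sorted ascending by timestamp.
    mutating func record(_ snapshot: ContextSnapshot) {
        let index = snapshots.firstIndex { $0.timestamp > snapshot.timestamp }
            ?? snapshots.endIndex
        snapshots.insert(snapshot, at: index)
    }

    // Returns the latest snapshot recorded at or before the given date,
    // or nil if the cache has nothing that old.
    func snapshot(at date: Date) -> ContextSnapshot? {
        snapshots.last(where: { $0.timestamp <= date })
    }
}
```

A linear scan is enough for a sketch. If the context changes often, a binary search and some pruning of snapshots older than the longest expected MetricKit delay would keep the cache small.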
Let me know if I could help with anything!
@filip-doordash, do you maybe have some sample payloads without any details about Doordash you can share? I'm primarily interested in the meta data that it contains.
Sure! I copied your code from #2316 and included it in our next release. I should get data in the next couple of weeks (releases take a while!).
I'll post a redacted version of the data when it arrives.
Thank you very much, @filip-doordash 🥳 .
@philipphofmann I've attached 20 MXDiagnosticPayload samples. Let me know if that is sufficient or if you need more samples.
Thank you @filip-doordash 😀🙏. The 20 samples are sufficient for now.
Background
MetricKit gives insight into app diagnostics and power and performance metrics. It distinguishes between two types of data sources:
The data of MXMetricPayload doesn't fit into Sentry at the moment, as it contains aggregated data delivered once a day. The proper place to put this data would be transactions, but as we receive the data without exact time information once daily, we can't map it to single transactions.
The data of MXDiagnosticPayload is suitable for our product. The payload contains stacktraces, which we can convert to SentryEvents. To symbolicate these stacktraces for Sentry, you need access to some internal Sentry Cocoa SDK APIs. Furthermore, the payload data is delivered in a JSON format, so every customer must implement JSON decoding. Instead, in the first iteration, we are going to provide an integration for creating Sentry events for the following types of diagnostics:
We don't report MXCrashDiagnostic, because we can attach more accurate context data with our own crash handler.
Implementation
The docs of MetricKit state
It seems like MetricKit only delivers payload data once you download your app via TestFlight. Even using the Xcode debug mechanism didn't trigger any sample payloads for crashes, hangs, or CPU or disk write exceptions. Once you download the app via TestFlight, MetricKit delivers crash diagnostics on the next app launch after your app crashes, even while debugging. To trigger payloads for MXCPUExceptionDiagnostic and MXDiskWriteExceptionDiagnostic, we add code to the iOS-Swift sample app that causes a high CPU load and continuously writes to disk, with https://github.com/getsentry/sentry-cocoa/pull/2476.
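For illustration only, load-generating code of this kind could look roughly like the sketch below. It is not the code from the PR. burnCPU and writeRepeatedly are hypothetical helpers, both are bounded so they terminate, and how much load or how many bytes written it takes before MetricKit actually reports an exception is not something this sketch claims to know.

```swift
import Foundation

// Busy-loops until the deadline passes and returns the number of iterations.
func burnCPU(for duration: TimeInterval) -> UInt64 {
    let deadline = Date().addingTimeInterval(duration)
    var counter: UInt64 = 0
    while Date() < deadline {
        counter &+= 1  // wrapping add, so the counter can never trap
    }
    return counter
}

// Overwrites the file at `url` with a chunk of bytes `iterations` times
// and returns the total number of bytes written.
func writeRepeatedly(to url: URL, chunkSize: Int, iterations: Int) throws -> Int {
    let chunk = Data(repeating: 0xFF, count: chunkSize)
    var written = 0
    for _ in 0..<iterations {
        try chunk.write(to: url, options: .atomic)
        written += chunk.count
    }
    return written
}
```

In a sample app, both helpers would run on background queues for much longer durations, so the main thread stays responsive while the load builds up.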