Closed androidmitry closed 1 month ago
Hi @androidmitry ,
Thanks for the report, I'll look into this as soon as I can.
Can you give me any more information about a possible reproduction? Have you been able to reproduce locally at all? Is there anything strange about your setup that might be disconnecting the MethodChannel from our plugin? We tend to wrap every call we make to try to avoid crashes, so I'm very concerned that this is causing a crash.
Hi @fuzzybinary, unfortunately that's all the information I have so far. I wasn't able to reproduce it. We had some custom platform code, but it was removed. What's interesting is that the number of reports is decreasing. I will update the issue if the crash goes away.
@androidmitry Yeah if you can keep me posted I would appreciate it.
I've seen issues in the past where the method channel can get disconnected from the plugin, but I've fixed those, and most threw errors in the native layer, not Dart.
This issue is also occurring in version 2.1.0, and we have encountered the MissingPluginException from the Android channels datadog_sdk_flutter.rum and datadog_sdk_flutter.logs in the production release. Due to consecutive RUM events, the error count is excessively high. Below are some error messages we've received:
Hi @nirmal0707,
I'm actively investigating this, but I haven't had much luck reproducing it. Do you happen to have any steps to reproduce, or anything you can tell me about your app before / after you started seeing the errors?
Hi @fuzzybinary ,
This issue was not reproducible but began occurring when we migrated our codebase to Flutter 3.16.4, three months ago. Previously, we were using version 1.5.1, and the Flutter upgrade required us to move the package version to 2.1.0, resulting in this issue arising for some users in production.
Alright, thanks @nirmal0707, That may help me track down the issue.
Hi folks -- a few questions for everyone to see if I can try to diagnose this:
Is anyone using background tasks or foreground services, for example flutter_background_service?
Are you using push notifications or a push notification service like firebase_cloud_messaging? Do these errors tend to spike immediately after a push notification is sent out?
Is anyone using Flutter in an add to app scenario, or using attachToExisting in the SDK?
Does GeneratedPluginRegistrant.java enclose all the plugins in a try/catch block?
Do the MissingPluginException errors correlate with any other errors around the same time?
Sorry this is taking so long, but I am having a really hard time reproducing this, even when forcing certain error states. Comparing with Crashlytics, we perform the registration and de-registration of our method channels the same way they do, so I'm not sure how or why they'd catch the errors and we don't.
Another question as I continue to investigate -- Does anyone have any customizations of their FlutterActivity? Overriding onCreate, configureFlutterEngine, onDestroy or any other methods?
For us crash reports started coming when we upgraded flutter from 3.16.9 to 3.19.5
Is anyone using background tasks or foreground services
We have a foreground service but we don't use the flutter_background_service package. Also, according to the breadcrumb events attached to the crash, it usually happens in the foreground.
Are you using push notifications or a push notification service like firebase_cloud_messaging? Do these errors tend to spike immediately after a push notification is sent out?
Yes. No.
Is anyone using Flutter in an add to app scenario, or using attachToExisting in the SDK?
No
Does GeneratedPluginRegistrant.java enclose all the plugins in a try/catch block?
Yes
Do the MissingPluginException errors correlate with any other errors around the same time?
Checked several users and no other issues were reported around same time
Does anyone have any customizations of their FlutterActivity?
We do, I will double check them.
My error message is a bit different MissingPluginException(No implementation found for method reportLongTask on channel datadog_sdk_flutter.rum)
These are my Sentry logs:
Then a bunch of:
And then:
Maybe you are not handling the destroyed lifecycle correctly? Or another plugin is interfering?
Hi @feinstein, thanks for the additional information. All of the MissingPluginException issues are related, regardless of the method channel named and the method recorded, so any additional info is helpful.
The FlutterJNI error is interesting, that wouldn't be us so I'm very curious what might cause that, and curious if they're related.
We actually don't handle activity lifecycle at all, instead relying on Flutter's onAttachedToEngine and onDetachedFromEngine, which is what makes this error so frustrating, as those should be triggered properly when Flutter itself starts and stops.
Have you been able to reproduce locally at all?
AFAIK Flutter JNI is the Java interop for connecting the C++ Flutter engine to the Android app.
Maybe Flutter is not triggering the engine's life cycle correctly to your lib.
We were not able to reproduce it locally. We made some tiny changes to our FlutterActivity, I will report if it helped.
@fuzzybinary just noting that we are still experiencing the issue mentioned in https://github.com/DataDog/dd-sdk-flutter/issues/552 (which I believe is the same issue being tracked here) despite removing the native cruft I referred to in my last comment on that issue. IIRC I am able to reproduce this in our application fairly consistently. If I have a sec today I'll play around and see if I can reproduce. According to another engineer on my team we're seeing ~249k instances of this issue per week. We've had to filter these issues out of our crash reporting to avoid going beyond our contracted threshold.
STR would be ridiculously helpful. If I can reproduce I can likely get it fixed and out with the next version ASAP.
@fuzzybinary Just chiming in again on @nirmal0707's behalf. Looking at our Sentry error logs, we also see a large number of lifecycle events being reported in quick succession in the error events for this:
And the above screenshot is only about a quarter of the pause/resume breadcrumb events in that particular Sentry error event.
Not sure if that's relevant, but perhaps this rapid set of lifecycle events causes some sort of race condition in the Datadog plugin's setup code?
This looks weird, so many transitions in under 1 second.
What makes me exclude a Flutter error is that only the DD plugin is raising this exception... but on second thought, few packages would trigger a method channel call when the app is being destroyed.
Another question from research:
Is anyone suffering from this error still using runZonedGuarded over PlatformDispatcher.instance.onError? (If you are using Datadog.runApp we do not use runZonedGuarded.)
I'm looking for commonalities here, since I cannot reproduce with any example I have, but all of my examples use PlatformDispatcher.
We use runZonedGuarded, is it deprecated? We set PlatformDispatcher.instance.onError as well.
PlatformDispatcher.instance.onError is preferred and the two do essentially the same thing.
I'm going to do more research but I'm curious if the new zone creation is occasionally bypassed by backgrounding / foregrounding.
Tests on my side related to runZonedGuarded don't duplicate the issue unfortunately.
Next question -- is everyone experiencing this potentially using multiple Flutter engines or booting engines themselves for any reason? There is a potentially related Flutter issue if so. Doing a quick scan of the issue it's possible we might be able to fix this on the Datadog side, but knowing would help me focus efforts.
Thanks for your continued efforts on this @fuzzybinary!
For our app we are not using multiple Flutter engines and we do use runZonedGuarded, though it seems that's likely not the source of the issue from your last comment.
I am also using runZonedGuarded, I initialize Sentry, then DataDog. Here's how I initialize it:
Future<void> setupDatadog() async {
  final configuration = DatadogConfiguration(
    clientToken: 'mytoken1234',
    env: appFlavor ?? 'no-flavour',
    site: DatadogSite.us5,
    nativeCrashReportEnabled: true,
    loggingConfiguration: DatadogLoggingConfiguration(),
    rumConfiguration: DatadogRumConfiguration(
      applicationId: 'my-app-id-1234',
    ),
  );
  final originalOnError = FlutterError.onError;
  FlutterError.onError = (details) {
    DatadogSdk.instance.rum?.handleFlutterError(details);
    originalOnError?.call(details); // This allows me to not override other listeners, like Sentry.
  };
  final platformOriginalOnError = PlatformDispatcher.instance.onError;
  PlatformDispatcher.instance.onError = (e, st) {
    DatadogSdk.instance.rum?.addErrorInfo(
      e.toString(),
      RumErrorSource.source,
      stackTrace: st,
    );
    return platformOriginalOnError?.call(e, st) ?? false;
  };
  await DatadogSdk.instance.initialize(configuration, TrackingConsent.granted);
  DatadogSdk.instance.updateConfigurationInfo(LateConfigurationProperty.trackErrors, true);
}
That function is called inside a runZonedGuarded, after await SentryFlutter.init and WidgetsFlutterBinding.ensureInitialized().
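To illustrate why forwarding to the previously installed handler matters in the setup above, here is a minimal pure-Dart sketch. It does not use Flutter, Sentry, or Datadog: ErrorHook, installOverwriting, installChained, and reportersNotified are hypothetical names that stand in for a global hook such as FlutterError.onError and for two error reporters that register one after the other.

```dart
typedef ErrorHandler = void Function(Object error);

/// Hypothetical stand-in for a global hook such as FlutterError.onError.
class ErrorHook {
  ErrorHandler? onError;
}

/// Installs a handler that replaces whatever was registered before.
void installOverwriting(ErrorHook hook, String name, List<String> seen) {
  hook.onError = (error) => seen.add('$name:$error');
}

/// Installs a handler that records the error, then forwards it to the
/// previously installed handler (if any).
void installChained(ErrorHook hook, String name, List<String> seen) {
  final previous = hook.onError;
  hook.onError = (error) {
    seen.add('$name:$error');
    previous?.call(error);
  };
}

/// Registers a first reporter, then a second one, fires a single error,
/// and returns which reporters were notified, in order.
List<String> reportersNotified({required bool chainSecond}) {
  final hook = ErrorHook();
  final seen = <String>[];
  installOverwriting(hook, 'first', seen);
  if (chainSecond) {
    installChained(hook, 'second', seen);
  } else {
    installOverwriting(hook, 'second', seen);
  }
  hook.onError?.call('boom');
  return seen;
}
```

With chainSecond set to true both reporters see the error; with false only the reporter that registered last does, which is the situation the originalOnError?.call(details) line is meant to avoid.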
Hi folks - we still cannot reproduce this issue unfortunately. My guess is that this is some sort of race condition on the platform channel during backgrounding, where we are attempting to send view or log events while the app is backgrounding on Android.
However, I will say we do know that even though Sentry / Crashlytics report this as a "Fatal" error, it does not result in the application terminating, and is silent to the user. I verified this by essentially "force disconnecting" the method channel during testing and seeing what the response is from Flutter. This means that users are not seeing a degraded app experience because of this issue.
This doesn't mean we don't take the issue seriously, and if anyone can provide us with reproduction steps that would be incredibly helpful.
Maybe contact the flutter team and ask them what might be causing this?
I've gone through some of the less formal channels (Discord, for example), but I may raise a GitHub issue and see if it gets more attention.
None of the previous changes I made helped. We are planning a Flutter SDK upgrade. I will post here if it helps.
@fuzzybinary I've been unable to give this attention due to some other pressing work, but I wanted to respond to:
Next question -- is everyone experiencing this potentially using multiple Flutter engines or booting engines themselves for any reason? There is a potentially related Flutter issue (https://github.com/flutter/flutter/issues/103483) if so. Doing a quick scan of the issue it's possible we might be able to fix this on the Datadog side, but knowing would help me focus efforts.
A coworker of mine was toying around with this and was able to confirm that there's a case where a user taps a deep link and in doing so a new Flutter engine gets created (I think because of some code we have on the native side, I doubt that this is default Flutter behavior). As a result, our main function is called again, which calls the code that would initialize Datadog twice. My hunch (without really looking at the code on either side, I'm just leaving this comment between tasks) is that the move to a singleton on your end made this bug, which was already occurring, more obvious (because of all the errors we're seeing).
Obviously we have more triaging and likely some fixes to put in on our end, but I did want to (cautiously) confirm your hypothesis that the 2 engine thing may be one cause of the issue folks are seeing.
Thanks @btrautmann, that's really good information to have. I'm not sure if all of these issues are related to multiple Flutter engines, but I feel like it's possible there are situations I don't know about that could legitimately create a second Flutter engine.
I'll have to think about how we can support that situation, but knowing that, I can create a fake situation that artificially creates multiple engines and test that my solution works.
I'll try to get a solution for you in the next few weeks.
@fuzzybinary a couple interesting things to note here that I hope help:
Probable cause seems to be the singleton migration: both https://github.com/DataDog/dd-sdk-flutter/issues/596#issuecomment-2103935904 and my issue https://github.com/DataDog/dd-sdk-flutter/issues/552 mention version 2.1.0 as the first version containing this issue. IIRC that was when the singleton was introduced.
but I feel like its possible there are situations I don't know about that could legitimately create a second Flutter engine.
The default launchMode for Android Flutter applications is singleTop. The docs here do mention several scenarios in which a new instance of the MainActivity would be created. Ours would, I think, fall under the scenario of an Intent being created in a new task (that therefore does not contain an instance of our Activity), and I imagine there are a bunch of use cases where this could happen to other apps:
IIRC, each Flutter Engine by default is scoped to the MainActivity, so if you have 2 instances of that Activity you'd have 2 Flutter engines. Sans a singleton, this in theory should work fine, but a singleton within the same process (without a workaround to avoid double-initialization) would, I think, mean running into this issue.
As a disclaimer, this is all theoretical and it's been a while since I've really been in the weeds on any Android code so please take what I'm saying with a hefty grain of salt.
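The hypothesis above can be made concrete with a small pure-Dart model. This is only an illustration under stated assumptions, not the Datadog plugin's actual code: FakeEngine, MissingPluginError, SingletonPlugin, PerEnginePlugin, and firstEngineSurvives are all hypothetical names, and the "engine" is just a map from channel name to handler. It assumes a singleton that tracks a single channel connection and moves it to whichever engine attached most recently.

```dart
/// Hypothetical stand-in for Flutter's MissingPluginException.
class MissingPluginError implements Exception {
  MissingPluginError(this.channel, this.method);
  final String channel;
  final String method;

  @override
  String toString() =>
      'MissingPluginError(No implementation found for method $method '
      'on channel $channel)';
}

typedef Handler = String Function(String method);

const rumChannel = 'datadog_sdk_flutter.rum';

/// Toy model of one Flutter engine: a map from channel name to handler.
class FakeEngine {
  final Map<String, Handler> _handlers = {};

  void setHandler(String channel, Handler? handler) {
    if (handler == null) {
      _handlers.remove(channel);
    } else {
      _handlers[channel] = handler;
    }
  }

  /// Models a Dart-side method call routed through this engine.
  String invoke(String channel, String method) {
    final handler = _handlers[channel];
    if (handler == null) throw MissingPluginError(channel, method);
    return handler(method);
  }
}

/// Assumed singleton behavior: only ONE channel connection per process.
class SingletonPlugin {
  SingletonPlugin._();
  static final SingletonPlugin instance = SingletonPlugin._();

  FakeEngine? _current;

  void attach(FakeEngine engine) {
    // A newly attached engine takes over the single connection.
    _current?.setHandler(rumChannel, null);
    _current = engine;
    engine.setHandler(rumChannel, (method) => 'ok:$method');
  }
}

/// Per-engine behavior: every engine keeps its own connection.
class PerEnginePlugin {
  void attach(FakeEngine engine) {
    engine.setHandler(rumChannel, (method) => 'ok:$method');
  }
}

/// Returns true if the first engine can still reach the plugin after a
/// second engine (for example, one created by a deep link) has attached.
bool firstEngineSurvives(void Function(FakeEngine engine) attach) {
  final first = FakeEngine();
  final second = FakeEngine();
  attach(first);
  attach(second);
  try {
    first.invoke(rumChannel, 'addError');
    return true;
  } on MissingPluginError {
    return false;
  }
}
```

In this model, firstEngineSurvives(SingletonPlugin.instance.attach) returns false, because the first engine loses its handler as soon as the second attaches, while firstEngineSurvives((e) => PerEnginePlugin().attach(e)) returns true.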
Just to add a bit more information, my app has no custom native code from us, but we do have deep links, so you might be onto something there...
I do agree the singleton migration was likely the cause, though it solved other issues related to engine initialization / shutdown.
I have a potential fix in mind that I'm going to look into implementing this week, but I'll likely release it as a preview release to get some feedback before mainlining the changes. I'll post to this thread when the preview version is available.
The newest version (2.6.0) no longer uses singletons to manage channel connections. Can some folks upgrade and see if this solves the issue?
Thanks @fuzzybinary. I put up a PR today to bump to 2.6.0 and will report back based on our Sentry numbers :) It should be fairly obvious whether this resolved the issue.
@fuzzybinary I no longer see the issue a week after releasing an app with the new SDK. Thank you!
Excellent! Thanks so much for letting me know. I'm going to close this but if anyone sees this issue again please reach out.
Confirming that we're seeing the same, a decrease in instances of this issue. Thanks for the work on this @fuzzybinary!
Stack trace
Fatal Exception: io.flutter.plugins.firebase.crashlytics.FlutterError: MissingPluginException(No implementation found for method addError on channel datadog_sdk_flutter.rum)
    at MethodChannel._invokeMethod(platform_channel.dart:332)
    at ._willHandleError(helpers.dart:14)
Reproduction steps
Add the datadog_flutter_plugin package, release to App Store
Volume
0,0021 (1-2 users per day)
Affected SDK versions
2.4.0
Does the crash manifest in the latest SDK version?
Yes
Flutter Version
3.19.5
Setup Type
Flutter Application
Device Information
OS - Android
Per version: Android 12 - 88%, Android 10 - 7%, Android 14 - 3%, Android 13 - 2%
Per device: Samsung - 93%, OnePlus - 7%
Other relevant information
Device states: background 60%