firebase / quickstart-unity

Firebase Quickstart Samples for Unity
https://firebase.google.com/games
Apache License 2.0
819 stars 424 forks source link

JniResultCallback.onCompletion Deadlocked, ANR In Firebase Unity SDK 8.9.0 #1286

Closed fdspbsteam closed 1 year ago

fdspbsteam commented 2 years ago

[READ] For Firebase Unity SDK issues, please report to Firebase Unity open-source

Once you've read this section and determined that your issue is appropriate for this repository, please delete this section.

[REQUIRED] Please fill in the following fields:

Unity editor version: 2020.3.22f1 Firebase Unity SDK version: 8.9.0 Source you installed the SDK: .unitypackage Problematic Firebase Component: Cloud Messaging Other Firebase Components in use: Auth, Database, Remote Config Additional SDKs you are using: Facebook, IronSource, Adjust Platform you are using the Unity editor on: Mac Platform you are targeting: Android and iOS Scripting Runtime: IL2CPP

[REQUIRED] Please describe the issue here:

2022-06-14_13-42-09 image image

image image

Steps to reproduce:

Possible steps:

  1. run the app from background, maybe push

    Relevant Code:

// TODO(you): code here to reproduce the problem
paulinon commented 2 years ago

Hi @fdspbsteam,

Version 9.1.0 of the SDK has been released recently. Could you try it out and see if it resolves the issue? If it doesn't please provide the complete steps you've performed in order to replicate this behavior so that we can identify what's causing this.

fdspbsteam commented 2 years ago

@paulinon sorry, am I right, your suggestion is to update and check it in production?

paulinon commented 2 years ago

That's right, @fdspbsteam. It's possible that the issue is resolved when an updated SDK is used.

I look forward to your results.

fdspbsteam commented 2 years ago

@paulinon honestly, we don't test potential fixes on our audience. Can you confirm that you have fixed this issue in the release?

paulinon commented 2 years ago

Hi @fdspbsteam,

It's advisable that you use the latest SDK version so that you'll have the latest bug fixes and updates, and it's possible that the fix for this issue is included. Otherwise, we'll be needing more information (such as a minimal, reproducible example) to identify what's causing this.

fdspbsteam commented 2 years ago

@paulinon thanks for clarification, the stack trace in the summary of issue added, I suppose it's enough for a fix.

paulinon commented 2 years ago

The stack trace you provided might be helpful for this investigation, @fdspbsteam. For now, let's see if the issue persists in the latest SDK version.

Looking forward to hearing from you soon.

fdspbsteam commented 2 years ago

@paulinon Thanks for clarification, I would like to wait until Firebase team fix this issue.

chkuang-g commented 2 years ago

@fdspbsteam

Thank you for the info. However the information you provided is not enough for us to pinpoint the issue.

TLDR explanation is that JniResultCallback has been used by ANY C++/Unity API which wraps around Android API which returns Task, which can be...almost any async APIs. Therefore, that can be an issue from any Firebase product (Cloud Messaging, Auth, Database, Remote Config) in your project.

I would need a bit of your help to pinpoint the issue, at least to understand which API is causing the issue.

  1. Are you able to reproduce the issue locally? According to this back-and-forth, I assume the answer is No? If you happen to be able to reproduce the issue, could you see if you can find any information from the log?
  2. If local reproduction is not available, let's analyze the code. Could you provide code snippets int your project which is using Firebase APIs? Since the deadlock occurs in onCompletion, you can narrow down to something like OnComplete() or ContinueWith() or ContinueWithOnMainThread(), and see if you are using Firebase APIs within those blocks.

Thank you, Shawn

chkuang-g commented 2 years ago

Ah, and, since you are using 8.9.0, it is probably NOT related to https://github.com/firebase/quickstart-unity/issues/1151

google-oss-bot commented 2 years ago

Hey @fdspbsteam. We need more information to resolve this issue but there hasn't been an update in 5 weekdays. I'm marking the issue as stale and if there are no new updates in the next 5 days I will close it automatically.

If you have more information that will help us get to the bottom of this, just add a comment!

fdspbsteam commented 2 years ago

Hi,

Are you able to reproduce the issue locally? sorry, no

AlmostMatt commented 2 years ago

As chkuang-g mentioned above, could you provide code snippets of relevant Firebase API usage? Since the deadlock occurs in onCompletion, you can narrow down to methods like OnComplete() or ContinueWith() or ContinueWithOnMainThread(), and see if you are using Firebase APIs within those blocks.

fdspbsteam commented 2 years ago

added more info here image image

paulinon commented 2 years ago

Hi @fdspbsteam,

While these error logs provide some context related to your issue, it would be great if you provide code snippets of relevant Firebase API usage so that this investigation can move forward.

fdspbsteam commented 2 years ago

@paulinon sure, FirebaseAnalytics.SetAnalyticsCollectionEnabled(true); void InitializeFirebase() { DebugLog("Enabling data collection."); FirebaseAnalytics.SetAnalyticsCollectionEnabled(true); FirebaseAnalytics.SetUserId("xxxx"); FirebaseAnalytics.SetSessionTimeoutDuration(new TimeSpan(0, 30, 0)); firebaseInitialized = true; }

paulinon commented 2 years ago

Thanks for the input, @fdspbsteam. Since JniResultCallback can come from any Firebase product, could you also provide the code pertaining to Messaging, Realtime Database, Remote Config?

fdspbsteam commented 2 years ago

@paulinon we are only using Messaging part,

 public static event EventHandler<MessageReceivedEventArgs> MessageReceived
    FirebaseMessaging.MessageReceived += OnNotificationMessageReceived;
paulinon commented 2 years ago

Hi @fdspbsteam,

Could you provide the OnNotificationMessageReceived function in your code? We would better understand your implementation if you provide entire functions that use Firebase.

Also, does it mean that the only Firebase product you have in your implementation is Messaging?

fdspbsteam commented 2 years ago
private void OnNotificationMessageReceived(object sender, MessageReceivedEventArgs args)
    {
        FMLogger.vCore("Push received");

        if (args != null && args.Message != null)
        {
            string pushId = string.IsNullOrEmpty(args.Message.MessageId) ? "null" : args.Message.MessageId;
            AppState.PushNotificationId = pushId;

            var data = args.Message.Data;
            if (data != null)
            {
                foreach (var kv in data)
                {
                    if (kv.Key == APP_MESSAGE_KEY)
                    {
                        string queue = kv.Value;
                        var msg = UrlSchemeParser.TryParse(queue, AppMessageSource.PushNotification);
                        if (msg != null)
                        {
                            NotificationAppMessageReceived?.Invoke(msg, pushId);
                            return;
                        }
                    }
                }
            }
        }
        else
        {
            Debug.LogError("empty");
        }
    }
**Also, does it mean that the only Firebase product you have in your implementation is Messaging?**
yep
paulinon commented 2 years ago

Thanks for the additional input, @fdspbsteam. Could you also provide the function in your implementation that handles Firebase initialization?

fdspbsteam commented 2 years ago

@paulinon sure, already here https://github.com/firebase/quickstart-unity/issues/1286#issuecomment-1171550951

paulinon commented 1 year ago

Thanks for the additional information, @fdspbsteam. I've relayed this to the team so that they could take a look at this issue. With that, I'll be marking this as a bug for now.

abogarsukov-braingames commented 1 year ago

Are there any news? We still observer this issue in v9.4.0.

Heartkage commented 1 year ago

This issue remains in newest Unity-Firebase version 10.5.0

fdspbsteam commented 1 year ago

@paulinon can we stop the game "update to a new version" and just give the developers time to fix the issue ? )

Heartkage commented 1 year ago

hi @paulinon, just some information the team might want to know.

Screenshot 2023-03-01 at 12 03 45 PM

We are not able to reproduce the BUG by ourself, but according to Log, we found that our users seem to encounter this BUG few seconds after launching the game. All of them were not able to pass this line of code.

Screenshot 2023-03-01 at 12 15 22 PM

That was the only firebase call after we launch the game

Note 1. This is not call in Awake() or Start(). Note 2. 57% of this ANR happened in background. (User might be switching app to background while Initializing?)

gclaret commented 1 year ago

Any update on this? we have this issue as well on production and is causing almost all of our ANRs

Heartkage commented 1 year ago

@gclaret hey, there is a work-around for us, maybe you wanna try. You can set Firebase.FirebaseApp.LogLevel to something other than default before check dependency.

*Note: It only reduces amount, but does not fix the problem

image
chkuang-g commented 1 year ago

Hi folks,

We had a hard time reproduce this. The issue was not surfaced through our CI. And it is pretty hard to tell where this deadlock occurs in the C++/Unity SDK.

I think it is worth exploring this a bit.

  1. When you register for the notification, ex. FirebaseMessaging.MessageReceived += OnNotificationMessageReceived;, where did you do so? Is it before or during CheckAndFixDependencyAsync.ContinueWith()? It would be great if you can share the code snippet around this.
  2. What are you doing during the callback? Ex. During private void OnNotificationMessageReceived(object sender, MessageReceivedEventArgs args) {}. Are you using any other Firebase API during this callback? Does it help if you defer the use of those APIs for the next main thread update?

This would really help us to narrow down the cause of this ANR.

Thank you, Shawn

raviSM99 commented 1 year ago

I am also receiving ANR in firebase version 10.6.0 Unity

main (blocked):tid=1 systid=16620 | waiting to lock <0x08305084> (com.google.firebase.app.internal.cpp.JniResultCallback) held by thread 71
       at com.google.firebase.app.internal.cpp.JniResultCallback.onCompletion(JniResultCallback.java:149)
       at com.google.firebase.app.internal.cpp.JniResultCallback$TaskCallback.onSuccess(JniResultCallback.java:67)
       at com.google.android.gms.tasks.zzm.run(com.google.android.gms:play-services-tasks@@18.0.2:1)
       at android.os.Handler.handleCallback(Handler.java:938)
       at android.os.Handler.dispatchMessage(Handler.java:99)
       at android.os.Looper.loopOnce(Looper.java:233)
       at android.os.Looper.loop(Looper.java:344)
       at android.app.ActivityThread.main(ActivityThread.java:8212)
       at java.lang.reflect.Method.invoke(Native method)
       at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:584)
       at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1034)
Thread-128 (blocked):tid=71 systid=17464 | waiting to lock <0x0f32b06d> (java.lang.Object) held by thread 1
       at com.google.firebase.app.internal.cpp.JniResultCallback$TaskCallback.disconnect(JniResultCallback.java:102)
       at com.google.firebase.app.internal.cpp.JniResultCallback.onCompletion(JniResultCallback.java:152)
       at com.google.firebase.app.internal.cpp.JniResultCallback.cancel(JniResultCallback.java:143)

Please suggest on how to handle this scenario , I am using Firebase analytics , Crashlytics and Remote notification feature.

FirebaseApp.CheckAndFixDependenciesAsync ().ContinueWithOnMainThread (task =>
        {

            var dependencyStatus = task.Result;
            if (dependencyStatus == Firebase.DependencyStatus.Available)
            {

                m_App = Firebase.FirebaseApp.DefaultInstance;
                OnFirebaseAppInitialized?.Invoke ();
            }

        });

I have tried ContinueWith as well instead of ContinueWithOnMainThread still the issue persists. as per firebase analytics all of these ANR are occurring in foreground.

google-oss-bot commented 1 year ago

Hey @fdspbsteam. We need more information to resolve this issue but there hasn't been an update in 5 weekdays. I'm marking the issue as stale and if there are no new updates in the next 5 days I will close it automatically.

If you have more information that will help us get to the bottom of this, just add a comment!

fdspbsteam commented 1 year ago

all data is presented above.

AlmostMatt commented 1 year ago

Based on the stack traces it is apparent that the main thread is evaluating TaskCallback.onSuccess() and the second thread is evaluating JniResultCallback.cancel(). The end result is that both threads try to obtain the same pair of locks but in the opposite order.

It is unclear what is causing cancel() to occur at the same time as onSuccess(). cancel() is generally called when an app is being closed and the FirebaseApp is being destroyed. Many of the code samples provided mention initialization code such as CheckAndFixDependenciesAsync. Perhaps there is some case where a failed initialization results in cancel being called early, or perhaps there is some edge case if an app is manually closed while initialization is still in progress.

However, regardless of what is causing onSuccess and cancel to be called at the same time, I think the deadlock can be avoided if the JniResultCallback code is changed to use a single shared lock instead of a pair of locks. I am marking this as a bug to change the usage of locks to avoid deadlocking.

CSystem commented 1 year ago

This issue remains in newest Unity-Firebase version 10.7.0

timbo-tj commented 1 year ago

We're seeing a lot of these errors and it is pushing our ANR rate above the bad behaviour threshold..!

Is there any update on this? Our packages are as follows - all up to date as far as I can tell

    "com.google.android.appbundle": "file:com.google.android.appbundle-1.9.0.tgz",
    "com.google.external-dependency-manager": "file:com.google.external-dependency-manager-1.2.175.tgz",
    "com.google.firebase.analytics": "file:com.google.firebase.analytics-10.7.0.tgz",
    "com.google.firebase.app": "file:com.google.firebase.app-10.7.0.tgz",
    "com.google.firebase.messaging": "file:com.google.firebase.messaging-10.7.0.tgz",
    "com.google.play.common": "file:com.google.play.common-1.8.2.tgz",
    "com.google.play.core": "file:com.google.play.core-1.8.2.tgz",
    "com.google.play.review": "file:com.google.play.review-1.8.1.tgz"

The callstack is

  at com.google.firebase.app.internal.cpp.JniResultCallback.onCompletion (JniResultCallback.java:149)
  at com.google.firebase.app.internal.cpp.JniResultCallback$TaskCallback.onSuccess (JniResultCallback.java:67)
  at com.google.android.gms.tasks.zzm.run (com.google.android.gms:play-services-tasks@@18.0.2:1)
  at android.os.Handler.handleCallback (Handler.java:942)
  at android.os.Handler.dispatchMessage (Handler.java:99)
  at android.os.Looper.loopOnce (Looper.java:226)
  at android.os.Looper.loop (Looper.java:313)
  at android.app.ActivityThread.main (ActivityThread.java:8757)
  at java.lang.reflect.Method.invoke (Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:571)
  at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:1067)
vg-swift commented 1 year ago

@gclaret hey, there is a work-around for us, maybe you wanna try. You can set Firebase.FirebaseApp.LogLevel to something other than default before check dependency.

*Note: It only reduces amount, but does not fix the problem image

In our case setting log level to Debug decreased ANRs, even though it doesn't make much sense FirebaseApp.LogLevel = LogLevel.Debug;

timbo-tj commented 1 year ago

Will try and report back in a few days once the data appears on the dashboard

xeonsrs commented 1 year ago

The following might probably help Firebase team to identify the problem. We use Firebase SDK 10.0.1.. we've started seeing this ANR once we updated to Unity 2020.3.40. Firebase SDK version remains unchanged in our case.

danilavanilla commented 1 year ago

Version 11.0.0 has released. Somebody updated, it can help to decrease amount of ANRs?

xeonsrs commented 1 year ago

Version 11.0.0 has released. Somebody updated, it can help to decrease amount of ANRs?

Normally it does not work in that way. Firebase should mention this in their release notes If they fixed something related, It does not make sense to update the SDK based on just an assumption.

vg-swift commented 1 year ago

@chkuang-g @paulinon what exactly are you trying to reproduce? Would love to give you all the information you need.

This is an ANR, which means that the main thread was blocked for longer than 5s, which means that Firebase initialization just takes a while, doesn't mean that it failed to initialize. image

Device list image image image image image

From what I see it happens to devices on lower end, which makes sense, bcs they have a worse CPU, so Firebase initialization takes longer (doesn't explain S22 though).

The issue seems to happen inside: FirebaseApp.CheckAndFixDependenciesAsync() In our case we don't use ContinueWith() or ContinueWithOnMainThread(), but are just awaiting it. Based on the logs the deadlock happens before the FirebaseApp.CheckAndFixDependenciesAsync() completes, and this is the very first thing we call in our app.

I'm not sure if there's anything specific for you to fix, but maybe you guys could at least look into optimizing the initialization of Firebase? It seems like this would improve the issue for all of us.

ehaight commented 1 year ago

We just upgraded to FIrebase 10.7.0, and our ANR rate doubled, pushing us into bad behavior. This ANR is new to the build that aligns with Firebase upgrade.

Has anyone found a workaround here? Previously, we were on version 9.4.0 and not experiencing this ANR.

"main" tid=1 Blocked at com.google.firebase.app.internal.cpp.JniResultCallback.onCompletion (JniResultCallback.java:149) at com.google.firebase.app.internal.cpp.JniResultCallback$TaskCallback.onSuccess (JniResultCallback.java:67) at com.google.android.gms.tasks.zzm.run (com.google.android.gms:play-services-tasks@@18.0.2:1) at android.os.Handler.handleCallback (Handler.java:938) at android.os.Handler.dispatchMessage (Handler.java:99) at android.os.Looper.loop (Looper.java:263) at android.app.ActivityThread.main (ActivityThread.java:8296) at java.lang.reflect.Method.invoke (Native method) at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:612) at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:1006)

"Thread-29" tid=61 Blocked at com.google.firebase.app.internal.cpp.JniResultCallback$TaskCallback.disconnect (JniResultCallback.java:102) at com.google.firebase.app.internal.cpp.JniResultCallback.onCompletion (JniResultCallback.java:152) at com.google.firebase.app.internal.cpp.JniResultCallback.cancel (JniResultCallback.java:143)

timbo-tj commented 1 year ago

We are also in bad behaviour territory through no fault of our own and this is rather frustrating.

The log level hack did not improve our ARN rate whatsoever, unfortunately..

Were not sure what to do besides remove firebase which will mean we lose analytics and cloud messaging for an entire platform so that's really not an option.

timbo-tj commented 1 year ago

It appears as though updating our Unity version from 2021.3.10f1 to 2021.3.24f1 has worsened the ANR rate.

yassirmetacore commented 1 year ago

Although we have a simple Firebase setup we are also facing the same issue in our game, the only 2 classes that are referencing Firebase are as follow :

using Firebase.Crashlytics;
      using UnityEngine;

  public class FirebaseInitializer : MonoBehaviour
  {
      public static bool IsReady;
      private void Awake()
      {
          Firebase.FirebaseApp.CheckAndFixDependenciesAsync().ContinueWith(task => {
              var dependencyStatus = task.Result;
              if (dependencyStatus == Firebase.DependencyStatus.Available) {
                  // Create and hold a reference to your FirebaseApp,
                  // where app is a Firebase.FirebaseApp property of your application class.
                  var app = Firebase.FirebaseApp.DefaultInstance;
                  Crashlytics.ReportUncaughtExceptionsAsFatal = true;
                  Crashlytics.SetUserId(MetaHub.Plr.PlayerId);  
                  // Set a flag here to indicate whether Firebase is ready to use by your app.
                  IsReady = true;
              } else {
                  UnityEngine.Debug.LogError(System.String.Format(
                      "Could not resolve all Firebase dependencies: {0}", dependencyStatus));
                  // Firebase Unity SDK is not safe to use here.
              }
          });
      }
  }

And this :

using System.Diagnostics;

namespace com.company.product {

    public static class PushNotificationsV2 {

        public static void OnTokenReceived(object sender, Firebase.Messaging.TokenReceivedEventArgs token) {
            Log.Debug($"Received token: {token.Token}");
            //some code
        }

        public static void RegisterPushes() {
            Firebase.Messaging.FirebaseMessaging.TokenReceived += OnTokenReceived;
        }

    }

}

Our Firebase version is 10.6 and Unity 2021.3.12f

a-maurice commented 1 year ago

Hi all,

I believe we've identified the root cause of the problem, and it will be addressed with the next release.

We weren't able to reproduce this, due to the nature of how inconsistent the issue seems to be, but thanks to the callstacks that you all provided, this issue will hopefully be fixed with this fix.

DFC005 commented 1 year ago

Any idea when the next version will be released?

a-maurice commented 1 year ago

@DFC005 I'm aiming for later this week, some of our dependencies had an update that delayed it a bit.

a-maurice commented 1 year ago

Alright, the release is out now, 11.1.0, https://github.com/firebase/firebase-unity-sdk/releases/tag/v11.1.0

I'm going to close this issue, to indicate that this should hopefully be fixed, at least this specific Messaging one. If there are still problems, feel free to either respond in this issue, or file a new one.