firebase / firebase-admin-java

Firebase Admin Java SDK
https://firebase.google.com/docs/admin/setup
Apache License 2.0
545 stars 269 forks source link

FirebaseMessaging sendEachAsync(), sendEachForMulticastAsync() can cause thread deadlock #981

Closed jhkim-grip closed 2 months ago

jhkim-grip commented 2 months ago

[REQUIRED] Step 2: Describe your environment

[REQUIRED] Step 3: Describe the problem

Steps to reproduce:

We use custom thread pool due to OOM problem of default thread manager that sdk used.

private ExecutorService createThreadPoolExecutor() {
        final int corePoolSize = 50; 
        final int maximumPoolSize = 50; 
        final long keepAliveTime = 60L;
        final TimeUnit unit = TimeUnit.SECONDS;
        final BlockingQueue<Runnable> workQueue = new LinkedBlockingQueue<>(500);

        return new ThreadPoolExecutor(
                corePoolSize,
                maximumPoolSize,
                keepAliveTime,
                unit,
                workQueue,
                new ThreadPoolExecutor.CallerRunsPolicy()
        );
    }

FirebaseOptions.Builder().setCredentials(GoogleCredentials.fromStream(serviceAccount))
                    .setThreadManager(fcmThreadManager)
                    .build();

but in FirebaseMessaging sendEachOpAsync(), sendEachForMulticastAsync() both parent method itself use thread pool by callAsync() and inside there is child method sendOpForSendResponse() that actually call the fcm send api also use the same thread pool by callAsync()

스크린샷 2024-08-23 오후 2 25 18 스크린샷 2024-08-23 오후 2 25 46

the problem is thread pool can be full of parent method and then it will be forever wait status cause there is no actual message sending thread.

it did not reproduce heavily we have 15 pods and only 1 pod have problem. we get thread dump from that pod and we get all 50 thread wait in same line where parent thread that wait all the child thread to end.

at java.util.concurrent.locks.LockSupport.park(java.base@11.0.16/Unknown Source)
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:502)
    at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:83)
        at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:62)
    at com.google.firebase.messaging.FirebaseMessaging$2.execute(FirebaseMessaging.java:231)

Relevant Code:

i think thread code of above will be enough. we just have thread pool setting code and FirebaseMessaging call code.

google-oss-bot commented 2 months ago

I couldn't figure out how to label this issue, so I've labeled it for a human to triage. Hang tight.

jonathanedey commented 2 months ago

Thank you @jhkim-grip for spotting this!