googlearchive / firebase-jobdispatcher-android

DEPRECATED please see the README.md below for details.
Apache License 2.0
1.79k stars 209 forks source link

Deadlock occurring when scheduling a job #206

Closed Amokrane closed 4 years ago

Amokrane commented 6 years ago

Hello,

Since we started using firebase-jobdispatcher we noticed a big increase in the number of ANRs in our application. The ANR trace that we got shows that a deadlock occurs when a job is about to get scheduled. Specifically, two threads are blocked competing for the same locks (runningJobs and serviceConnection) that they acquire in a different order.

Here is the thread dump that shows what happens:

"main" prio=5 tid=1 Blocked | group="main" sCount=1 dsCount=0 obj=0x74bc0710 self=0x7fc30a7fba00 | sysTid=11068 nice=0 cgrp=default sched=0/0 handle=0x7fc30e184200 | state=S schedstat=( 0 0 0 ) utm=56 stm=21 core=1 HZ=100 | stack=0x7fff94fac000-0x7fff94fae000 stackSize=8MB | held mutexes= at com.firebase.jobdispatcher.ExecutionDelegator.onJobFinishedMessage(ExecutionDelegator.java:140)

  • waiting to lock <0x02b2cdc9> (a android.support.v4.e.k) held by thread 56 at com.firebase.jobdispatcher.ExecutionDelegator.access$000(ExecutionDelegator.java:36) at com.firebase.jobdispatcher.ExecutionDelegator$1.jobFinished(ExecutionDelegator.java:73) at com.firebase.jobdispatchger.JobService$JobCallback.sendResult(JobService.java:296) at com.firebase.jobdispatcher.JobService$2.run(JobService.java:168)
  • locked <0x034b0cce> (a android.support.v4.e.k) at android.os.Handler.handleCallback(Handler.java:739) at android.os.Handler.dispatchMessage(Handler.java:95) at android.os.Looper.loop(Looper.java:148) at android.app.ActivityThread.main(ActivityThread.java:5417) at java.lang.reflect.Method.invoke!(Native method) at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:726) at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)

"pool-17-thread-1" prio=5 tid=56 Blocked | group="main" sCount=1 dsCount=0 obj=0x12ef56a0 self=0x7fc2f3877800 | sysTid=11175 nice=0 cgrp=default sched=0/0 handle=0x7fc2f14ba440 | state=S schedstat=( 0 0 0 ) utm=0 stm=0 core=1 HZ=100 | stack=0x7fc2f13b8000-0x7fc2f13ba000 stackSize=1037KB | held mutexes= at com.firebase.jobdispatcher.JobService.stop(JobService.java:183)

  • waiting to lock <0x034b0cce> (a android.support.v4.e.k) held by thread 1 at com.firebase.jobdispatcher.JobService$1.stop(JobService.java:114) at com.firebase.jobdispatcher.JobServiceConnection.stopJob(JobServiceConnection.java:123)
  • locked <0x0c2f255e> (a com.firebase.jobdispatcher.JobServiceConnection) at com.firebase.jobdispatcher.JobServiceConnection.onStop(JobServiceConnection.java:110)
  • locked <0x0c2f255e> (a com.firebase.jobdispatcher.JobServiceConnection) at com.firebase.jobdispatcher.ExecutionDelegator.stopJob(ExecutionDelegator.java:130)
  • locked <0x02b2cdc9> (a android.support.v4.e.k) at com.firebase.jobdispatcher.GooglePlayReceiver.onSchedule(GooglePlayReceiver.java:303)
  • locked <0x0bf54a19> (a android.support.v4.e.k) at com.firebase.jobdispatcher.GooglePlayDriver.schedule(GooglePlayDriver.java:87) at com.firebase.jobdispatcher.FirebaseJobDispatcher.schedule(FirebaseJobDispatcher.java:85) at com.firebase.jobdispatcher.FirebaseJobDispatcher.mustSchedule(FirebaseJobDispatcher.java:122) at com.comuto.services.AppFirebaseJobScheduler.schedule(AppFirebaseJobScheduler.kt:14) at com.comuto.pushnotifications.PushTokenSyncScheduler.schedule(PushTokenSyncScheduler.kt:16) at com.comuto.v3.service.InstanceIDListenerService.onTokenRefresh(InstanceIDListenerService.java:23) at com.google.firebase.iid.FirebaseInstanceIdService.handleIntent(unavailable:-1) at com.google.firebase.iid.zzc.run(unavailable:-1) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588) at java.lang.Thread.run(Thread.java:818)

And this is how we are defining our JobService:

public class SendPushTokenJobService extends JobService {

    @Inject FirebaseInstanceId firebaseInstanceId;
    @Inject UserRepository userRepository;
    @Inject AppboyConfigurationProvider appboyConfigurationProvider;

    @Override
    public boolean onStartJob(JobParameters job) {
        BlablacarApplication.getAppComponent().inject(this);
        sendPushToken();
        return false;
    }

    @Override
    public boolean onStopJob(JobParameters job) {
        return false;
    }

    @SuppressWarnings("CheckReturnValue")
    protected void sendPushToken() {
        Maybe.defer(() -> {
            String token = firebaseInstanceId.getToken();
            if (token == null) {
                return Maybe.empty();
            } else {
                return Maybe.just(token);
            }
        })
            .subscribeOn(Schedulers.io())
            .observeOn(Schedulers.io())
            .flatMap(token -> userRepository.addPushId(new Device(token))
                .firstElement()
                .map(result -> token))
            .subscribe(token -> {
                appboyConfigurationProvider.registerAppboyPushMessages(token);
            }, Timber::e);
    }
}

The job is created and scheduled this way:

val jobToScheduled = firebaseJobDispatcher.newJobBuilder().setService(SendPushTokenJobService::class.java).setTag(""send-push-token-tag"").setRetryStrategy(RetryStrategy.DEFAULT_EXPONENTIAL).build()

I looked to check whether this was a known issue or whether this same error was already seen in other projects but didn't find anything.

We are using the version 0.8.5.

Any thoughts?

Thank you.

Nook2007 commented 6 years ago

Version 0.8.2 - same issue. Related https://github.com/firebase/firebase-jobdispatcher-android/issues/190

tuckercr commented 6 years ago

I was able to resolve this by having my JobService extend SimpleJobService instead of JobService and then moving my code from onStartJob to onRunJob in which SimpleJobService runs in a single background thread.

tasneembohra commented 6 years ago

Not sure how SimpleJobService will resolve this as internally it dispatches job to worker thread using AsyncTask. I am also facing this issue, I am running this inside Async itself.

cooliceman commented 5 years ago

I was able to resolve this by having my JobService extend SimpleJobService instead of JobService and then moving my code from onStartJob to onRunJob in which SimpleJobService runs in a single background thread.

This is not a solution, SimpleJobService extends JobService, so there's no difference between use SimpleJobService and JobService.

mobsome commented 5 years ago

Guys, please review your code because it is very likely that you're causing this deadlock. I had exactly the same issue with mine code. Where is the issue? In calling JobDispatcher methods from JobService (main thread) and other thread at the same time. Have in mind that JobDispatcher internally calls some callbacks on main thread (like onStopJob). So, here is the stack:

Thread A (non-main):

  1. Calls FJobDispatcher::schedule
  2. Internal lock is acquired
  3. FB schedules some callback on JobService (i.e. startJob / stopJob).
  4. Wait for the JobService 'runningJob' lock (acquired by main thread)

at the same time: Thread B (JobService, main thread):

  1. Running on main thread
  2. JobService 'runningJob' lock is acquired
  3. Calls FJobDispatcher methods on main thread
  4. Waits for internal FB lock which is currently acquired by Thread A

and we have deadlock. Solution: Don't call FBJobDispatcher methods from JobService on main thread.

samtstern commented 4 years ago

In April 2019 we announced that Firebase Job Dispatcher would be deprecated today, April 7th 2020. For this reason we are going to close all open issues and archive the repository. We will also stop supporting FJD for new app installations (such as those targeting sdk versions greater than Android Q). Existing apps that rely on FJD should continue to function as usual.

While we will no longer work on FJD, we encourage you to migrate to Work Manager which is a superior tool and fully supported by the Android team.

Thank you for taking the time to try the library and provide this feedback. We sincerely appreciate your effort and we apologize for not addressing it in time.