Closed RamblinWreck77 closed 3 years ago
hi, our one and only @RamblinWreck77 , great to see you again! As always, we've been super grateful for your contribution to this repo! Thank you so much for such a deep analysis.
archive()
in the 3 app lifecycle callbacks in 3.0.0.beta so we can safely remove it to make things a little bit easier like we don't have to worry about applicationWillTerminate
anymore. applicationDidEnterBackground
and applicationWillEnterForeground
, I think it's legit to put any long running task into a background thread rather than the main thread. I agree it's challenging to 100% guarantee the job to be finished as it's non-deterministic. However It's a trade-off we have to make, as a 3rd party library, we are not meant to block the main thread for any reason, even for applicationDidEnterBackground
, see this issue 91. taskId
and there is room to improve beginBackgroundTask
and endBackgroundTask
.
I will address those issues together with #479. In the meantime, you are more than welcome to put a PR or raise any issues.@zihejia Sounds good! I'll close this then :) Have a good one!
Hey team, I've been chasing a "Task had still not ended" crash on-background for a while I believe I have finally tracked down the cause.
In MixpanelInstance.swift lines 428 -> 490 we have 3 app lifecycle-related delegate callbacks that either crash or cause other issues:
1) applicationDidEnterBackground
beginBackgroundTask needs to be called as soon as we detect a didEnterBackground. Per it's documentation:
I'd recommend calling it first and moving everything else in this function below it
when applicationDidEnterBackground is called on main, and the following code paths are async:
trackingQueue is a utility QoS serial queue. The app is shutting down/going to background, and when main hits this function we're basically queuing up some work on a low-priority background thread that will (almost certainly) never execute before we're terminated. We're also starting a background task synchronously on main and killing it async on that same low-priority thread. I've addressed similar issues in my code base by adding an "immediate" flag on functions (defaulting to false) that allows sync-dispatch under the hood. It's a bit more work but it's nice to have the option.
Either way, I'd recommend
A) Remove line 451's async dispatch to end the task entirely, since the termination condition on another queue (while main is running flush/archive?) is non-deterministic
and
B) move the endBackgroundTask to inside beginBackgroundTask's expiration handler. iOS should always call this expiration handler only once. This way on every background we start a task, queue up some work, have anywhere from 20-120 seconds to let it run, and when iOS calls back to let us know it's time to die we synchronously (very important!) mark the task as ended within that block's context to prevent iOS from deciding our entire process must be killed. I've also read iOS does some shady behind-the-scenes logic on your background behavior, and getting killed for not ending tasks properly will penalize your app. iOS will then grant you less time (or none at all!) for future tasks.
2) applicationWillEnterForeground
here we have an async dispatch to trackingQueue to end the background task. A problem I see is taskId is an unprotected var being changed on main in some functions and utility queue (not-main) in others. This could introduce subtle race/corruption issues if main is changing taskId when/if trackingQueue also touches self.taskId to make changes. In my experience multi-thread r/w on unprotected vars in swift creates a "hot potato" that later crashes with a nonsense stack trace in unrelated code. Best to completely avoid!
My recommendation here is to treat taskId as a main-thread-only var, and only access it synchronously inside these delegate callbacks (which will always be on main afaik!). Once we can force Swift 5.5, the new @MainActor actor isolation feature could be useful here!
3) applicationWillTerminate
This is, IMO, the largest problem. When iOS calls this function you're about to get killed, and when the function's context exits that's it.
EVERYTHING in this function needs to be synchronously called, otherwise you can be sure it will not reliably execute. I'd be willing to be that async dispatch to trackingQueue for archive() has a near-100% failure rate. Could this cause data/event losses on terminate? While we're at it, we should also synchronously end any background tasks (if they exist) in here.
If you're interested, I could put together a PR against 3.0.0.beta with these changes to discuss further. I just wanted to raise these issues and get some feedback before starting to see if there's anything I've missed!