Race condition in use of Throttler

sadlerjw commented 3 years ago

Describe the bug I'm trying to track down what appears to be a race condition where my app never finishes bootstrapping and I think I've narrowed it down to a potential race condition in this SDK. From reading the code in this repo, it seems like LDClient's use of Throttler assumes that the contract is that the block passed to the throttler will always be executed. For example, setOnline calls internalSetOnline with a completion closure of group.leave - and internalSetOnline calls that completion closure as part of the block it asks the throttler to run. However it seems that the throttler will only ever execute the most recent block it's been asked to run. I think the root cause of my issue is that I'm calling LDClient.identify twice too close together (once based on locally saved information and once after we update user data from our server), and I rely on the completion block for the first one being called in order to allow the rest of my app to start. Since the documentation for identify doesn't specify that completion might never be called, I used it as if the contract was for it to always be called.

To reproduce Call LDClient.identify twice in a short time. The completion block to the first call is never called. Although I think it's reasonable to assume similar issues occur with other methods that use the throttler.

Expected behavior Whenever I call a method that has a completion block as one of its params, that block is always called eventually, unless explicitly documented otherwise.

SDK version 4.5, but I've read the code for 5.1 and the Throttler works in the same way.

Language version, developer tools Swift / Xcode 11.5 and Xcode 12 beta 6

OS/platform iOS 13.3

torchhound commented 3 years ago

Hi @sadlerjw thank you for reporting this. It is possible for a completion block to never be called if there is no internet connection.

sadlerjw commented 3 years ago

Thanks for the quick response! Can I suggest you change completion to onFlagsReceived or something? Or since that's a breaking API change, at least explicitly document it? It's pretty surprising (at least to me) that a completion block might never be called. Either way there is still a race (as far as I can tell) where two calls to identify will only call the completion block of the second call. My expectation would be that even if LD does some throttling under the hood, that both calls would have their completion blocks invoked when the operation(s) complete.

torchhound commented 3 years ago

Hi @sadlerjw thanks for the documentation suggestion. I think this is a good area for improvement, if you have further doc improvements please share them. As far as the 2 identify calls, is it possible to create a LDUser object with your locally saved information and pass that object to the SDK initialization and then call identify once the app starts? There is an identify call inside init but it is not throttled and thus shouldn't affect your second call.

sadlerjw commented 3 years ago

Yeah, I've gone and done a workaround, which wasn't too difficult, but what we were doing, in order to speed up app start, was:

Show a loading UI and bootstrap the app
Call identify with the user information we have stored locally.
Wait for this to finish before we replace the loading UI with the main app UI - we want to ensure all feature flags are correct for this user before the main UI launches
Launch the main app UI from this identify call's completion block
Make a call to our API to update data for our user (effectively a /users/me API call) so we can apply correct permissions and stuff within our app
Since we have updated user information, we call identify again.

So depending on how long step 5 takes (based on network conditions, latency on our server, etc) we see different behaviour. If it's short (most of the time), the main app fails to ever load, because the completion block from step 3 never gets called. If it's long, everything works just fine.

Our solution was to eliminate steps 3 and 4, and delay app start until after steps 5 and 6. This obviously isn't ideal, but it works...except, based on your comment that the completion block is never called if there's no network connection, we'll never successfully launch our app if the device is offline.

The solution you've suggested here is definitely better and eliminates this specific problem. Is the completion block on start guaranteed to be called, even if the device is offline?

All that said, it's my opinion (and recognizing you understand your product better than I do) that the completion block should always be called eventually, even if the user is offline, or even if a call to identify is being superseded by another call.

torchhound commented 3 years ago

Glad my solution was helpful. The completion block on start is not guaranteed to be called. However we have implemented a version of start that will time out if no flags are returned within the given time frame for any reason. Simply add startWaitSeconds: TimeInterval to your start call. I hope this timeout assuages your concern, please feel free to continue this thread or reach out our support team with any questions or concerns.

torchhound commented 3 years ago

Hi @sadlerjw, I hope this issue is resolved. If not please reach out to our support staff, I'm also happy to answer any further questions here.

launchdarkly / ios-client-sdk

Race condition in use of Throttler #223