launchdarkly / ios-client-sdk

LaunchDarkly Client-side SDK for iOS (Swift and Obj-C)
https://docs.launchdarkly.com/sdk/client-side/ios

Upgrading to 2.14.2 (and 3.0.3) makes start hang #178

Closed pcoltau closed 5 years ago

pcoltau commented 5 years ago

Describe the bug: After upgrading this library from 2.14.1 to 2.14.2 (and also up to 3.0.3), my app is hanging in the call to LDClient.start().

To reproduce: The following code reproduces the issue:

        let userBuilder = LDUserBuilder()
        userBuilder.key = userId
        userBuilder.customString("apiVersion", value: apiVersion)
        userBuilder.customString("appBrandname", value: appBrandname)

        let config = LDConfig(mobileKey: apiKey)
        config.debugEnabled = enableDebug
        LDClient.sharedInstance().start(config, with: userBuilder)

Expected behavior: That the call to start will not block and will return successfully, as it did with version 2.14.1 of this library.

Logs: No logs

SDK version: 2.14.1

Language version, developer tools: Xcode 10.2.1, Swift 4.2

OS/platform: iOS 12.2

Additional context: None

markpokornycos commented 5 years ago

@pcoltau Thank you for submitting this issue. I have attempted to recreate the hang with the setup you provided, using one of our sample apps, hello-ios-swift. I installed your setup code into that app's AppDelegate.setupLDClient(). I used the following values for the items in your setup code:

        let userId = "test@email.com"
        let apiVersion = "apiVersionValue"
        let appBrandname = "appBrandnameValue"
        let apiKey = "<a mobile key to an environment we set up>"
        let enableDebug = true

I believe these values represent the types you have in your setup code. Please verify that this is correct. NOTE: I'm not asking you to provide the actual values, just to confirm that the types line up.

I ran hello-ios-swift pointed to sdk versions 3.0.3 and 2.14.2. In both cases the app didn't hang, but showed the value of the feature flag, and changed when the flag's value changed.

I realized I'm a bit unclear what precisely you mean by "my app is hanging". Is there an execution loop that is causing the app to become unresponsive? Is the app waiting for something from the sdk?

An obvious thing to verify is that you have set a delegate on the primary environment, and that the delegate implements the ClientDelegate methods. Then run the app and let us know which delegate methods (if any) are called after startup.

Assuming the delegate is set up and enableDebug is true, can you send us the log entries from the sdk? That may give me some more information about what the sdk is doing.

When the app hangs, can you break and see where the sdk is in its execution? Does it show signs of an execution loop that you can describe to us? (You will likely have to do this several times to determine that...)

We can also move the discussion to support@launchdarkly.com if you don’t want to share details publicly.

pcoltau commented 5 years ago

Thank you for your feedback!

enableDebug is set to false.

I've paused the app while hanging and identified that it is hanging in the following place:

[Image: Screenshot 2019-06-11 at 09 59 33]

LDEventSource.m line 272 (version 2.14.2 of LaunchDarkly).

This indicates that the problem is the fact that I run LDClient.sharedInstance().start() on a concurrent background queue (com.apple.root.background-qos (concurrent)).

I then wrapped the call to .start() in an async block dispatched to the main thread, and that seems to fix the issue.
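For readers who want to try the same workaround, it might look roughly like the sketch below. The helper name `runOnMain` is hypothetical and not part of the SDK; the commented-out usage reuses `config` and `userBuilder` from the snippet in the original report. This is only one way to get the call onto the main thread, not necessarily the exact code used here.

```swift
import Foundation

/// Hypothetical helper (not part of the LaunchDarkly SDK): runs `work` on the
/// main thread. If the caller is already on the main thread, `work` runs
/// inline; otherwise it is dispatched asynchronously to the main queue.
func runOnMain(_ work: @escaping () -> Void) {
    if Thread.isMainThread {
        work()
    } else {
        DispatchQueue.main.async { work() }
    }
}

// Usage, assuming `config` and `userBuilder` are built as in the original report:
//
// runOnMain {
//     LDClient.sharedInstance().start(config, with: userBuilder)
// }
```

Running inline when already on the main thread avoids an extra hop through the queue; callers on a background queue return immediately and the start call happens later on the main thread.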

markpokornycos commented 5 years ago

@pcoltau I spent some time trying to duplicate this, but without success. What I think is that the call to CFRunLoopRun() is unnecessary. It's code I didn't write, and so finding out why it was put there has been a challenge. If you are willing to try this experiment, it would help us out since I am not able to duplicate the hang.

  1. Remove the change you made to dispatch start() to the main thread so that you get the hang again.
  2. Open LDEventSource.m. Comment out the three lines that test the thread and call CFRunLoopRun().
  3. Clean, build, & run. Verify the sdk opens a streaming connection and the sdk does not hang. Verify the sdk responds to feature flag changes.

I think the call to CFRunLoopRun() is leftover from a time when the iOS-eventsource was creating its own background threads that needed to have this call. However, it uses NSURLSession today, and so there is no need for this call. In my tests, I was able to kick off a background thread that calls start() on the sdk with different wait times from 0s to 10s. The sdk always started without hanging and made a streaming connection. I verified the app responded to feature flag changes in each case.
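A test along the lines described above could be sketched as follows. The helper name `startOnBackgroundQueue` and the choice of the global background-QoS queue are assumptions made for illustration; the actual test harness is not shown in this thread.

```swift
import Foundation

/// Hypothetical test helper (not SDK code): invokes `start` on a global
/// background-QoS queue after `delay` seconds.
func startOnBackgroundQueue(after delay: TimeInterval,
                            _ start: @escaping () -> Void) {
    DispatchQueue.global(qos: .background)
        .asyncAfter(deadline: .now() + delay) { start() }
}

// Usage, assuming `config` and `userBuilder` are built as in the original
// report. Each run would use a single delay between 0s and 10s:
//
// startOnBackgroundQueue(after: 5) {
//     LDClient.sharedInstance().start(config, with: userBuilder)
//     print("start() returned")  // not reached if start() blocks
// }
```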

If you can run this experiment and let us know the results, then we can proceed with confidence that the call is not required and remove it in another ios-eventsource version.

pcoltau commented 5 years ago

@markpokornycos

Thank you for your time investigating this issue.

I have done some further testing, and I have noticed the following when running on a background queue (including the call to CFRunLoopRun()):

1) There seems to be a wait time, as you have also noticed. Sometimes it is 0s and other times it seems to run forever (e.g. 30+ seconds).
2) If I have any code after the call to start() (located on the same background queue), that code is never called. It seems like the "current thread" is halted after the call to CFRunLoopRun() is made.

Furthermore, I've done as you suggested and removed the call to CFRunLoopRun(), and that does seem to fix the issue! 👍 This holds both when running on a background queue and when running on the main queue.
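The behaviour observed above matches how CFRunLoopRun() is documented to work: it runs the calling thread's run loop until the loop is stopped or has no sources or timers left, so anything written after it on that thread waits. The standalone sketch below illustrates this on Apple platforms. It is not the SDK's code; every name in it is hypothetical, and the repeating timer is only there to keep the demo loop alive.

```swift
import Foundation

// Hypothetical demo of CFRunLoopRun() blocking a non-main thread.
let loopIsRunning = DispatchSemaphore(value: 0)
let workerFinished = DispatchSemaphore(value: 0)
let lock = NSLock()
var workerLoop: CFRunLoop?
var reachedCodeAfterRun = false

let worker = Thread {
    let loop = CFRunLoopGetCurrent()
    lock.lock(); workerLoop = loop; lock.unlock()

    // A repeating timer keeps the run loop alive. With no sources or timers
    // attached, CFRunLoopRun() would return immediately.
    let timer = CFRunLoopTimerCreateWithHandler(
        kCFAllocatorDefault, CFAbsoluteTimeGetCurrent() + 0.05, 60, 0, 0
    ) { _ in loopIsRunning.signal() }
    CFRunLoopAddTimer(loop, timer, CFRunLoopMode.defaultMode)

    CFRunLoopRun()  // Blocks this thread until the loop is stopped.

    lock.lock(); reachedCodeAfterRun = true; lock.unlock()
    workerFinished.signal()
}
worker.start()

// Wait until the timer has fired once, i.e. the loop is definitely running.
loopIsRunning.wait()
Thread.sleep(forTimeInterval: 0.2)
lock.lock(); let reachedWhileRunning = reachedCodeAfterRun; lock.unlock()

// Stop the worker's run loop from this thread, then wait for the worker.
lock.lock(); let loopToStop = workerLoop; lock.unlock()
CFRunLoopStop(loopToStop)
workerFinished.wait()
lock.lock(); let reachedAfterStop = reachedCodeAfterRun; lock.unlock()

print("reached code after CFRunLoopRun() while loop was running:", reachedWhileRunning)
print("reached code after CFRunLoopRun() once loop was stopped:", reachedAfterStop)
```

The first flag should stay false while the loop is running and the second should become true only after CFRunLoopStop is called. The fact that CFRunLoopRun() returns immediately when nothing is attached to the loop may also relate to the varying wait times reported above, though that is a guess rather than something confirmed in this thread.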

torchhound commented 5 years ago

Version 3.0.4 has been released with a new version of eventsource that fixes this bug.