launchdarkly / js-core

LaunchDarkly monorepo for JavaScript SDKs
Other
15 stars 19 forks source link

[LaunchDarkly] identify error: LaunchDarklyTimeoutError: identify timed out after 5 seconds on IOS #516

Open ryan-brs opened 4 months ago

ryan-brs commented 4 months ago

Describe the bug [LaunchDarkly] identify error: LaunchDarklyTimeoutError: identify timed out after 5 seconds

To reproduce Use this component as a Provider

import {
  LDProvider,
  ReactNativeLDClient,
  AutoEnvAttributes,
} from "@launchdarkly/react-native-client-sdk";
import { useEffect } from "react";

const LaunchDarklyProvider = ({ children }: PropsWithChildren): ReactNode => {

  const client = new ReactNativeLDClient(MOBILE_KEY, AutoEnvAttributes.Disabled);

  useEffect(() => {
    const setup = async (): Promise<void> => {
      if (rowKey && token) {
        await client
          .identify({
            kind: 'multi',
            practice: {
              kind: 'practice',
              key: rowKey,
            },
          })
          .then(() => console.log('identify finished'))
          .catch((e: unknown) => console.log('client error', e));
      }
    };

    setup();
  }, [rowKey, token]);

  return <LDProvider client={client}>{children}</LDProvider>;
};

export default LaunchDarklyProvider;

Expected behavior The error should not show and should be able to get the correct variation value.

Logs LOG error: [LaunchDarkly] identify error: LaunchDarklyTimeoutError: identify timed out after 5 seconds

Also enabled debug mode and get these logs debug: [LaunchDarkly] [EventSource] Will open new connection in 0 ms. LOG info: [LaunchDarkly] Closed LaunchDarkly stream connection

SDK version "@launchdarkly/react-native-client-sdk": "^10.2.1"

Language version, developer tools "react-native": "0.74.3" "expo": "51.0.18" "expo-router": "3.5.17"

OS/platform IOS simulator 17.5 Real device 17.5.1

Additional context Can see the context from the website context section

kinyoklion commented 4 months ago

Hello @ryan-brs,

The start and identify methods support a configurable timeout that you can provide. The goal of the timeout is to ensure applications do not wait indefinitely for initialization. If an applications waits indefinitely for initialization, then it can an take indefinite time for the user to be able to use an application.

You can add a timeout to the identify options.

, { timeout: 10 }

The SDK continues to attempt to initialize in the background regardless of the timeout. If your application doesn't block anything important waiting for the LD SDK, then you could set it to a large number to effectively never timeout.

It does tend to be the case that iOS seems to have more timeouts. We don't have any application specific code for iOS which would impact this.

The timeout also uses a custom type, so if you do not care about it timing out you could discard that type of exception.

I will look into the defaults to see if they need adjusted.

Thank you, Ryan

Filed internally as 250330

ryan-brs commented 4 months ago

Hello @ryan-brs,

The start and identify methods support a configurable timeout that you can provide. The goal of the timeout is to ensure applications do not wait indefinitely for initialization. If an applications waits indefinitely for initialization, then it can an take indefinite time for the user to be able to use an application.

You can add a timeout to the identify options.

, { timeout: 10 }

The SDK continues to attempt to initialize in the background regardless of the timeout. If your application doesn't block anything important waiting for the LD SDK, then you could set it to a large number to effectively never timeout.

It does tend to be the case that iOS seems to have more timeouts. We don't have any application specific code for iOS which would impact this.

The timeout also uses a custom type, so if you do not care about it timing out you could discard that type of exception.

I will look into the defaults to see if they need adjusted.

Thank you, Ryan

Filed internally as 250330

Hey Ryan,

Thanks for you reply, I have tried different timeout(even 300) but no lucky still get the same error

kinyoklion commented 4 months ago

If you have a user who opens your app, and there is not already a cached context, then it has to connect to LaunchDarkly to initialize. This presents many conditions. Lets say I open your app and I have no cell service, 5 minutes would still be within the time I may not have cell service.

So you will always get timeouts within a human timescale. I personally wouldn't consider it an error or log it as an error. It is like any other failed request, aside from the fact that we will automatically continue retrying for you.

There are several things that would be a problem:

ryan-brs commented 4 months ago

If you have a user who opens your app, and there is not already a cached context, then it has to connect to LaunchDarkly to initialize. This presents many conditions. Lets say I open your app and I have no cell service, 5 minutes would still be within the time I may not have cell service.

So you will always get timeouts within a human timescale. I personally wouldn't consider it an error or log it as an error. It is like any other failed request, aside from the fact that we will automatically continue retrying for you.

There are several things that would be a problem:

  • If the SDK doesn't eventually initialize, but internet is available
  • If you get the timeout after 5 seconds, even though you set the timeout to 5 minutes.
  • If the context is the same as a context that previously connected and initialized, then the result should already be cached. So it should not timeout.
  • It is saying that it timed out after 5 seconds, even though it took 5 minutes.

Pretty sure the connection is good as everything works well if I switch back to sdk 8.0.1. And context get sent to the contexts dashboard.

The timeout log looks fine as it is saying timeout error after the time I set.

I can also see the log "Identifying {kind: multi", rowKey: "rowkey"} so I'm assuming the initialize has been done? but this log is followed by two other logs LOG debug: [LaunchDarkly] [EventSource] Will open new connection in 0 ms. LOG info: [LaunchDarkly] Closed LaunchDarkly stream connection And timeout error will show after

kinyoklion commented 4 months ago

@ryan-brs

Thank you for the detail. We will take a look.

Thank you, Ryan

Filed internally as 250413

ayushmahajan12 commented 4 months ago

Hi Team

Can you please let us know what is the impact on user's related to the timeout issue. Over important release on hold due to the timeout failure.

Screenshot 2024-07-18 at 5 53 01 PM
ramakula commented 3 months ago

I'm sorry but this is a huge problem with the new JS SDK. We are switching back to native SDK. It is not possible to identify multiple times with the new SDK, it times out whereas the old native SDK never used to timeout. This error cannot be ignored because the real user identify fails and users are stuck with anonymous user flags.

Screenshot 2024-08-20 at 16 08 30

It is so easy to reproduce by sending different context data 4 to 5 times:

Screenshot 2024-08-20 at 10 33 54

Everything works perfectly with the old native SDK version 9.3.0. We will not migrate to the new SDK unless this is fixed.

kinyoklion commented 3 months ago

@ramakula Have you tried setting the timeout longer. The old SDK did not timeout because it was not a capability that it had. If you do not care about the timeout, then you can catch it. It doesn't affect the operation of the SDK.

This SDK does timeout, but all the timeout does is cause the function to reject the promise. The identify completes in the background, but this prevents the possibility of indefinitely waiting in the case that LaunchDarkly cannot be reach.

We generally recommend less than 15 seconds, but if you are experiencing long initialization times, then you can go longer than that.

I will again check that longer timeouts are working. We will also consider increasing the default.

Thank you, Ryan

kinyoklion commented 3 months ago

There is another consideration which is if you identify several contexts without waiting between, then the timeout applies to them individually. So if you identify 5 contexts and there isn't network access, then you get 5 timeouts.

I did verify the timeout does respect the time and the associated message will show the correct time.

ramakula commented 3 months ago

@kinyoklion I've tried up to 15 seconds. Timeout itself is not a big deal but we are seeing reports from the users that their features are not available and we found out that they are set to anonymous user. I'll try 30 seconds and get back to you with complete LD logs and why identify is set to anonymous etc. I'll also reach out to our main point of contact to setup a call so I can share more details with you guys. Appreciate all the tips.

krizpoon commented 2 months ago

In my case, there is a library in our project that intercepts traffic using the XMLHttpRequest object. Once the library is disabled, LD works again. That may explain why the native library works but not the RN library.

kinyoklion commented 2 months ago

Hello @krizpoon,

Is the library internal to your project, or part of a publicly available library? (Aside from flipper and expo issues which are already covered in the README.)

If it is a public library, then we would like to know which one, so we can add additional content to the readme, and potentially get the issue addressed.

Thank you, Ryan

krizpoon commented 2 months ago

Hey @kinyoklion, it's a public library: https://www.npmjs.com/package/msw

kinyoklion commented 2 months ago

@krizpoon Thank you. It doesn't look like msw can maintain the streaming capability of the original request. It looks like it is converting it to a fetch and then emulating the XmlHttpRequest behavior once it reads a full response. (I didn't exhaustively look through it, but this is my impression from a cursor look.)

I am not sure how feasible it would be to change it to use a readable stream from the fetch, which could support streaming. The alternative used in expo is to check if the request expects a streamed response and then to not intercept it.

This is generally the problem most of these interception libraries (including that in flipper and expo make), is that HTTP responses need to be read in completion. When real HTTP responses can be read indefinitely as a stream. So they connect to our streaming endpoint and attempt to read the "entire" body, but that will not complete (at least until there is an error or a timeout).

krizpoon commented 2 months ago

Thanks @kinyoklion. Another problem we are facing is that the network logging feature on BrowserStack is blocking the SSE traffic as well. Is there any known solution or work around please?

kinyoklion commented 2 months ago

If you can detect the environment you can change the initialConnectionMode to polling to disable streaming.

As tools that implement network interception, but implement it incorrectly, are not likely to be able to be easily worked around while retaining SSE. As they are modifying what is available in the environment, or modifying it at even a lower level (for instance in the Java implementation for android.)

I will see if I can make a minimal reproduction. The correct thing to do here, for BrowserStack and msw, is to submit issues that they don't support streaming HTTP requests.

Thank you, Ryan

krizpoon commented 2 months ago

Thanks @kinyoklion! Just realised that polling mode is now supported in the latest SDK release.