launchdarkly / node-server-sdk

LaunchDarkly Server-side SDK for Node

Timeout is fixed at 300000ms #239

Open Kenny407 opened 2 years ago

Kenny407 commented 2 years ago

Hello 👋🏽
Describe the bug I specify a timeout for the client, and then I keep getting this error:

```
 warn: [LaunchDarkly] Received I/O error (Read timeout, received no data in 300000ms, assuming connection is dead) for streaming request - will retry
[api] info: [LaunchDarkly] Will retry stream connection in 1000 milliseconds
[workers] warn: [LaunchDarkly] Received I/O error (Read timeout, received no data in 300000ms, assuming connection is dead) for streaming request - will retry
[workers] info: [LaunchDarkly] Will retry stream connection in 1000 milliseconds
[api] TypeError: The "listener" argument must be of type function. Received an instance of Object
```

To reproduce I'm creating a client once in my app like this:

```typescript
export const ldClient = LaunchDarkly.init(`${LAUNCH_DARKLY_API_KEY}`, {
  offline: NODE_ENV === Env.TEST, // to avoid calling LD while performing tests
  timeout: 24 * 60 * 60, // 1 day
})
```

Yet I still face the error mentioned above. Is there any good practice for keeping a larger timeout? The disconnects make us restart our API in local dev.

Expected behavior To keep a longer timeout at least for our local dev environment.

SDK version ^6.2.2

Language version, developer tools Typescript: ^4.3.5

OS/platform MacOS 12.2.1 (Monterey)

Node Version Node 16.4.0 (LTS)

kinyoklion commented 2 years ago

Hello @Kenny407

The timeout being adjusted here does not affect the specific timeout you are encountering. It will affect the timeout for making individual requests, but the timeout for stream connections is 5 minutes. If the stream doesn't receive any data from LD for 5 minutes, then the connection is dropped and it will try to reconnect. (Under normal conditions the stream will always receive data. Even if there are no flag updates the service will send periodic messages to keep the connection alive.)
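
For concreteness, here is a minimal sketch of that distinction, assuming the SDK key is read from an environment variable (the variable name and the 10-second value are just placeholders):

```typescript
import * as LaunchDarkly from 'launchdarkly-node-server-sdk';

// The configurable `timeout` option (a value in seconds) applies to the SDK's
// individual connections/requests. It does not change the 5-minute (300000ms)
// read timeout on the streaming connection, which is fixed inside the SDK.
const client = LaunchDarkly.init(process.env.LAUNCH_DARKLY_API_KEY ?? '', {
  timeout: 10,
});

async function start() {
  // Resolves once the SDK has received its initial flag data.
  await client.waitForInitialization();
  console.log('LaunchDarkly client initialized');
}

start().catch((err) => console.error('LaunchDarkly initialization failed', err));
```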

What are the conditions which are triggering these disconnects? (Debugging for instance.) What is the version of node you are using?

What about it makes you restart your API in local dev? Does it not resume receiving events, or are you encountering some additional problems?

Thank you, Ryan

vvo commented 2 years ago

We are also experiencing this issue under these conditions:

We are periodically receiving these errors:

Any help appreciated

eli-darkly commented 2 years ago

@vvo This sounds a lot like a network issue, or at least something that is likely to have more to do with the details of your operating environment than with the Node SDK itself, and that is hard to diagnose here, where we can't ask you for any configuration/runtime information that you wouldn't want to post in a public forum. In general, you may have better luck contacting our support team at https://support.launchdarkly.com.

As maintainers of the Node SDK, Ryan and I can only provide some potentially relevant background information. Ryan already mentioned why the timeout that message is talking about is unrelated to the timeout you were setting. And even if this timeout were configurable, there would be no benefit in setting it to a higher number, because "there's been no data in 5 minutes" is really just another way of saying the connection is not working at all. So the real question is why the connection is failing sometimes (assuming that this is an intermittent issue and that it works OK at other times).

One possibility is that it really is a network issue. If so, that is in the realm of AWS and impossible for us to diagnose. You could try testing connectivity between that host and stream.launchdarkly.com using a plain HTTP client or TCP tools instead of the SDK, and see if there is anything unusual there.
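
For example, a rough connectivity check along these lines could help rule the network in or out. The /all path and the Authorization header format here are assumptions about how the server-side stream is reached, so treat this as a diagnostic sketch rather than a reference:

```typescript
import * as https from 'https';

// Open a streaming request directly and log whether any data arrives.
const req = https.request(
  {
    host: 'stream.launchdarkly.com',
    path: '/all',
    method: 'GET',
    headers: {
      Authorization: process.env.LAUNCH_DARKLY_API_KEY ?? '',
      Accept: 'text/event-stream',
    },
  },
  (res) => {
    console.log('status:', res.statusCode);
    res.on('data', (chunk) => console.log('received', chunk.length, 'bytes'));
    res.on('end', () => console.log('stream ended'));
  }
);

req.on('error', (err) => console.error('connection error:', err));
req.end();
```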

Another possibility is that there's a software/OS issue that is interfering with connections. I do see one unusual thing here: the message TypeError: The "listener" argument must be of type function. That's an error message I've seen in Node before in situations where the standard Node function https.request is being called with the wrong parameters. It could certainly be something else, since there are plenty of functions that have a parameter called "listener", but if it is that one, there are two ways I know of that this can happen:

  1. The code is written for the wrong version of Node. However, assuming your information about the AWS environment is right, we know that our SDK does work with Node 14. And I don't see how that could be the problem anyway if the error only sometimes happens; if there were a compatibility problem, I would expect it to be incompatible all the time.
  2. Some third-party code is intercepting https.request, for instance to add some kind of middleware behavior, and it is interfering with the parameters. We had to release a patch several versions back due to an issue like this, as discussed here; the problematic tool in that case was Mock Service Worker, but there are likely other products with similar issues (due to how easy it is in Node to monkey-patch standard functions), so I can't say for sure that our workaround solved this for all of them. However, as above, I'm not sure how such a thing could make it fail only sometimes.
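
A quick way to eyeball whether https.request has been wrapped is something like the following; this is purely a manual diagnostic (the heuristic is my own assumption, not a definitive test):

```typescript
import * as https from 'https';

// Print what https.request currently is. If an HTTP-interception or tracing
// library has wrapped it, the function name and source text often reveal the
// wrapper instead of Node's own implementation.
console.log('https.request name:', https.request.name);
console.log('https.request source (first 300 chars):');
console.log(https.request.toString().slice(0, 300));
```
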
eli-darkly commented 2 years ago

You don't by any chance use Google Cloud Trace, do you?

https://github.com/launchdarkly/node-server-sdk/issues/233#issuecomment-1005753886

Kenny407 commented 2 years ago

Hello @eli-darkly, @kinyoklion thanks for the response. No we don't use Google Cloud Trace.

I have started to suspect that it's also related to my laptop going into sleep mode, because it doesn't happen on our production or staging servers. From our logs we can see that there's no interaction with the LaunchDarkly SDK in timeframes larger than 300000ms.

By the way we are using Node 16.14.0

eli-darkly commented 2 years ago

> from our logs we can see that there's no interaction with the launch darkly sdk in timeframes larger than 300000ms

I don't think this part is relevant. Again, the 5-minute timeout referred to here is a socket read timeout on the long-lived streaming connection, which normally stays alive throughout the lifetime of the SDK. When your application interacts with the SDK to evaluate flags, that does not involve any activity on that connection; the connection is managed in the background to receive flag updates from LaunchDarkly if and when they occur.
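
As an illustration (the flag key and user below are hypothetical), a flag evaluation is served from the SDK's in-memory flag store and generates no traffic on that streaming connection:

```typescript
import * as LaunchDarkly from 'launchdarkly-node-server-sdk';

const client = LaunchDarkly.init(process.env.LAUNCH_DARKLY_API_KEY ?? '');

async function checkFlag() {
  await client.waitForInitialization();

  // variation() evaluates against the flag data already held in memory; it does
  // not send a request over the streaming connection.
  const user = { key: 'example-user' };
  const enabled = await client.variation('example-flag', user, false);
  console.log('example-flag is', enabled);
}

checkFlag().catch((err) => console.error(err));
```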

wilau2 commented 2 years ago

This seems to be logged a lot in a Lambda serverless environment. I will look into swallowing that warning.
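
One way to do that is to pass a custom logger that drops just this message; a sketch, assuming the SDK's `logger` option and `LDLogger` interface, with the matched text taken from the warning quoted above:

```typescript
import * as LaunchDarkly from 'launchdarkly-node-server-sdk';

// A logger that swallows only the stream read-timeout warning and passes
// everything else through unchanged.
const filteringLogger: LaunchDarkly.LDLogger = {
  debug: (msg) => console.debug(msg),
  info: (msg) => console.info(msg),
  warn: (msg) => {
    if (!String(msg).includes('Read timeout, received no data')) {
      console.warn(msg);
    }
  },
  error: (msg) => console.error(msg),
};

const client = LaunchDarkly.init(process.env.LAUNCH_DARKLY_API_KEY ?? '', {
  logger: filteringLogger,
});
```

Raising the SDK's overall minimum log level would also hide it, but at the cost of suppressing every other warning as well.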

kenny-alvizuris-cko commented 2 years ago

@wilau2 I found the behaviour happens consistently when I have my server running, put my Mac to sleep, and then unlock it again. In production I only found it while not receiving any requests, usually during low-traffic windows.

joehoppe commented 1 year ago

I'm seeing this on my lambdas as well, with "launchdarkly-node-server-sdk": "^6.4.3".