tursodatabase / libsql-client-ts

TypeScript/JavaScript client API for libSQL
https://docs.turso.tech/sdk/ts/quickstart
MIT License

Socket hang up with Lambda with JS client #110

Open birtles opened 1 year ago

birtles commented 1 year ago

I have a Lambda@Edge function (origin request) that uses @libsql/client to query Turso.

It initializes the client in the global scope (i.e. not in the handler), as is common practice in Lambdas, so the client can be re-used across subsequent invocations.

e.g.

import { createClient } from "@libsql/client";

const dbClient = createClient({
  url: DB_URL,
  authToken: DB_AUTH_TOKEN,
});

export async function handler(
  event: CloudFrontRequestEvent
): Promise<CloudFrontRequestResult> {
  // Use `dbClient` here
}

Testing just now, my first request was processed correctly but on the second request I got:

2023-09-30T02:44:21.779Z    cdd6ee8d-b573-40e8-8871-224a0551d468    ERROR   Invoke Error    
{
    "errorType": "FetchError",
    "errorMessage": "request to https://jpdict-dev-birchill.turso.io/v2/pipeline failed, reason: socket hang up",
    "code": "ECONNRESET",
    "message": "request to https://jpdict-dev-birchill.turso.io/v2/pipeline failed, reason: socket hang up",
    "type": "system",
    "errno": "ECONNRESET",
    "stack": [
        "FetchError: request to https://jpdict-dev-birchill.turso.io/v2/pipeline failed, reason: socket hang up",
        "    at ClientRequest.<anonymous> (/var/task/index.js:7054:18)",
        "    at ClientRequest.emit (node:events:514:28)",
        "    at TLSSocket.socketCloseListener (node:_http_client:474:11)",
        "    at TLSSocket.emit (node:events:526:35)",
        "    at node:net:323:12",
        "    at TCP.done (node:_tls_wrap:588:7)"
    ]
}

Is there some lifetime to the connection I need to be aware of? Some way to force a persistent connection? Or should I just create a new client for each invocation?

For what it's worth, I am using version 0.3.5 but I never saw this error with version 0.3.4, so I suppose it might be related to using libsql instead of better-sqlite3.

birtles commented 1 year ago

Actually, even after moving the initialization of the client into the handler I still encounter this, so it doesn't appear to be related to long-lived client instances.
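
For reference, this is roughly what the handler-scoped version looks like (just a sketch, not my exact code; DB_URL / DB_AUTH_TOKEN are the same constants as above and the pass-through return is simplified):

import { createClient } from "@libsql/client";
import type {
  CloudFrontRequestEvent,
  CloudFrontRequestResult,
} from "aws-lambda";

export async function handler(
  event: CloudFrontRequestEvent
): Promise<CloudFrontRequestResult> {
  // Create a fresh client for this invocation rather than re-using a
  // module-scoped one.
  const dbClient = createClient({
    url: DB_URL,
    authToken: DB_AUTH_TOKEN,
  });
  try {
    // Use `dbClient` here, e.g.
    // const result = await dbClient.execute("SELECT 1");
    return event.Records[0].cf.request;
  } finally {
    // Release any underlying connections before the invocation ends.
    dbClient.close();
  }
}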

penberg commented 12 months ago

Hey @birtles, you are importing @libsql/client/web, right? The 0.3.5 version didn't change any of that code so that doesn't explain the difference in behaviour. Long-lived connections from serverless platforms are often chopped so usually people indeed open the connection from the handler.
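
To be explicit about the import, I mean something like this (a sketch, using the same URL/token setup as in your snippet):

// Pick the web entrypoint explicitly instead of letting Node resolve
// the package root to the node variant.
import { createClient } from "@libsql/client/web";

const dbClient = createClient({
  url: DB_URL,
  authToken: DB_AUTH_TOKEN,
});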

Let's keep this issue open though, maybe there's something on the client side we can do to improve this.

birtles commented 12 months ago

Hey @birtles, you are importing @libsql/client/web, right?

Hi! I'm just importing @libsql/client. Presumably it's resolving to @libsql/client/node as it's running on Node (Lambda).

The 0.3.5 version didn't change any of that code so that doesn't explain the difference in behaviour. Long-lived connections from serverless platforms are often chopped so usually people indeed open the connection from the handler.

Perhaps I just never tested 0.3.4 enough to come across this then. If the best practice is to open the connection from the handler (i.e. on each request) then I can live with that. If there's a way to persist it/re-establish it, however, all the better.

birtles commented 12 months ago

Hmm, I'm still getting occasional socket hang ups even when creating the client from the handler. I might try using a regular Lambda instead to see if it's particular to Lambda@Edge.

birtles commented 11 months ago

I rewrote the Lambda as a regular Lambda in both TS and Rust and I still see the hang-ups in the Node (TS) version but I've never seen them with the Rust one. This is even when the client is created inside the handler.

It looks like this is related to https://github.com/aws/aws-sdk-js-v3/issues/1196. One comment on that issue mentions:

Either way, ECONNRESET is a retry-able error and should be handled that way.

So I guess our TS client needs to detect these errors and retry, perhaps here?

At the moment I think the only way to work around this is for users to wrap every call to execute etc. in try ... catch and do the retrying themselves.
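
Something like this hypothetical wrapper is what I have in mind; withRetry isn't part of the client, it's just a sketch of the workaround:

// Hypothetical helper (not part of @libsql/client): retry a query a few
// times when the connection is reset under us.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (e: any) {
      lastError = e;
      // Only retry connection resets / socket hang-ups; re-throw anything else.
      if (e?.code !== "ECONNRESET" && e?.errno !== "ECONNRESET") {
        throw e;
      }
    }
  }
  throw lastError;
}

// Usage, inside the handler:
const result = await withRetry(() => dbClient.execute("SELECT 1"));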

(Updated to clarify that this is the case even when the client is created inside the handler.)

penberg commented 8 months ago

@birtles Sorry for joining the party late, but here's a first take on making the fetch() retry: https://github.com/libsql/hrana-client-ts/pull/9. I still need to convince myself that it's safe, but if anyone experiencing this issue can test the pull request, that would help a lot.

birtles commented 8 months ago

@birtles Sorry for joining the party late, but here's a first take on making the fetch() retry: libsql/hrana-client-ts#9. I still need to convince myself that it's safe, but if anyone experiencing this issue can test the pull request, that would help a lot.

Thanks! Like @mateusfreira pointed out, that doesn't look right to me since it's not waiting on the fetch.
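
That is, I'd expect the retry loop to await the fetch so that a rejected promise is actually caught, roughly like this (just a sketch, not the actual hrana-client code):

// Sketch only: the retry must await the fetch, otherwise an ECONNRESET
// rejection escapes the try/catch and still reaches the caller.
async function fetchWithRetry(
  input: RequestInfo,
  init?: RequestInit,
  attempts = 3
): Promise<Response> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fetch(input, init);
    } catch (e) {
      lastError = e;
    }
  }
  throw lastError;
}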

Unfortunately, I've switched to Rust for the Lambda I was encountering this on (largely because the TS client was so unreliable), so I'm not able to verify the fix, I'm afraid.