cloudflare / workers-sdk

⛅️ Home to Wrangler, the CLI for Cloudflare Workers®
https://developers.cloudflare.com/workers/
Apache License 2.0
2.41k stars 593 forks source link

🐛 BUG: Obscure hangs while using hyperdrive and postgres.js #6179

Open boehs opened 3 days ago

boehs commented 3 days ago

Which Cloudflare product(s) does this pertain to?

Pages, Other

What version(s) of the tool(s) are you using?

Wrangler 3.62.0, Postgres.js 3.4.4

What version of Node are you using?

No response

What operating system and version are you using?

MacOS Sonema

Describe the Bug

Hi, this is mostly a "journey" of all the sort of issues I've faced, they are all sort of related as far as I can deduce.

I am using the latest version of postgres.js, wrangler, nodejs_compat, and compatibility versions.

[!NOTE] I also tried pg in the hopes that I'd be able to get more reasonable errors to figure out what the actual catalyst was, but pg had its own obscure error: TypeError: this.stream.once is not a function. I don't want this to distract from the rest of what's below, but just wanted to put that out there

We start by buying a hetzner server, and spinning up postgres on it. We get TLS working, so that the hyperdrive configuration is accepted. It says its been created.

Lets give it a go! When I start it up, I see

✘ [ERROR] Uncaught TypeError: Cannot read properties of undefined (reading 'replace')

      at queryError
  (file:///Users/evan/Code/pinefore/.wrangler/tmp/pages-vfuZ5f/chunks/runtime.mjs:1:871780)

Relevant Postgres.js issues

This is annoying enough, but the worst part is that it just causes the worker to hang for 30 seconds, eating up compute:

image

It's hard to draw connections between the error and the hanging because they feel independent and sometimes I don't even get an error to go off, but they seem very related. This hanging and error both seems to be because of some sort of issue related to database connection, but this issue of course shouldn't exist because hyperdrive accepted the connection and I am quite confident the connectionString is being pulled and used correctly.

I switch over to local development. Same issue with hanging using the following:

[[hyperdrive]]
binding = "HYPERDRIVE"
id = "5470ba839d3c47d7a9de80ebf5ee2bab"                              # the ID associated with the Hyperdrive you just created
localConnectionString = "postgresql://pf:pf@localhost:5432/pinefore"

The kysely developer mentions downgrading the library here could at least fix the hang which is caused by, apparently, invalid configuration, and I try that. Not only did it fix the hang, I see the following in the console for the first time! A helpful error!

image

But wait... Why does the role postgres need to exist? Look at that local connection string? Still, we can make this role in development... and finally everything works! It needed the postgres role for some reason. But that doesn't help for production. Again, my connection string does not involve the postgres role. So to sum

  1. All SQL queries are causing my worker to hang for 30 seconds without any output
  2. It would appear these hangs are caused by a connection failure, but there are no helpful logs
  3. The only time I've gotten a helpful log, it seems like it was about something that shouldn't have mattered

I feel so blind!! Please help me!

The pages project is eae1c1315f0d309e7327336526f065ed. The hyperdrive is 5470ba839d3c47d7a9de80ebf5ee2bab. The DB config looks something like

export const db = (connectionString?: string) =>
  new Kysely<Database>({
    dialect: new PostgresJSDialect({
      connectionString,
      postgres,
    })
  });

Thank you so much!

Please provide a link to a minimal reproduction

No response

Please provide any relevant error logs

No response

boehs commented 2 days ago

We've found the resolution, an IP was used instead of a hostname, causing connection failures and causing postgres.js to hang. The takeaways are as follows:

  1. This ought to be documented
  2. At least miniflare doesn't support special characters as a password, it needs to be A-z0-9, maybe a couple others, despite the fact postgres supports it
  3. I don't know how much control you have over postgres.js but this hang on a failed connection is really really really annoying
  4. It would be helpful if hyperdrive logs could be exposed somehow to get better visibility into the cause of these failures