supabase-community / pg-gateway

Postgres wire protocol for the server-side
MIT License
170 stars 5 forks source link

Using pg-gateway, pglite & prisma #4

Open BracketJohn opened 3 months ago

BracketJohn commented 3 months ago

Improve documentation

I followed the pglite guide to get a local development-setup using pg-gateway, pglite and prisma running. I could not get it to work following the docs. I then tried to use the example/pglite of this repository to create a minimal reproduction. The minimal reproduction also did not work (see below).

Describe the problem

Minimum reproduction using the example/pglite fromm this repository.

First, clone the repo and install the required dependencies:

# clone this repo
git clone https://github.com/supabase-community/pg-gateway

# go into example dir
cd examples/pglite

# install dependencies
pnpm i 

# install prisma
pnpm install @prisma/client prisma

# create the prisma schema file
make prisma
touch prisma/schema.prisma

Add the following content into the schema.prisma:

datasource db {
  provider = "postgresql"
  url      = "postgres://postgres@localhost:5432"
}

model User {
  id    Int     @id @default(autoincrement())
}

Start the database:

pnpm dev

Attempt to push the schema in a parallel terminal:

> pnpm prisma db push

Prisma schema loaded from prisma/schema.prisma
Datasource "db": PostgreSQL database "postgres", schema "public" at "localhost:5432"

Error: Schema engine error:
unexpected message from server

Running it a second time still yields an error, albeit a different one:

❯ pnpm prisma db push
Prisma schema loaded from prisma/schema.prisma
Datasource "db": PostgreSQL database "postgres", schema "public" at "localhost:5432"

Error: Schema engine error:
ERROR: prepared statement "s0" already exists

Running these commands with DEBUG="*" set reveals that:

Describe the improvement

Add a prisma-subsection that specifies any extra steps to get prisma to work.

Additional context

Exact package versions this is running on:

❯ pnpm list
Legend: production dependency, optional only, dev only

pglite-example /Users/n/.tmp/pg-gateway/examples/pglite

dependencies:
@prisma/client 5.18.0
pg-gateway 0.3.0-alpha.3
prisma 5.18.0

devDependencies:
@gregnr/pglite 0.2.0-dev.8
@types/node 20.14.15
tsx 4.17.0
typescript 5.5.4

As I'm unsure whether this is related to this package at all, desired behavior or a bug I opened this as a documentation issue - I hope that's alright! Thanks for this great package, it looks super promising and I'd love to use it to create a local postgres-development setup without having to spin up a full docker container every time!

gregnr commented 3 months ago

Hey @BracketJohn thanks for the detailed issue. This is good to know - I haven't tried connecting prisma to pg-gateway yet so don't know exactly what might be going wrong. Regardless I am certainly open to adding this as another example. Give me a bit to debug this one and I'll get back to you πŸ˜ƒ

BracketJohn commented 3 months ago

Hey @gregnr πŸ‘‹

Big thanks for the reply & looking into it - glad, that the issue is helpful!

Looking forward to anything you may find,

have a good one! (:

BracketJohn commented 2 months ago

Hey @gregnr - quick q: Is there anything I can do to support resolving this / is there any insights you've already gathered? E.g., if it seems likely to you that it is unrelated to pg-gateway, I'd try to look into the other dependencies of the setup (:

Thanks and have a good one!

gregnr commented 2 months ago

Hey @BracketJohn, sorry for the slowness on this. We're in the middle of a pretty major refactor which I was hoping to get done before moving forward with more examples/tests, but it's taking longer than I expected.

I just took a stab at Prisma + pg-gateway + PGlite and am also receiving the same error as you:

unexpected message from server

Right now I'm trying to determine if this is a pg-gateway issue or a PGlite issue. In case you're interested in the details - I'm logging the protocol messages back and forth between Prisma and PGlite and noticing that the trip-up seems to happen during extended query protocol messages. For some reason PGlite responds with nothing (empty Uint8Array) after Prisma sends a Sync message, whereas IIUC it should be a ReadyForQuery message instead.

I'm going to continue digging and will keep you updated as I learn more. This is almost certainly a bigger issue than Prisma so will be important to get to the bottom of it.

gregnr commented 2 months ago

Just an update - I've narrowed the problem down to how PGlite handles the extended query protocol. A similar issue exists on PGlite that tracks this: https://github.com/electric-sql/pglite/issues/223

Once that is solved Prisma (and all other PG clients using extended queries) should JustWorkβ„’ πŸ˜ƒ

BracketJohn commented 2 months ago

Thanks for the updates & in-depth info - very helpful and interesting πŸ™ Will follow the fix of https://github.com/electric-sql/pglite/issues/223 closely then!

BracketJohn commented 2 months ago

So the pglite folks have fixed https://github.com/electric-sql/pglite/issues/223 🎊 There also was a release 0.2.6 after the fix was merged, but: the release notes do not contain / reference the fixing commit https://github.com/electric-sql/pglite/commit/1522c40d06def5c9f0719a6ef0f4e23e48077a1a.

Maybe just ambiguity, definitely worth for me to try out whether it works now! Will report back here, once I get to that.

BracketJohn commented 2 months ago

@gregnr I just tried out pg-gateway@0.3.0-beta.1, when installing it I get:

❯ pnpm i -D pg-gateway@0.3.0-beta.1
 ERR_PNPM_FETCH_404  GET https://registry.npmjs.org/@jsr%2Fstd__bytes: Not Found - 404

This error happened while installing the dependencies of pg-gateway@0.3.0-beta.1

@jsr/std__bytes is not in the npm registry, or you have no permission to fetch it.

I guess this is related to: https://github.com/supabase-community/pg-gateway/blob/4809cfc869144a95b8cc0ddcadd13a292916036f/packages/pg-gateway/package.json#L43-L47

Just sharing it in case you did not notice. I know that by using an unofficial beta-version this may be intended / unproblematic / fixed on a more mature release.

gregnr commented 2 months ago

@BracketJohn you're too fast πŸ˜† this bug was caused by those deps living under JSR and losing the @jsr:registry=https://npm.jsr.io scope after packaging/publishing. This is fixed now in 0.3.0-beta.2 πŸ‘

gregnr commented 2 months ago

With regard to PGlite - 0.2.6 does indeed fix the extended query issues, but we noticed one more bug where error messages do not follow up with a ReadyForQuery (causing clients to hang after errors). This has been fixed and hopefully published soon in another release.

BracketJohn commented 2 months ago

Good to hear! I saw that they just released 0.2.7 which includes the fix for the latest bug πŸ›

BracketJohn commented 2 months ago

I further test everything out with pg-gateway@0.3.0-beta.2, pglite@0.2.7 and prisma@5.16.2.

I learned the following:

[1] diff for readme example to work

- const server = net.createServer((socket) => {
+ const server = net.createServer(async (socket) => {

[2] error thrown on pglite + pg-gateway side

> vite-node prisma/pglite.ts

Server listening on port 5432
Error: write EPIPE
    at afterWriteDispatched (node:internal/stream_base_commons:161:15)
    at writeGeneric (node:internal/stream_base_commons:152:3)
    at Socket._writeGeneric (node:net:952:11)
    at Socket._write (node:net:964:8)
    at writeOrBuffer (node:internal/streams/writable:570:12)
    at _write (node:internal/streams/writable:499:10)
    at Socket.Writable.write (node:internal/streams/writable:508:10)
    at Object.write (node:internal/webstreams/adapters:215:63)
    at invokePromiseCallback (node:internal/webstreams/util:180:10)
    at node:internal/webstreams/util:185:23 {
  errno: -32,
  code: 'EPIPE',
  syscall: 'write'
}
Client disconnected

[3] prepared statement already exists error

[nitro] [unhandledRejection] PrismaClientUnknownRequestError: 
Invalid `prisma.positionTemplate.findFirst()` invocation:

Error occurred during query execution:
ConnectorError(ConnectorError { user_facing_error: None, kind: QueryError(PostgresError { code: "42P05", message: "prepared statement \"s19\" already exists", severity: "ERROR", detail: None, column: None, hint: None }), transient: false })
    at _n.handleRequestError (/home/runner/work/the-project/node_modules/.pnpm/@prisma+client@5.16.2_prisma@5.16.2/node_modules/@prisma/client/runtime/library.js:122:7117)
    at _n.handleAndLogRequestError (/home/runner/work/the-project/node_modules/.pnpm/@prisma+client@5.16.2_prisma@5.16.2/node_modules/@prisma/client/runtime/library.js:122:6235)
    at _n.request (/home/runner/work/the-project/node_modules/.pnpm/@prisma+client@5.16.2_prisma@5.16.2/node_modules/@prisma/client/runtime/library.js:122:5919)
    at async l (/home/runner/work/the-project/node_modules/.pnpm/@prisma+client@5.16.2_prisma@5.16.2/node_modules/@prisma/client/runtime/library.js:131:9116)
    at async seed (file:///home/runner/work/the-project/.nuxt/dev/index.mjs:9144:37)
    at async Object.run (file:///home/runner/work/the-project/.nuxt/dev/index.mjs:9279:24)
    at async runTask (file:///home/runner/work/the-project/.nuxt/dev/index.mjs:1732:17)
    at async file:///home/runner/work/the-project/.nuxt/dev/index.mjs:6377:3 {
  clientVersion: '5.16.2'
}

[4] Script used to pglite + pg-gateway:

import net from 'node:net'
import { PGlite } from '@electric-sql/pglite'
import { fromNodeSocket } from 'pg-gateway/node'
import { join } from 'pathe'

// create a single instance of the db, so that subsequent requests use the same db
const db = new PGlite({ dataDir: join(import.meta.dirname, 'pglite-data') })

const server = net.createServer(async (socket) => {
  await fromNodeSocket(socket, {
    serverVersion: '16.3',

    auth: {
      // No password required
      method: 'trust',
    },

    async onStartup() {
      // Wait for PGlite to be ready before further processing
      await db.waitReady
    },

    // Hook into each client message
    async onMessage(data, { isAuthenticated }) {
      // Only forward messages to PGlite after authentication
      if (!isAuthenticated) {
        return
      }

      // Forward raw message to PGlite and send response to client
      return await db.execProtocolRaw(data)
    },
  })

  socket.on('end', () => {
    console.info('Client disconnected')
  })
})

server.listen(5432, () => {
  console.info('Server listening on port 5432')
})
BracketJohn commented 2 months ago

I did further testing. While the pnpm prisma db push-command seems to work, albeit throwing an error, the pnpm prisma migrate command which is used to automatically generate & apply migrations (see docs here) does not seem to work: The command gets stuck. I'm using pglite@0.2.7 which should've resolved the hang-bug that sounds related to the problem I'm encountering here.

Here's a minimal reproduction: pg-gateway-migrate-stuck-reproduction.zip. The README.md contains the (few) steps to reproduce.

gregnr commented 1 month ago

Thanks for testing @BracketJohn. We did a bit of a deep dive into prisma and learned that prisma creates a shadow database during prisma migrate dev to detect schema drift. By default prisma creates this DB automatically via:

CREATE DATABASE "prisma_migrate_shadow_db_<uuid>"

Then opens a new connection to it concurrently with the regular DB. There are 2 issues going on:

  1. Our PGlite example currently ignores the database param sent by the client. We could grab this via clientParams state and pass it to PGlite in order to open the correct DB. But:
  2. PGlite is single-user, so concurrent connections to a single instance is not possible. Normally your options are:
    • Create a new PGlite instance every connection. This only really works if you want an ephemeral DB each connection that doesn't persist.
    • Reject more than 1 connection to the DB
    • Queue connections so that only one connection operates at a time

You could experiment with the queuing strategy, but I have a hunch prisma expects both connection to run concurrently. Another approach is to create a second temporary PGlite instance dedicated to act as the shadow DB, then manually configure prisma to use this as the shadow DB. It would roughly look like:

  1. Configure prisma manually with a shadow DB:

    datasource db {
     provider          = "postgresql"
     url               = env("DATABASE_URL")
     shadowDatabaseUrl = env("SHADOW_DATABASE_URL")
    }

    where:

    DATABASE_URL=postgres://postgres@localhost/postgres
    SHADOW_DATABASE_URL=postgres://postgres@localhost/prisma-shadow
  2. In pg-gateway, check to see which DB the client is connecting to and direct to the appropriate instance (untested):

    import { PGlite } from '@electric-sql/pglite'
    import { createServer } from 'node:net'
    import { join } from 'node:path'
    import { fromNodeSocket } from 'pg-gateway/node'
    
    // create a single instance of the db, so that subsequent requests use the same db
    const db = new PGlite({ dataDir: join(import.meta.dirname, 'pglite-data') })
    
    const server = createServer(async (socket) => {
      let activeDb = db;
    
      await fromNodeSocket(socket, {
        serverVersion: '16.3',
    
        auth: {
          // No password required
          method: 'trust',
        },
    
        async onStartup({ clientParams }) {
          // create a temp in-memory instance if connecting to the prisma shadow DB
          if (clientParams?.database === 'prisma-shadow') {
            activeDb = new PGlite();
          }
    
          // Wait for PGlite to be ready before further processing
          await activeDb.waitReady
        },
    
        // Hook into each client message
        async onMessage(data, { isAuthenticated }) {
          // Only forward messages to PGlite after authentication
          if (!isAuthenticated) {
            return
          }
    
          // Forward raw message to PGlite and send response to client
          return await activeDb.execProtocolRaw(data)
        },
      })
    
      socket.on('end', () => {
        console.info('Client disconnected')
      })
    })
    
    server.listen(5432, () => {
      console.info('Server listening on port 5432')
    })
AndresGnu commented 1 month ago
  • prepared statement already exists

My temporal solution for [3] prepared statement already exists is add this param ?pgbouncer=true in DATABASE_URL

https://www.prisma.io/docs/orm/prisma-client/setup-and-configuration/databases-connections/pgbouncer

BracketJohn commented 1 month ago

My temporal solution for [3] prepared statement already exists is add this param ?pgbouncer=true in DATABASE_URL

~Thanks, I'll try this out.~

update: Just tried it: The current 0.3.0-beta version already resolves the prepared statement-problem in a better fashion than adding the pgbounce-param! @AndresGnu I suggest you try that version out, see the next branch for up-to-date instal instructions: https://github.com/supabase-community/pg-gateway/pull/21


Hey @gregnr!

Thanks for taking the time & doing a deep dive into the reported topics, very helpful! πŸ’― I think the shadow DB strategy could work and will try that one out to resolve our migration-woes.

One thing I do not understand yet: Why does pnpm prisma db push cause an error to be logged, see my point here:

pnpm prisma db push leads to an error, see [2]

and the error that is shown documented here:

[2] error thrown on pglite + pg-gateway side

I'm pretty sure that pnpm prisma db push does not need a shadow-db to work: If it would, it would also get stuck. So, what is going on for the documnted error to be caused?

AndresGnu commented 1 month ago

@BracketJohn I am using version 0.3.0-beta.3. But in my case I am not managing only one prisma connection.

image This is a diagram of my case

When I connect to only one of my two services it works correctly. But when the other service connects and starts sending requests I have the error of [3] prepared statement already exists

Implementing the pgbounce parameter still generates internal errors but they do not break the flow as it does without the parameter

This is my implementation.


const base = process.env.APP_DATA || '';
console.log(base);

const dbPath = resolve(base, './data/bot/db');
fs.ensureDirSync(dbPath);

function createDb() {
  return new PGlite(dbPath, {
    // debug: 5,
    // database: 'bot',
    extensions: { uuid_ossp },
  });
}
let closing = false;
let db = createDb();
const server = net.createServer(async (socket) => {
  // console.log(sockt.remoteAddress,socket);
  const conection = `${socket.remoteAddress}:${socket.remotePort}`;
  console.log('Client connected', conection);
  if (closing) {
    console.log('waiting for closing');
    await waitForFunction(async () => !closing);
  }
  if (!db) {
    db = createDb();
  }

  await fromNodeSocket(socket, {
    serverVersion: '16.3',

    auth: {
      // No password required
      method: 'trust',
    },

    async onStartup({ clientParams }) {
      console.log(clientParams?.database);

      console.log('onStartup');
      await waitForFunction(async () => {
        console.log('onStartup: waiting for transaction');
        return !db.isInTransaction();
      }, 100);
      // Wait for PGlite to be ready before further processing
      await db.waitReady;
    },

    // Hook into each client message
    async onMessage(data, { isAuthenticated }) {
      // Only forward messages to PGlite after authentication
      if (!isAuthenticated) {
        return;
      }
      await waitForFunction(async () => {
        console.log('onMessage: waiting for transaction');
        return !db.isInTransaction();
      }, 100);
      // Forward raw message to PGlite and send response to client
      const r = await db.execProtocolRaw(data);
      return r;
    },
  });

  socket.on('end', async () => {
    console.log('Client disconnected', conection);
    await waitForFunction(async () => {
      return !db.isInTransaction();
    });
    closing = true;
    await db?.close().catch(console.error);
    db = null;
    closing = false;
  });
});

server.listen(+process.env.PORT, () => {
  console.log('Server listening on port 5472');
});
BracketJohn commented 1 month ago

So, I've done some extensive testing and playing around with different pg-gateway + pglite solutions on 3 of our internal production-grade projects. The TLDR of our experiences is:

Here are the prisma settings that worked best for us:

  // This `env(..)` value will be read from the `.env` file or from the environment.
  url = "postgres://postgres@localhost:5432/postgres?pgbouncer=true&connection_limit=1"

  // These two are required for development only. NOTE: If we ever use connection poolers such as `pgbouncer` or prisma accelerate, we'll need to make the `directUrl` dynamic and set it in production, see `directUrl` description here: https://www.prisma.io/docs/orm/reference/prisma-schema-reference
  shadowDatabaseUrl = "postgres://postgres@localhost/prisma-shadow?pgbouncer=true&connection_limit=1"
  directUrl         = "postgres://postgres@localhost:5432/postgres?connection_limit=1"

The exact combinations of connection_limit and pgboucner flags are important.

Here is the final script we used to run the database, supporting the shadow database approach:

/**
 * Script that starts a postgres database using pg-gateway (https://github.com/supabase-community/pg-gateway) and pglite (https://github.com/electric-sql/pglite).
 *
 * We use this database for local development with prisma ORM. The script also supports creating a `shadow-database`, which is a second, separate database
 * that prisma uses for certain commands, such as `pnpm prisma migrate dev`: https://www.prisma.io/docs/orm/prisma-migrate/understanding-prisma-migrate/shadow-database.
 *
 * To make use of the shadow-database add `/prisma-shadow` to the DSN you provide. This script will then spin up a second, in-memory-only database and connect you to it.
 *
 * This whole script approach is novel to us (before we used sqlite locally). Here is the PR that brought it all together: https://github.com/sidestream-tech/REDACTED
 */
import net from 'node:net'
import { PGlite } from '@electric-sql/pglite'
import { fromNodeSocket } from 'pg-gateway/node'

const db = new PGlite({ dataDir: 'pglite-data' })
let activeDb = db

const server = net.createServer(async (socket) => {
  activeDb = db

  console.info(`Client connected: ${socket.remoteAddress}:${socket.remotePort}`)
  await fromNodeSocket(socket, {
    serverVersion: '16.3',

    auth: {
      // No password required
      method: 'trust',
    },

    async onStartup({ clientParams }) {
      // create a temp in-memory instance if connecting to the prisma shadow DB
      if (clientParams?.database === 'prisma-shadow') {
        console.info(`Connecting client to shadow database`)
        activeDb = new PGlite()
      }

      // Wait for PGlite to be ready before further processing
      await activeDb.waitReady
    },

    // Hook into each client message
    async onMessage(data, { isAuthenticated }) {
      // Only forward messages to PGlite after authentication
      if (!isAuthenticated) {
        return
      }

      // Forward raw message to PGlite and send response to client
      return await activeDb.execProtocolRaw(data)
    },
  })

  socket.on('end', () => {
    console.info('Client disconnected')
  })
})

server.listen(5432, () => {
  console.info('Server listening on port 5432')
})

With this setup every command we tested worked flawlessly. You can see that prisma nicely respects the connection limit. The only "off" thing that still occurs is this error on the first pnpm prisma db push run:

Error: write EPIPE
    at afterWriteDispatched (node:internal/stream_base_commons:161:15)
    at writeGeneric (node:internal/stream_base_commons:152:3)
    at Socket._writeGeneric (node:net:952:11)
    at Socket._write (node:net:964:8)
    at writeOrBuffer (node:internal/streams/writable:570:12)
    at _write (node:internal/streams/writable:499:10)
    at Socket.Writable.write (node:internal/streams/writable:508:10)
    at Object.write (node:internal/webstreams/adapters:215:63)
    at invokePromiseCallback (node:internal/webstreams/util:180:10)
    at node:internal/webstreams/util:185:23 {
  errno: -32,
  code: 'EPIPE',
  syscall: 'write'
}

This error seems to be inconsequential and does not lead to any problems we've experienced.

With this, we are happy to use pg-gateway + pglite as a solution for our local development setups.

Thanks @gregnr for all the help & the nice debugging-tandem, I certainly enjoyed it πŸ™Œ

gregnr commented 1 month ago

Amazing work @BracketJohn. Thanks for sharing your final implementation πŸ™Œ

I'm in the middle of implementing more robust connection handling logic (closing each side of the stream at the correct time, exposing close events you can hook into via promise, etc) which hopefully solves the above issue since it sounds related. This is likely the last piece before 0.3.0 is released. Will keep you updated.

gregnr commented 1 month ago

@AndresGnu if I understand your use case correctly, it's worth reiterating that PGlite only operates on a single connection (single-user mode), so sending messages from multiple connections to the same PGlite instance will have undefined behaviour. For example:

At a minimum I would suggest tracking extended queries and queue them between connections so that they're atomic, and also tracking and queuing transactions between connections so that they don't overlap. Though there are no guarantees these 2 things alone will make this work.

Future versions of pg-gateway will support onQuery hooks and possibly transaction monitoring that might make these things easier to track.

BracketJohn commented 1 month ago

@gregnr happy to share & thanks again to you for being along on the ride plus working the pg-gateway side of things. We've now been using pg-gateway + pglite for local development and it is mostly working fine for us. There's some hiccups, but I want to oberserve further before documenting them here.

Getting rid of Error: write EPIPE would definitely be nice for us, it was one of the main feedback points by devs & PMs: They were confused on whether this is an actionable problem or something to be ignored.