prisma / extension-read-replicas

Prisma Client Extension to add read replica support to your Prisma Client
Apache License 2.0
101 stars 6 forks source link

Fix raw query support #18

Closed casey-chow closed 10 months ago

casey-chow commented 1 year ago

I had to experiment and debug this a bunch for my own client extension and these are the workarounds I found useful in light of https://github.com/prisma/prisma/issues/21139, as well as the incorrect operation names and shapes for raw queries.

Closes #14.

crice88 commented 1 year ago

Thanks for the hard work! We have gotten bit by this as we've had to implement a raw query for one of our recursive queries. Hoping this will direct it to where it needs to go!

mayur-2345 commented 1 year ago

thanks for opening PR for this. We are also waiting for this to release. We want to divert our raw select queries to read replica. thanks 👍

janpio commented 1 year ago

Right now it looks like we would not merge this, see the two questions above. If this also redirects potential write queries, that is a source for problems that we have to manage somehow.

notsoluckycharm commented 1 year ago

Right now it looks like we would not merge this, see the two questions above. If this also redirects potential write queries, that is a source for problems that we have to manage somehow.

Well, wouldn't the advice be to do prisma.$primary.$rawQuery ? You're never going to be able to parse the sheer volume of raw sql because you can do insert into (fields) values (select * from table)...

basically, it should be made clear in the documentation and up to the user to use the tool appropriately, just like using raw SQL puts far more control into the developers hands. So should this.

janpio commented 1 year ago

That is exactly the problem this extension is supposed to solve: You should not have to manage 2 "entry points" to your database and then manually decide which one to use, msot of the time. Right now raw queries are always safe as they are run on the primary. If we change this, it's a breaking change for existing usage and afterwards any time you run a raw query via it you have to make that decision which of the "entry points" to use.

casey-chow commented 1 year ago

That is exactly the problem this extension is supposed to solve: You should not have to manage 2 "entry points" to your database and then manually decide which one to use, msot of the time.

Ideally you shouldn't, but as a developer I wouldn't necessarily trust Prisma to make that call on raw queries. Which queries go where is very obvious right now. If you look at PgPool's list it's pretty long, and even if the list were implemented fully faithfully I'd still be very careful with raw queries and want to be explicit about what goes to primary vs. replica.

Right now raw queries are always safe as they are run on the primary. If we change this, it's a breaking change for existing usage and afterwards any time you run a raw query via it you have to make that decision which of the "entry points" to use.

They're always safe right now, but reading the source code, this comes off as a bug rather than an intentional behavior. I'll generally use $queryRaw for compute-intensive queries like username search, and then only occasionally I'll use it for an advanced raw use case (like an INSERT ... RETURNING), so having raw queries only hit the primary is fairly purpose-defeating without the inverse escape hatch.

janpio commented 1 year ago

They're always safe right now, but reading the source code, this comes off as a bug rather than an intentional behavior.

That's just an oversight of cleaning up, not a bug. Otherwise this big gotcha and potential foot gun would be documented.

I'd still be very careful with raw queries and want to be explicit about what goes to primary vs. replica.

Isn't the right approach here then to add a $replicas() (or $replica()?) method that forces queries to one of the replicas? Then you can be explicit for the queries that should go to a replica instead of the primary (or letting any internal logic decide).

casey-chow commented 1 year ago

Isn't the right approach here then to add a $replicas() (or $replica()?) method that forces queries to one of the replicas? Then you can be explicit for the queries that should go to a replica instead of the primary (or letting any internal logic decide).

In terms of the API design I see three possible options:

  1. Defaulting raw queries to primary, introducing a $replica() method. No correctness footgun, but definitely a performance footgun as developers may assume SELECT queries always go to replicas.
  2. Defaulting raw queries to replica. No performance footgun, but unexpected errors may occur, as developers may expect the extension to route writes to primaries.
  3. Force developers to explicitly specify whether they want a raw query to hit the primary replica. Given my understanding of client extensions, we could only implement by erroring and not by erasing the method. I think we all agree this would be the worst DX.
  4. Parse the SQL generated to determine if it's safe to run on a replica. This is the best DX by far, but is unrealistic for the scope of this extension.

The way I see it, it's not a trade off between having a footgun and not, it's a trade off between a performance footgun and a correctness footgun. My general philosophy is to allow these things to fail fast rather than silently work in unexpected ways. I especially worry about developers putting heavy queries on primary thinking they're going on the replica, then suddenly hitting a performance wall and not understanding where all the compute went.

notsoluckycharm commented 1 year ago

What about a plug-in configuration value that changes this behavior? You could default to the primary, but have an override option of, say, "rawDefaultReplica:true" (default false) or however you'd like to do the naming convention. Would this work? This kind of functionality is critical to my own use case, I have 5x the hand written queries (huge analytics platform) than I do CRUD operations.

casey-chow commented 11 months ago

@janpio @millsp I updated the PR to remove the change in behavior with $queryRaw in favor of proposing we merge this as well as #21 in tandem for now. The more important thing to fix is that raw queries cannot hit replicas right now at all, regardless of what the default behavior is or should be, so I'm going to focus on pushing this proposed course of action through instead. Please take a look!

SevInf commented 10 months ago

21 have been merged and at least for time being, that would be a recommended way of dealing with raw queries. Closing this for now, thank you, @casey-chow, for both of those contributions.