Closed casey-chow closed 10 months ago
Thanks for the hard work! We have gotten bit by this as we've had to implement a raw query for one of our recursive queries. Hoping this will direct it to where it needs to go!
thanks for opening PR for this. We are also waiting for this to release. We want to divert our raw select queries to read replica. thanks 👍
Right now it looks like we would not merge this, see the two questions above. If this also redirects potential write queries, that is a source for problems that we have to manage somehow.
Right now it looks like we would not merge this, see the two questions above. If this also redirects potential write queries, that is a source for problems that we have to manage somehow.
Well, wouldn't the advice be to do prisma.$primary.$rawQuery ? You're never going to be able to parse the sheer volume of raw sql because you can do insert into (fields) values (select * from table)...
basically, it should be made clear in the documentation and up to the user to use the tool appropriately, just like using raw SQL puts far more control into the developers hands. So should this.
That is exactly the problem this extension is supposed to solve: You should not have to manage 2 "entry points" to your database and then manually decide which one to use, msot of the time. Right now raw queries are always safe as they are run on the primary. If we change this, it's a breaking change for existing usage and afterwards any time you run a raw query via it you have to make that decision which of the "entry points" to use.
That is exactly the problem this extension is supposed to solve: You should not have to manage 2 "entry points" to your database and then manually decide which one to use, msot of the time.
Ideally you shouldn't, but as a developer I wouldn't necessarily trust Prisma to make that call on raw queries. Which queries go where is very obvious right now. If you look at PgPool's list it's pretty long, and even if the list were implemented fully faithfully I'd still be very careful with raw queries and want to be explicit about what goes to primary vs. replica.
Right now raw queries are always safe as they are run on the primary. If we change this, it's a breaking change for existing usage and afterwards any time you run a raw query via it you have to make that decision which of the "entry points" to use.
They're always safe right now, but reading the source code, this comes off as a bug rather than an intentional behavior. I'll generally use $queryRaw
for compute-intensive queries like username search, and then only occasionally I'll use it for an advanced raw use case (like an INSERT ... RETURNING
), so having raw queries only hit the primary is fairly purpose-defeating without the inverse escape hatch.
They're always safe right now, but reading the source code, this comes off as a bug rather than an intentional behavior.
That's just an oversight of cleaning up, not a bug. Otherwise this big gotcha and potential foot gun would be documented.
I'd still be very careful with raw queries and want to be explicit about what goes to primary vs. replica.
Isn't the right approach here then to add a $replicas()
(or $replica()
?) method that forces queries to one of the replicas? Then you can be explicit for the queries that should go to a replica instead of the primary (or letting any internal logic decide).
Isn't the right approach here then to add a $replicas() (or $replica()?) method that forces queries to one of the replicas? Then you can be explicit for the queries that should go to a replica instead of the primary (or letting any internal logic decide).
In terms of the API design I see three possible options:
$replica()
method. No correctness footgun, but definitely a performance footgun as developers may assume SELECT
queries always go to replicas. The way I see it, it's not a trade off between having a footgun and not, it's a trade off between a performance footgun and a correctness footgun. My general philosophy is to allow these things to fail fast rather than silently work in unexpected ways. I especially worry about developers putting heavy queries on primary thinking they're going on the replica, then suddenly hitting a performance wall and not understanding where all the compute went.
What about a plug-in configuration value that changes this behavior? You could default to the primary, but have an override option of, say, "rawDefaultReplica:true" (default false) or however you'd like to do the naming convention. Would this work? This kind of functionality is critical to my own use case, I have 5x the hand written queries (huge analytics platform) than I do CRUD operations.
@janpio @millsp I updated the PR to remove the change in behavior with $queryRaw
in favor of proposing we merge this as well as #21 in tandem for now. The more important thing to fix is that raw queries cannot hit replicas right now at all, regardless of what the default behavior is or should be, so I'm going to focus on pushing this proposed course of action through instead. Please take a look!
I had to experiment and debug this a bunch for my own client extension and these are the workarounds I found useful in light of https://github.com/prisma/prisma/issues/21139, as well as the incorrect operation names and shapes for raw queries.
Closes #14.