Closed OultimoCoder closed 1 year ago
I found this upgrading from 1.1 to 1.3
@OultimoCoder this used to work for you with version 1.1? I'm not seeing anything in the diff that would directly change the behavior of transactions: https://github.com/depot/kysely-planetscale/compare/v1.1.0...v1.3.0.
Did you upgrade anything else at the same time (kysely
or @planetscale/database
)? It still could be something here, but if not, the other thing to check would be if another request was holding a lock that conflicted with the DELETE
, I'm not super familiar with what Vitess locks on delete.
Yeah 1.2 works fine but going to 1.3 breaks it, nothing else changes. I tested just now only incrementing this package version until it broke.
Interesting - are you setting the useSharedConnection
config option? That's the thing that 1.3 added, but it's supposed to be disabled by default.
I haven't made any changes from 1.1, just upgraded the package to 1.3.
Could it be related to the changes made that fixed this other issue I reported with transactions in 1.3 #17
Well v1.2.0 should have that fix included, so if you're not able to reproduce on 1.2, then it seems unrelated to that.
import { Kysely } from 'kysely'
import { PlanetScaleDialect } from 'kysely-planetscale'
import { ApiKeyTable } from '../models/apiKey.model'
import { AuthProviderTable } from '../models/authProvider.model'
import { EntriesTable } from '../models/entry.model'
import { FieldSourcesTable } from '../models/fieldSource.model'
import { SentencesTable } from '../models/sentence.model'
import { SourcesTable } from '../models/source.model'
import { UserTable } from '../models/user.model'
import { Config } from './config'
let dbClient: Kysely<Database>
export interface Database {
user: UserTable,
authorisations: AuthProviderTable
api_keys: ApiKeyTable,
sentences: SentencesTable,
entries: EntriesTable,
sources: SourcesTable,
field_sources: FieldSourcesTable
}
export const getDBClient = (databaseConfig: Config['database']) => {
dbClient =
dbClient ||
new Kysely<Database>({
dialect: new PlanetScaleDialect({
username: databaseConfig.username,
password: databaseConfig.password,
host: databaseConfig.host
})
})
return dbClient
}
This is how my client is instantiated, I have done 0 code changes when testing upgrades.
And I did a repeat test just to check, 100% something introduced between 1.2 and 1.3 which causes this error.
Oh actually 1.2.1 included the transaction fixes from #17, 1.2.0 did not. Did you try it with 1.2.1, or just 1.2.0?
Fails with 1.2.1!
I actually had another project that was using this package and I created a support ticket on planetscale as two queries at the same time were exceeding the transaction deadline of 20 seconds and they were doing like 1 read and 2 updates in the transaction and averaged a second total and I was very confused.
I would bet its the exact same issue as I was running version 1.3.
Yeah this is interesting, it's almost as if the rollback
is not called? 1.2.1 updated the logic to await
the rollback (and everything else), but I'm not sure how that would cause the issue you're seeing.
@koskimas do you see anything I'm doing incorrectly here? v1.2.0...v1.2.1
#diff-a2a171449d. I'm not super familiar with what Kysely internals do when an error is thrown inside a transaction.
@OultimoCoder @jacobwgillespie I was able to replicate this issue but only when I put delete statement into catch block within the transaction scope, but not like it's in the original code.
Now haven't find the root cause for this 20 sec query execution time so far, but there is something what could help @OultimoCoder to mitigate this problem.
Basically, I found that v1.2.0
doesn't perform rollbacks in case of an error within the transaction - I guess that's why in the given code there is a delete statement in case of the error.
Starting with v1.2.1
this seems to be fixed, and rollbacks are performed - inserts aren't committed.
So with rollbacks truly supported now, @OultimoCoder you should be able to get rid of delete statement in catch block and have rollback do it for you.
I originally found this when running tests using jest and a delete test would run after a test that intentionally errored in the catch block.
The delete code in the catch block is just an easy way of replicating it. I only put it there when I want to replicate the issue.
My issue is the 20 second timeout with the locks not being released after an error in the catch block.
Any update on this? :(
I haven't had a chance to investigate further yet - the thing that was fixed in v1.2.1
(#10) was that transaction commands (BEGIN
, COMMIT
, ROLLBACK
) weren't actually being awaited, so they could fail silently.
What I don't know here is if there's an issue inside kysely-planetscale
or @planetscale/database
that's preventing the normal transaction semantics from working, or if there's a legitimate lock contention in PlanetScale / Vitess that is only now showing up for you because previously it would have silently failed.
Does the demo code not prove there is no legitimate lock contention?
A failed transaction followed by a delete should always work, no?
I might have an investigate into this too then see if I can find out anymore.
A failed transaction followed by a delete should always work, no?
Possibly - how much traffic is this app getting? If you're getting this with a single request, then I'd expect it to work, but if there's concurrent requests / queries then it gets more complicated.
Tested locally. Repeatable and reproducible with a single request or integration test. 0 concurrency. About 75% of the time it times out, other 25% it completes in about 18-19 seconds.
In a separate app, I had the same issue in prod if there were 2 requests at the same time and one transaction failed.
The demo code was just added (I never normally delete in the try catch) to make it reproducible with one request.
Ill write a reproducible script for this.
@jacobwgillespie hey!
I have a repo here which reproduces the issue pretty consistently. Very minimal code with a quick readme on how to run it (you just need a fresh planetscale db/branch).
@OultimoCoder thank you for that reproduction, that's super helpful - I ran it with quite a bit of logging inside the kysely-planetscale
code, and it appeared to be issuing the expected queries, it correctly created a separate connection for the transaction, etc.
I adapted it to use @planetscale/database
directly, with raw SQL queries, and it's still timing out due to a lock. You can see the updated code here: https://github.com/jacobwgillespie/kysely-planetscale-issue-20.
So, I think this is either an issue inside @planetscale/database
, or some misunderstanding of how locks are working with PlanetScale. I would assume the former, since it feels like a single transaction, with ROLLBACK
shouldn't affect anything like you said.
Thanks so much for looking into this! Should I raise this issue on the @planetscale/database
repo and point to both of our codebases?
I think that makes sense, yeah, they're pretty responsive over there 👍
This was fixed in @planetscale/database
v1.10.0 🎉
https://github.com/planetscale/database-js/releases/tag/v1.10.0
I am using the latest version.
Here is my code:
I added the delete statement in the try catch purely to handle an error I am having. If I throw an error during my transaction it should rollback the transaction and release the locks but I instead get this error intermittently (the other times it completes in under 20 seconds):
This is the sole test I am running and the sole code executing.