planetscale / database-js

A Fetch API-compatible PlanetScale database driver
https://planetscale.com/docs/tutorials/planetscale-serverless-driver
Apache License 2.0
1.17k stars 35 forks source link

Add support for transactions #58

Closed iheanyi closed 2 years ago

iheanyi commented 2 years ago

This pull request adds support for transactions within the driver. Callers can use this for safely executing transactions that get rolled back after exceptions.

Closes https://github.com/planetscale/database-js/issues/54

sekoyo commented 1 year ago

The examples and tests show statements in a transaction running in series, but can we also run in parallel? e.g.

await conn.transaction(tx => Promise.all([
  tx.execute(...),
  tx.execute(...),
]))
mattrobenolt commented 1 year ago

While you technically can through the driver, I would highly advise not to because it would become nondeterministic from the POV of the caller and would defeat the purpose of most transactional safety. One big property of transactions is being linearizable, in which case, we do liberalize against the underlying mysqld. So an attempt at doing this in parallel still results in them being serialized against the server. Just now in a nondetermistic order.

What are you actually trying to accomplish though by wanted to parallelize within a single transaction? It sounds to me, like you just don't want a transaction or you want multiple parallel transactions, both of which I'd consider safe.

sekoyo commented 1 year ago

While you technically can through the driver, I would highly advise not to because it would become nondeterministic from the POV of the caller and would defeat the purpose of most transactional safety. One big property of transactions is being linearizable, in which case, we do liberalize against the underlying mysqld. So an attempt at doing this in parallel still results in them being serialized against the server. Just now in a nondetermistic order.

What are you actually trying to accomplish though by wanted to parallelize within a single transaction? It sounds to me, like you just don't want a transaction or you want multiple parallel transactions, both of which I'd consider safe.

Thanks for the prompt reply. Makes sense. I was just wondering if instead of creating a users first, then entries into tables like federated_credentials and sessions after, I could generate a userID and create/update them at once. But seems like there is no advantage in doing so, so I will just do it in series.

mattrobenolt commented 1 year ago

@DominicTobias I think the ideal solution for you there is us adding multi-statement support, what I'll probably be calling pipelined request. https://github.com/planetscale/database-js/issues/85

But this needs support on the actual API and server side first, but the benefit being you'd save the round trip of doing two queries serially. Since you don't really want them in parallel, you just want to avoid the round trip latency from the network. So us getting multi-statement support would accomplish that.

sekoyo commented 1 year ago

@DominicTobias I think the ideal solution for you there is us adding multi-statement support, what I'll probably be calling pipelined request. #85

But this needs support on the actual API and server side first, but the benefit being you'd save the round trip of doing two queries serially. Since you don't really want them in parallel, you just want to avoid the round trip latency from the network. So us getting multi-statement support would accomplish that.

Ah yeah makes sense would help with the latency (in this case 3 round trips), and maybe also the edge function cost

mattrobenolt commented 1 year ago

Technically 4 round trips, since begin; and commit; are implied from the transaction. :)

I'd expect to have an API such that you can do like:

execute(['begin', 'select 1', 'select 1', 'commit'])

Then ytou'd be able to do the whole "start transaction, run N queries in the transaction, and commit" all in one batch/pipeline request. I don't know how this will bubble up into the JS API, but any of the transaction conveniences here simply are sugar over executing begin and commit. We'll get there. :)

Izhaki commented 1 year ago

Just to chime in here with a use case (and I have seen https://github.com/planetscale/database-js/issues/85):

Consider a case like StackOverflow, where you wish to add a question and it has multiple tags; you want:

All within one transaction (because if any fails, the data will be in inconsistent state).

mattrobenolt commented 1 year ago

Yeah, I understand the use case, but you wouldn't gain anything by executing the queries in parallel vs executing them in a pipeline. They need to get pipelined to mysqld anyways, and can't be done concurrently within a transaction. This is a fundamental limitation of mysql.

mattrobenolt commented 1 year ago

Even if we supported this somehow, we'd have to serialize this down to a lock and execute sequentially against the underlying mysqld process anyways, the benefit of this being, your queries are now happening in a non-deterministic order, which is generally not favorable within a transaction. But this is non-trivial to do on the API side, and much easier to just prevent this behavior in the client, which is why I'm curious if we can prevent the concurrency in JS.

Typically, in a real language, I'd use a mutex or something, so not sure what's available here to prevent it.

iheanyi commented 1 year ago

Correct to what @mattrobenolt said. Think about it like this. A transaction is a lock that exists. You can't have multiple operations going against this lock, there will be contention. Therefore, every operation needs to be performed serially in some way, shape, or form.

Izhaki commented 1 year ago

I'm all good. Thank you so much for the detailed explanation!