brafdlog commented 5 years ago

Links

This issue is a place to discuss and document use cases related to database transactions. It is being linked from this docs page.

The following feature requests relate to issues discussed here:

Transactional Batch API: https://github.com/prisma/prisma-client-js/issues/667
Atomic Operations: https://github.com/prisma/prisma-client-js/issues/655
Long running transactions: https://github.com/prisma/prisma/issues/1844

Original Issue

Currently (to the best of my knowledge) there is no way to control the transactions of the different operations we make. When working with a relational database, transactions are a basic feature that is widely used. Allowing running multiple operations in the same transaction is vital in many use cases and could prevent people from using prisma in cases where it is important. Two options to support this:

A bulk operation api that allows sending multiple operations in one call and running all of them in the same transaction.
An alternative more intrusive but powerful option would be to expose the transaction concept to the api - allow starting a transaction explicitly, getting a transaction id, running multiple operations using that transaction id, and explicitly committing the transaction.

I think that since prisma does not support only databases that work with transactions, the second option would result in an api that is not as clean. But it is worth considering.

zzyyzz1992 commented 5 years ago

For example, when we creating a online order from users and its corresponding orderItems, if somehow an error occurred when creating one of the orderItems, the whole operation should be revoked. Suggested API example:

        const orderRawData = ...;
        const orderItemRawDatas = [...];
        //Isolation level: SERIALIZABLE
        await photon.transaction("SERIALIZABLE", async photonWithTransaction => {
            const order = await photonWithTransaction.orderTable.create({/* using orderRawData */});
            const orderItems = await Promise.all(orderItemRawDatas.map( orderItemRawData => {
                                    return photonWithTransaction.create({/* using orderItemRawData& order */})
                                })
                               );
            return { order, orderItems };
        }).then(result=>{
            // handling results...    
        }).onError(err=>{//Errors including custom rollback
            // handling errors...    
        });

janpio commented 5 years ago

Note: @divyenduz will talk to @schickling about that

divyenduz commented 5 years ago

Thank you for raising this 🙌

For context, this is the WIP spec for Photon API.

We are thinking (not finalized, requesting comments 🙏) of the following in terms of the transaction API:

Single Photon operation (if it produces multiple SQL queries) is always a transaction, irrespective of the final API, however, we are aware that transactions are a heavy and we consider a global configuration (maybe in Prisma schema file) to make this configurable, defaulting to transaction.

await photon.users
  .findMany()
  .updateMany({ email: u => u.email.set('bob@gmail.com') })

To perform a database transaction across multiple Photon calls, we can use the Photon's batch API. A batch may or may not be a transaction based on the options used with a batch call.

// Batching, don't get the results with $noData
const m1 = photon.user.create({ firstName: 'Alice' })
const m2 = photon.post.create({ title: 'Hello world' })
const [u1, p1]: [boolean, boolean] = await photon.batch([m1, m2])

// TODO: `if` API

// Batching with transaction
await photon.batch([m1, m2], { transaction: true })

Please let us know what you think of these? Happy to discuss this further.

zzyyzz1992 commented 5 years ago

hi @divyenduz, I think batching isn't a qualified solution, because these operations are contextually related In my case above, there are two steps: step 1. creating a order ( the order model has a field named id ) step 2. creating a orderItem ( the orderItem model has a field named order_id which pointing to order's id ) which order_id is just created in step 1 and I can't get in step2 when using batching.

These can be done with the kinda heavy single photon operation mentioned above.

nsignes-harmonicinc commented 5 years ago

Hi all, Just chiming in : being able to ensure clean reversal and isolation of actions requiring chained queries would be incredibly useful to get. Having to create 1+ rows and then use the newly created IDs in some other query as a single atomic business action happen frequently in my field too.

Thanks !

divyenduz commented 5 years ago

Thanks for all the feedback so far 🙌

We have been thinking internally about transactions and how that could look like from Photon.

Long running transactions are bad for overall performance of the system because the database has to track all the objects in the long running transactions and there are a lot of locks that make the system slow.

As the systems grow different strategies (like providing an id from the application layer or using a broker to ensure updates) need to be used to circumvent this.

We want to support most of the use cases with Photon and also make what we believe are "anti-patterns" of transactions not possible in the API.

We welcome feedback from the community on this 🙌

jgoux commented 4 years ago

I think that at some point you can't ignore the underlying database, especially for things as important as transactions.

In my day to day development, I don't see myself using the current batch API, as I often use transactions to wrap business logic checks or third-party API / librairies calls.

The most useful abstraction IMHO is the bracket approach : https://github.com/gajus/slonik#protecting-against-unsafe-transaction-handling where a callback is wrapped inside a transactional context, with a provided client tied to the specific transaction.

I think it also touches this issue as we could also rely on the database specific library (in my case pg) to externally handle the transaction and just pass the connection to photon client. This way keeping the API similar for every database provider would be possible as it's all handled externally.

perrosnk commented 4 years ago

Is there a timeline for a stronger support of transactions (at least what @divyenduz mentioned)? We are currently evaluating prisma and everything looks great apart from transactions support.

robjinman commented 4 years ago

First off, I love Prisma, it's a great tool, and you guys are doing an amazing job. However, I desperately wish I knew prisma lacked transaction support from the outset. For me, it would have been a deal breaker. It's an absolutely essential feature for any complex real world application. For example, how do you update a value atomically? This is a very basic requirement.

What are our options? The app I'm building deals with money and needs to keep track of payments and account balances with very high reliability, so we definitely need atomic updates or transactions. What are people doing to get around this? Is the best option to just use raw SQL for those critical queries?

Currently I'm doing this really horrible hack, where I protect one of my endpoints with a mutex I made from a Promise.

async function doStuff(context: Context) {
  value = await context.prisma.thing({ id: "123" }).value();

  await context.prisma.updateThing({
    data: { value: value + 1 },
    where: { id: "123" }
  });
}

let lock: Promise<void>|undefined;

export async function myEndpoint(context Context) {
  while (lock) {
    await lock;
  }
  lock = new Promise<void>(async (resolve, reject) => {
    try {
      await doStuff(context);
    }
    catch(err) {
      lock = undefined;
      reject(err);
      return;
    }
    lock= undefined;
    resolve();
  });
}

This isn't a solution though. I have multiple apps accessing the same database, so there's still a race condition.

sorenbs commented 4 years ago

@jgoux,

What you describe as the bracket approach is similar to the using statement in C#, and a very ergonomic API to use when you need to clean up resources after use. This is a feature request for adding exactly this to the Prisma Client API: https://github.com/prisma/prisma-client-js/issues/667

Could you elaborate on what you mean with the following:

In my day to day development, I don't see myself using the current batch API, as I often use transactions to wrap business logic checks or third-party API / librairies calls.

It sounds to me like you are describing a situation where you are doing a database operation, then doing a non-database operation (like sending an email or calling an API), and then do a final database operation? It would be helpful if you could describe some use cases in more detail.

sorenbs commented 4 years ago

@RobertConsidine,

I believe this feature request for atomic operations would address your use case: https://github.com/prisma/prisma-client-js/issues/655

Does it fully address what you are looking for?

sorenbs commented 4 years ago

@zzyyzz1992,

I think batching isn't a qualified solution, because these operations are contextually related In my case above, there are two steps: step 1. creating a order ( the order model has a field named id ) step 2. creating a orderItem ( the orderItem model has a field named order_id which pointing to order's id ) which order_id is just created in step 1 and I can't get in step2 when using batching.

The best way to do this is to use the nested api, which will make sure the correct id is retrieved and used for the second step, as well as wrap the entire operation in a transaction.

I believe this covers your use case fully, but let me know if there is some edge case where you need a more powerful api. Thanks!

robjinman commented 4 years ago

@RobertConsidine,

I believe this feature request for atomic operations would address your use case: prisma/prisma-client-js#655

Does it fully address what you are looking for?

Yes, thanks, though I'll need proper transaction support too (#667?). There are some cases where I really need to run a batch of queries, not a nested query.

Sytten commented 4 years ago

I just added a case yesterday that is brittle due to no transaction. Creation of objects is usually fine with the nested syntax, but its the update of multiple records that is not yet well covered. Thinking about how prisma is currently designed with a client/server, I believe a first good step would be able to create meta queries which would send all operations in one Graphql request. Similar to https://github.com/prisma/prisma-client-js/issues/349#issuecomment-529959988
Long running transactions with a callback will most likely have to wait for Neon direct bindings.

airhorns commented 4 years ago

I think a powerful use-case for transactions that can't be modelled as batch writes or nested API calls is transactions around each test case that are ROLLBACK'd at the end of the test. This lets developers write tests quickly that mutate database state in order to set up the right context for testing while remaining performant to undo those mutations to restore a fresh, reset state for the next test. This is called Transactional Tests in the Rails world.

More generally, actual transaction support is necessary for any use case where you need to read what you wrote, but aren't sure you need to commit. If you use the database's random function, or sequence generators, or stored procedures, or anything like that, you can't know the entire set of queries and binds you want to execute ahead of time, you have to read what you wrote, but you still might not want to COMMIT. I feel like the best database connectivity tool I would want to use would make simple, frequent stuff easy like nested inserts, but not make the rare stuff impossible, like using real transactions.

wanseob commented 4 years ago

Enjoying prisma a lot with typescript. But the lack of transaction support is extremely critical to me. A merkle tree management needs 256 number of operations at once and now it takes about 3 seconds to update an item. What I want is pretty simple :)

await prisma.mytable.upsert.batch([
    { where: {}, update: {}, create: {}},
    { where: {}, update: {}, create: {}},
    { where: {}, update: {}, create: {}},
])

ie., needs upsert.batch or upsertMany not only createMany. Or prisma/prisma-client-js#667 is also enough for me and looks good.

I hope this feature can be loaded on beta soon. Thanks for your hard works on the maintenance of the project

janpio commented 4 years ago

In a related issue, a comment with a possible transaction API was posted and is looking for feedback (or 👍 on the comment): https://github.com/prisma/prisma-client-js/issues/667#issuecomment-643750802

chadjaros commented 4 years ago

Prisma looks like it matches my ideal DB management solution

Define schema using SQL migration files
Generate a typesafe client from the installed db schema
Utilize compiler checks to make sure no code is left behind after a schema change

However, lack of exposed, flexible transaction support makes this a non-starter.

snake575 commented 4 years ago

Hi, I just realized that 2.2.0 added Transaction as a reserved keyword so my Transaction model is broken. Is there any chance to rename the new prisma.transaction API to something else less common? I guess that a lot of apps dealing with money would have a Transaction model.

ghost commented 4 years ago

Please also consider the use case where a large amount of data, too much to hold in memory, is written into the database as a transaction. With raw SQL, one can easily BEGIN a transaction, read data from a file line-by-line, generate an INSERT statement for each line, then COMMIT.

This could be achieved with a bulk operation API using iterables. However, that removes the ability to group multiple INSERTs together. There is a balance here between the overhead of many individual insert statements, and the memory usage (and delay) caused by combining too many inserts into a single SQL statement. So, in pseudo code my requirement is something like this:

execute('BEGIN');
for (let i = 0; i < 1000; i++) {
  let insert = "INSERT INTO table (i, j) VALUES";
  for (let j = 0; j < 1000; j++) {
    insert += ` (${i}, ${j})`;
  }
  execute(insert);
}
execute('COMMIT');

hmiiro commented 4 years ago

Hi, I just realized that 2.2.0 added Transaction as a reserved keyword so my Transaction model is broken. Is there any chance to rename the new prisma.transaction API to something else less common? I guess that a lot of apps dealing with money would have a Transaction model.

Try current release...think its 2.5.1? this was addressed already with $transaction in 2.4.0... https://github.com/prisma/prisma/releases/tag/2.4.0

thekevinbrown commented 4 years ago

For us the use case is we need to be able to read data, make decisions based on that data, then decide what to write. This needs to happen in a particular isolation level. Consider something like a purchase:

Read Account Balance
If greater than or equal to cost of item
- Add debit to account
- Decrement account balance
- Grant access to item
Else return 'insufficient balance'

Without being able to control a transaction start (with a Repeatable Read or Serializable isolation level) prior to the read, we are opening ourselves up to double spends by users hitting the endpoint very quickly, because they can interleave the reads to happen before the writes.

With most database abstraction layer tools this is a thing we can do (by just grabbing the connection and issuing a begin transaction, then set isolation level, regardless of support in the tool itself for this), but not with Prisma yet, and there's no workaround we can find to get ahold of the raw connection to do this ourselves either.

This is a major gap that's keeping us from adopting Prisma.

airhorns commented 4 years ago

Is there any guidance we could get from the Prisma team if this is a feature you folks are considering adding, or is the current $transaction support as far as you're going to take it? I feel like some of us might adopt Prisma now knowing transaction support is coming in the future, or might not adopt knowing it's not coming ever or for a very long time.

thebiglabasky commented 4 years ago

Hi sorry for the delay here. We're looking to add support for implementing Optimistic Concurrency Control patterns this quarter.

There's a general post giving a bit more details here and on our blog.

We're heads down in stabilizing migrate which takes quite some effort off advancing some of these topics, but we are looking at having solutions to problems like what @thekevinbrown mentioned.

thekevinbrown commented 4 years ago

I’ve read that proposal and don’t see how I could make that work across our use cases.

Why does Prisma not just give us control over what connection the statements run on and let the database do what it does well already? I’m really confused why a whole new layer needs to be built for performance when support for the layer that’s been there for decades hasn’t been implemented yet.

hugbubby commented 3 years ago

I have to believe the conversation here from Prisma devs is masking some internal underlying technical issue that they have yet to solve, and that their team doesn't believe what they're actually saying in this thread and on their blog. Otherwise, they have gone insane and decided to willingly make an ORM that, for all practical purposes, cannot be used to safely read and write from databases. The suggestion that your users ought to reinvent database transactions on an application level, or replace them with shared mutexes, is patently crazy, and I wish you would instead be honest with us. It'd be funny to read the pretzel logic involved in defending this position if it weren't so infuriating.

sorenbs commented 3 years ago

@thekevinbrown,

For us the use case is we need to be able to read data, make decisions based on that data, then decide what to write. This needs to happen in a particular isolation level. Consider something like a purchase:

If you can encode the "make decisions" part in SQL, then this is already possible with prisma.executeRaw. We want to enable this use case without having to write raw SQL by enabling you to reference values from previous statements in a transaction. It will be a long time before we get to this.

If the "make decisions" part has to involve application code or even external service calls, then you can implement this with Prisma today using the Optimistic Concurrency pattern. The general idea is that you prepare the write but make it conditional of the underlaying data not having changed. Here is an example of how to do that with prisma.executeRaw. We plan to make this possible without having to writ raw sql as well. This is likely to happen much sooner.

I understand that this particular approach does not work in all cases, but it does enable you to implement your particular use case with Prisma today. I hope this helps.

thekevinbrown commented 3 years ago

@sorenbs, would technically work in some cases, but genuinely won't work for us at scale.

We've sometimes got a bunch of completely different writes that may or may not need to happen depending on all sorts of values across different functions that should work regardless of if they're called in a transaction context or not. We don't want to bind our business logic to isolation levels if possible. Doing it the way you're suggesting, we'd have to really tightly couple all of that business logic in SQL instead of being able to just go:

begin transaction
call function
  fetch some stuff
  if (condition) write stuff
fetch summary
if (another condition) write stuff
call function with args
  write stuff
commit transaction

And importantly have those functions continue to work outside of a need for a transaction as well. So basically, yes, in theory, but it'd make it much harder to maintain the code. I'll grant that it'd perform better your way, but most of these areas are not performance bottlenecks anyway, and speed of developers maintaining the code is more important than shaving a few ms off of an invocation on a non-bottleneck for us.

Sytten commented 3 years ago

@hugbubby Some of what you say has a sense of truth to it. Most people in this thread don't bother understanding how prisma (or DBMS for that matter) really work before starting to use the technology. Prisma 2 inherits from prisma 1 the client/server architecture. This means every time you make a request it actually first serializes to a graphql-ish requests, sends it to a socket (unless using napi direct bindings), interprets it, fetch a connection from the pool and then send the sql request (inverse path for the response). Implementing long running transactions in that context is very hard to do reliably. Do you do independent requests? If so, how do you link them together? What if the JS side never sends the commit, will you keep a connection stuck forever? If you put a timeout, how long is it? If you keep the http connnection alive, how do you send new requests?

@thekevinbrown I agree with you on code maintainability, I had to write plpgsql recently due to not having read, decide, write in prisma and it was painful. But I dont agree that long running transaction are not a bottleneck, they are because they keep a DB connection busy. Lets say your database has a 100 connection total (recommended for 4gb of RAM for postgres), if you have a simple api and a worker with 3 instances each you cant really give more than 10 connections per instance. If you use long running transactions with slow logic (ie making network calls) just a bit too much, it is very easy to exhaust your connection pool. This is mostly apparent in outages where the network calls could take several seconds causing a cascading effect on your whole service instead of just that one operation.

So in general prisma is right to push people to avoid long running transactions, but it should probably allow it too since not all databases are scriptable nor are people willing to write that logic in those sql like languages.

sorenbs commented 3 years ago

Thank you @thekevinbrown

There is some discussion going on about what an api for fully flexible transactions in Prisma Client could look like over in this other issue: https://github.com/prisma/prisma/issues/1844#issuecomment-842158100

I'd love to get your thoughts captured the as well.

thekevinbrown commented 3 years ago

@Sytten I agree that it takes up a connection for longer, and that is a bottleneck. What I'm saying is: Of the tradeoffs that we have, we'd rather take up a connection for longer so our business logic is easier to maintain and is decoupled from how the database works than couple it and make it less maintainable to solve a performance problem which doesn't cause us any pain today.

We don't currently have problems exhausting the connection pool using these long running transactions with another ORM, which is what I meant when I said "not a bottleneck for us".

doughoff commented 3 years ago

Hi, just to show you guys some use cases that are a little edgy. Can be done by using the nested write but can cause issues with some stock calculation that need to be performed.

I have a use case for my customers where they need to have stock separated by storage and readily available. A running average of stock price is also required.

So i have

Product(stock_price), stock (warehouse_id, amount, batch, stock_price). When an buy order is completed, i need to update the order, the product stock column and the stock table all at the same time. The problem is that without serializable isolation level i can end up with some nasty nasty problems.

This can all be done using nested writes, but isolation of operation is not enforced. So if two buy orders are processed proximate times, an they have some product in common, problems happen.

I was reading the blog post about transactions and "thinking in better design" but i cannot find something that will maintain this working. (Yeah, sure i can do all those data running_avg stocks and etc on a "Calculated" field, but this is not fast when you have too much products and orders).

If someone could point an error of my thinking, i am very happy to listen what i'm doing wrong and what could be improved.

Also i was thinking on creation of functions on database where those transactions would run, then call this from the client, but manage versioning of database code is really bad, if i would need to update 30 instances of the database (Clients databases are isolated) and some problems of procedures/functions versioning happen. man that's not fun.

edit1: Im not the best writer tldr; Isolation level is needed even when a nested write is possible.

matthewmueller commented 3 years ago

All of the features linked have been implemented. Closing!

We have a separate issue on setting the Isolation Level.

kasir-barati commented 2 years ago

Any thought on doing this @brafdlog: Returning $transaction to the user and pass it to the API interface as a argument. Something like this:

const transactionObj = prisma.createTransaction();
const user = await prisma.user.create({
  data: {
    email: 'elsa@prisma.io',
    name: 'Elsa Prisma',
  },
  { transactions: transactionObj }
})
const temp = prisma.temp.update({ where: { id: 1}, data: {f1: 1},  { transactions: transactionObj }})

prisma / prisma

Stronger support for transaction management #5002

Links

Original Issue