prisma / prisma

Next-generation ORM for Node.js & TypeScript | PostgreSQL, MySQL, MariaDB, SQL Server, SQLite, MongoDB and CockroachDB
https://www.prisma.io
Apache License 2.0
38.8k stars 1.52k forks source link

`upsertMany()` functionality for bulk upsert #10362

Open Ezard opened 2 years ago

Ezard commented 2 years ago

Problem

Upserting many records is inefficient. Currently, calling x upsert statements (e.g. as part of a single transaction) results in x SELECT statements followed by y INSERT statements and z UPDATE statements, where y + z = x.

Suggested solution

A more efficient flow would be to do a single bulk SELECT statement, followed by a bulk INSERT statement, followed by however many UPDATE statements are required.

e.g. to upsert 100 records, 50 of which already exist in the DB, Prisma will currently run a total of 100 SELECT statements, 50 INSERT statements and 50 UPDATE statements. With the described solution flow, this could be reduced to 1 SELECT statement, 1 INSERT statement and 50 UPDATE statements. This brings the total statement count down from 200 to 52.

Alternatives

An alternative would be for a developer to do this manually in each project that uses Prisma

Ezard commented 2 years ago

Just discovered that Postgres has the ability to update multiple rows with different values (using the UPDATE...FROM functionality), so this would drop the total number of statements down from 200 to 3 (obviously other databases that don't support this functionality would need to use the less-efficient solution that I originally described)

Akxe commented 2 years ago

@Ezard MySQL supports this too

zipzapzanigan commented 2 years ago

would love this feature. have to do it manually right now

ADanilychevJr commented 2 years ago

Would love this as well

ChrisCates commented 1 year ago

What do you guys need to actually make this happen?

taylor-lindores-reeves commented 1 year ago

What are the current solutions / workarounds?

Akxe commented 1 year ago

Not much... you have to write the SQL manually

ajhollowayvrm commented 1 year ago

Another bump on this issue. Currently doing raw multiple inserts with on conflict do update set

taylor-lindores-reeves commented 1 year ago

I got around this with a nested write. Both create and update within .upsert() call.

    return this.prisma.classified.upsert({
      where: { stockId: extendedStockId },
      update: {
        title,
        price,
        poa,
        status: classifiedStatusEnumMapper(status),
        liveDateTime: status === ClassifiedStatus.Live ? new Date() : null,
        Vehicle: {
          update: {
            ...vehicle,
            status: vehicleStatusEnumMapper(status),
            thumbnail: vehicle.mainImages[0],
          },
        },
      },
      create: {
        stockId: extendedStockId,
        title,
        price,
        poa,
        slug,
        uniqueId,
        classifiedType: ItemType.Vehicle,
        liveDateTime: status === ClassifiedStatus.Live ? new Date() : null,
        status: classifiedStatusEnumMapper(status),
        User: {
          connect: {
            uniqueId: dealerId,
          },
        },
        Vehicle: {
          create: {
            ...vehicle,
            uniqueId: generateUniqueId(year),
            status: vehicleStatusEnumMapper(status),
            thumbnail: mainImages[0],
            User: {
              connect: {
                uniqueId: dealerId,
              },
            },
          },
        },
      },
      include: {
        Vehicle: true,
        User: {
          select: {
            companyName: true,
            sellerType: true,
            slug: true,
          },
        },
      },
    });
ChrisCates commented 1 year ago

@leafyshark this implementation wouldn't work for me.

What's frustrating is the complete lack of responsiveness from the Prisma team. We all know they aren't interested in our PRs and will make us jump hoops, to then end up discarding the external PR. And they also don't seem to care for a few thousand USD if any of us try to throw a bone.

For context of how much time this code would save. Look at this Postgres query I had to write (with JS Interpolation).

        INSERT INTO public."Orderbook"
            (${keys.map((key) => `"${key}"`).join(',')})
        VALUES
            ${orders.map((order) => `
                (${keys.map((key) => {
                    if (typeof order[key] === 'string') {
                        return `"${order[key]}"`
                    } else {
                        return `${order[key]}`
                    }
                }).join(',')})
            `).join(',')}
        ON CONFLICT ("id") DO UPDATE SET
            "status" = EXCLUDED."status"

Individual transactions even with phantom writes. Is insanely unscalable and only works for 10,000 records max at a time. I am seriously considering moving back to Knex if the Prisma team doesn't come up with a proper response or at least is open to payment or PRs.

jhnns commented 1 year ago

Until this is implemented in Prisma, you can combine knex + $executeRawUnsafe to achieve it. Of course, this needs to be supported by your DB:

import * as knexModule from "knex";

const knex = knexModule.default.knex({
  // No connection settings necessary here because we only use the QueryBuilder
  client: "pg", // Using Postgres
});

const date = new Date().toISOString();

prismaClient.$executeRawUnsafe(
  knex("Client")
    .insert(
      clients.map((client): Prisma.ClientCreateManyInput => {
        return {
          ...client,
          createdAt: date,
          updatedAt: date,
        };
      })
    )
    .onConflict<keyof Prisma.ClientCreateManyInput>("number")
    .merge(["handle", "name", "archived", "updatedAt"] satisfies Array<
      keyof Prisma.ClientUpdateManyMutationInput
    >)
    .toString()
  );

It can be a little bit challenging to get the same type safety because Knex's types seem to be a little bit buggy. I decided to not add types to Knex itself, but to validate the arguments before passing them to Knex.

janpio commented 4 months ago

Related/duplicate: https://github.com/prisma/prisma/issues/4134