Merge and casting in dynamic queries

half-metal commented 1 year ago

I was attempting to verify I could do a merge query with 'postgres'

const insertedBabies = await sql`
MERGE INTO baby.babies b
 USING (VALUES ${sql(babies)}) as s (baby_id, birthdate, name)
 ON b.id = s.baby_id::int
 WHEN NOT MATCHED THEN
   INSERT (name, birthdate)
   VALUES (s.name, s.birthdate)
`

and tried several variations, and came across another issue, I can't merely cast birthdate like this s (baby_id, birthdate::date, name). Debug shows something like this


MERGE INTO baby.babies b
 USING (VALUES ($1,$2,$3)) as s (baby_id, birthdate, name)
 ON b.id = s.baby_id::int
 WHEN NOT MATCHED THEN
   INSERT (name, birthdate)
   VALUES (s.name, s.birthdate)
 [
  { baby_id: 68, birthdate: null, name: 'w1', role: 'admin' },
  { name: 'w2', birthdate: 2023-04-09T21:41:00.000Z },
  { name: 'w3', birthdate: 2023-04-09T21:41:00.000Z }
] [ 0, 0, 0 ]

So I guess two issues 1) How do you do multiple dynamic merge in postgres 2) How do you cast a certain group of columns doing an operation like this? Which I think relates to this issue https://github.com/porsager/postgres/issues/539

half-metal commented 1 year ago

I tried this

await sql`
MERGE INTO baby.babies AS b
USING (VALUES ${sql(babies,'id','birthdate','name')}) AS s (id,birthdate, name)
ON s.id::int = b.id::int
WHEN MATCHED THEN
    UPDATE SET birthdate = s.birthdate::date, name = s.name
WHEN NOT MATCHED THEN 
    INSERT ${sql(babies,'birthdate', 'name')}
  `
.values();

but that errors out at the comma before 'birthdate'

This below works, but has an issue

await sql`
MERGE INTO baby.babies AS b
USING (VALUES ${sql(babies,'id','birthdate','name')}) AS s (id,birthdate, name)
ON s.id::int = b.id::int
WHEN MATCHED THEN
    UPDATE SET birthdate = s.birthdate::date, name = s.name
WHEN NOT MATCHED THEN 
    INSERT (birthdate, name) VALUES (s.birthdate::date, s.name)
  `
.values();

since this translates to empty fields

MERGE INTO baby.babies AS b
USING (VALUES ($1,$2,$3)) AS s (id,birthdate, name)
ON s.id::int = b.id::int
WHEN MATCHED THEN
    UPDATE SET birthdate = s.birthdate::date, name = s.name
WHEN NOT MATCHED THEN 
        INSERT (birthdate, name) VALUES (s.birthdate::date, s.name)
   [ null, null, null ] [ 0, 0, 0 ]

Which is close! But I am guessing something in postgres has to translate this correctly.

porsager commented 1 year ago

I haven't used MERGE much yet, mostly because it doesn't support returning the modified rows :-(

Could you maybe use on conflict? Even so. Is this an issue with this library or just the query?

half-metal commented 1 year ago

@porsager I could use On Conflict, but I was reading there would be better performance in Merge for Postgres and also chance for data corruption with On Conflict, especially with several operations. And really, to me makes the statement more readable and standard SQL language.

This is what the fine internets told me:

MERGE and INSERT ON CONFLICT are both SQL statements that can be used to insert or update rows in a table, depending on whether the row already exists. However, there are some key differences between the two statements, which may make MERGE a better choice in certain situations.

MERGE supports more complex logic. MERGE can be used to insert or update rows based on multiple conditions, while INSERT ON CONFLICT can only be used to insert or update rows based on a single condition. For example, MERGE can be used to insert a new row if the row does not exist, or to update an existing row if the row does exist and the value of a certain column has changed.
MERGE is more efficient in some cases. MERGE can be more efficient than INSERT ON CONFLICT in some cases, especially if the table has a large number of rows. This is because MERGE only needs to scan the table once, while INSERT ON CONFLICT may need to scan the table twice (once to check for the existence of the row, and again to perform the update).
MERGE is more reliable in some cases. MERGE is more reliable than INSERT ON CONFLICT in some cases, especially if the table is being modified by multiple concurrent transactions. This is because MERGE uses MVCC (multiversion concurrency control), which ensures that each transaction sees a consistent view of the data. INSERT ON CONFLICT, on the other hand, does not use MVCC, so it is possible for two concurrent transactions to both try to insert the same row, which can lead to data corruption.
Overall, MERGE is a more powerful and flexible statement than INSERT ON CONFLICT. However, INSERT ON CONFLICT may be a better choice in some cases, such as when simplicity or reliability are more important than flexibility.

Here are some additional considerations when choosing between MERGE and INSERT ON CONFLICT:

The type of data you are working with. MERGE is generally better suited for working with data that is frequently updated, while INSERT ON CONFLICT may be a better choice for working with data that is less frequently updated.
The size of the table. MERGE is more efficient than INSERT ON CONFLICT for large tables.
The level of concurrency. MERGE is more reliable than INSERT ON CONFLICT in environments with a high level of concurrency.
Ultimately, the best way to decide which statement to use is to consider your specific requirements and use case.

pjpimentel commented 5 months ago

hello @half-metal and @porsager

i have faced the same issue today (more than one year later 😃)...

looks there some kind of issue/feature in the query builder when using a object and specifying some keys, so i had to transform the array of objects to an array of arrays before using it inside the query.

i do not know if you already found a fix... but for me, the following code works:

const data = [{id: '1', name: '2', color: '3'},{id: '4', name: '5', color: '6'}];
const updated_at_or_created_at = (new Date()).toISOString();
const insert_values = data.map((c) => [c.id, c.name, c.color]);

await sql_transaction`
    MERGE INTO public.your_table
    USING (VALUES ${
      sql_transaction(insert_values)
    })
    AS input (id, name, color)
    ON input.id = public.your_table.id
    WHEN MATCHED
      THEN UPDATE SET
        name = input.name,
        color = input.color,
        updated_at = ${updated_at_or_created_at}
    WHEN NOT MATCHED
      THEN
        INSERT (
          created_at,
          id,
          name,
          color
        ) VALUES(
          ${updated_at_or_created_at},
          input.id,
          input.name,
          input.color
        )
  `;

porsager / postgres

Merge and casting in dynamic queries #584