zzzprojects / Dapper-Plus

Dapper Plus - High-Efficient Bulk Actions (Insert, Update, Delete, and Merge) for .NET
https://dapper-plus.net/
380 stars 84 forks source link

BulkMerge + DELETE? #110

Closed VictorioBerra closed 2 years ago

VictorioBerra commented 2 years ago

MS SQL Server MERGE statement can apparently handle deletes https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-ver15#syntax

How can I INSERT, UPDATE and DELETE in one statement?

For example, given this list of Customer objects I have in C#:

var customersToMerge = new List<Customer> { new Customer{ Id = 1, name = "Bob"}, new Customer{ Id = 2, name = "Alice"}}

Some API like BulkMergeWithDelete(customersToMerge) would have this outcome on the table data:

Customer table in SQL Server

Id Name Outcome
1 Bob NO ACTION
2 Sandy NAME CHANGED TO Alice
3 Dave DELETED

Is there some way in code to handle this?

JonathanMagnan commented 2 years ago

Hello @VictorioBerra ,

Thank you for reporting, I will need to look at why we didn't have done it this way for the BulkSynchronize.

We had a reason at the time we made this feature, so we will need to investigate it to check if we can do your case for Dave.

Best Regards,

Jon

VictorioBerra commented 2 years ago

@JonathanMagnan thanks for the quick reply. I did not know the BulkSynchronize existed. Is there documentation for this?

JonathanMagnan commented 2 years ago

Hello @VictorioBerra ,

No, there is currently no documentation. This is for Entity Framework, but it's essentially the same: https://entityframework-extensions.net/bulk-synchronize

However, the BulkSynchronize do it in 2 phase:

My developer is currently looking what was our limitation about why we didn't perfect it all in the MERGE statement.

VictorioBerra commented 2 years ago

@JonathanMagnan for the record it is nice that I have the option to do the merge that won't delete. If you guys change this it might be nice to have a flag to use the new behavior. For example when I do partial updates I may want to preserve existing data instead of deleting it.

JonathanMagnan commented 2 years ago

Hello @VictorioBerra ,

BulkSynchronize

One reason why we didn't use this syntax in the BulkSynchronize is that simply doesn't work with a BatchSize.

The WHEN NOT MATCHED BY SOURCE THEN statement will delete more than it should as the MATCHED rows only contain those for the current batch and not from a subsequent batch. So the rows for the second batch will be deleted, then re-inserted when executing the second batch which is really bad, it even worse as the second batch will delete rows inserted/updated from the first batch.

So doing it one statement only works if we BulkSynchronize the whole source and not when we split it. So that explains why we didn't have done it this way with the BulkSynchronize.

However, could be a good idea for optimization on our side later if there is only 1 batch to execute and not many.

BulkMerge

We could add an option for this method, but do you really need it or it was only a suggestion?

For around 99,99% of scenarios, people are looking for BulkSynchronize when they also need to DELETE.

Having a scenario that you need to update for some row, and delete for some other row that is MATCHED is very rare.

Honestly, if we could avoid doing this request, I would rather do this than add some complexity to our template for a scenario that maybe will never be used.

Best Regards,

Jon

VictorioBerra commented 2 years ago

@JonathanMagnan Lets stick with the current functionality, and make no changes at all.

When I initially created this issue, I did not even know the BulkSynchronize() method existed, so I fear I may have kicked off a wild goose chase. I think maybe its good that we now have an issue that documents why these decisions were made, and how the inner workings of the BulkSynchronize() work in regards to BatchSize.