zzzprojects / Dapper-Plus

Dapper Plus - High-Efficient Bulk Actions (Insert, Update, Delete, and Merge) for .NET
https://dapper-plus.net/
380 stars 84 forks source link

PostgreSQL BulkMerge / BulkUpdate always returns all rows as update #140

Closed cupper1 closed 5 months ago

cupper1 commented 5 months ago

Description

This seems very simple but somehow I am lost, so sorry if this is simple!

I have a simple table 2 fields ID and FirstName. I have a json file that has 500 records of ID and Firstname, I use jsonconvert to pull it into list of objects.

I then do this:

try
            {
                string filecontents = File.ReadAllText(filenametopull);
                List<LARCObjects.Patient> patients = new List<LARCObjects.Patient> { };
                patients = JsonConvert.DeserializeObject<List<LARCObjects.Patient>>(filecontents);

                if (patients.Count > 0)
                {
                    List<AuditEntry> auditEntries = new List<AuditEntry>();
                    DapperPlusManager.Entity<LARCObjects.Patient>().Table("Patients2")
                                      //   .Identity(x => x.PatientID)
                                         .Key(x => x.PatientID)

                                         .UseBulkOptions(x =>
                                         {
                                             x.AuditEntries = auditEntries;
                                             x.UseAudit = true;
                                             x.InsertIfNotExists = true;

                                         });

                    atrun.totalrows = patients.Count;
                    using (var connection = new NpgsqlConnection(configuration["pconnect"]))
                    {

                        connection.BulkMerge(patients);
                        //connection.BulkUpdate(patients);

                    }
                }

on first run, all goes well, and when reviewing auditentries, it shows 500 rows all with "insert", so that makes sense. on second run, of exact same file, everything runs the same, no errors, but then when reviewing auditentries, I have 500 rows all with "Update", and when reviewing no difference in newvalue / oldvalue.

But there is an additional field added called "xmin" which seems an internal field used by postgres. in this field for every record, there is a slight difference (like oldvalue: 1403 , newvalue: 1502)

Is this the expected behavior? I would have expected 0 new rows from the auditentries as nothing has changed.

Further technical details

JonathanMagnan commented 5 months ago

Hello @cupper1 ,

The BulkMerge inserts data that doesn't exist and updates those that already exist.

So it's expected that 500 rows will be returned on the second run as 500 rows have been updated (already existing).

However, there is 2 small errors in your code:

Perhaps you were looking more for a BulkInsert with the InsertIfNotExists option?

NOTE: The Xmin is an internal column in PostgreSQL.

Let me know if I answered correctly your question

Best Regards,

Jon

cupper1 commented 5 months ago

Hi @JonathanMagnan Thanks for the response, a follow up: so, on the second run, where all the data is exactly the same, no data was actually updated, so I would have expected the audit log to show 0 entries, as nothing was changed.

Maybe I am going down the wrong path, and is best to explain the use case:

I get the file every day, it has 500 (or 15k) rows in it, almost all the data is exactly the same. Right now I query the database for each row, bring it back, check it against the row in the file and if there is a difference, I update it, if not I move the the next row.

I was expecting that is what this would do for me, and give me an audit log of any rows that have changed. Is this capable of that?

for example, 500 rows from the exact same file has no changes, so audit log would show 0 rows. tomorrow 1 row is updated with a new firstname, so audit log shows 1 rows, oldvalue / newvalue

if not, is there a way of filtering the audit log where oldvalue !=newvalue ?

Thank you again for the response, I am hoping this product will do this as its much faster than 1 by 1 :)

JonathanMagnan commented 5 months ago

Hello @cupper1 ,

If I understand your scenario, we indeed support it. You can use either the option MergeMatchedAndOneNotConditionExpression or IgnoreOnMergeMatchedAndOneNotConditionExpression to achieve this behavior. You can learn more about those option here (Documentation is about Entity Framework Extensions but it's the same behavior for Dapper Plus)

See the following online Fiddle: https://dotnetfiddle.net/3lyezo

DapperPlusManager.Entity<Patient>("MergeMatchedAndOneNot").Identity(x => x.PatientID)
    .UseBulkOptions(options => {
        // Use this line if you want to specify all properties. One of them must be different for the "Update" part to be executed
        //options.MergeMatchedAndOneNotConditionExpression = x => new { x.Name, x.Description };

        // Use this line if you want to include all properties by default (we specify the key as you can always skip this one, will never be different)
        options.IgnoreOnMergeMatchedAndOneNotConditionExpression = x => new { x.PatientID };
    });

You will notice on my Fiddle:

As you can see in the result:

Let me know if that's what you were looking for.

Best Regards,

Jon

cupper1 commented 5 months ago

Hi @JonathanMagnan

THANK YOU! yes this is (almost) exactly what I am looking for! and it works great, I just would like to ask one more question. In the example, the auditentries shows 1 row, but all fields still. So for instance if my schema had 10 columns, and only one changed, it returns all 10 columns with old/new fields. I am just curious if you have another magic option to only return the fields that changed instead?

Thank you again for this!

JonathanMagnan commented 5 months ago

Hello @cupper1 ,

You currently have the OldValue and NewValue.

So, for this one, you will have to make a filter on your side.

Something like this:

auditEntries.ForEach(x => x.Values = x.Values.Where(y => y.OldValue.ToString() != y.NewValue.ToString()).ToList());

Let me know if that works on your side.

Best Regards,

Jon

cupper1 commented 5 months ago

Thank you again @JonathanMagnan really appreciate you!