mganss / SyncChanges

Synchronize/Replicate database changes using SQL Server Change Tracking
MIT License
124 stars 49 forks source link

Transactions + foreign key constraints #230

Closed ValtsS closed 1 month ago

ValtsS commented 2 months ago

Recently I ran into an issue where the current logic of the foreign key constraint check enable/disable is not sufficient.

Why this happens - ComputeForeignKeyConstraintsToDisable checks only for changes somewhere in the future and does the workaround by disabling constraints and re-enabling them afterwards. To solve this issue in my fork I turn off all the keys involved in the changes - this is not a great approach, because in my case there are a lot of them.

What would be the correct approach - group the changes together by createVersion & version (thus they are in a single transaction). Then do a topological sort (graph of FKs) to reorder changes correctly, taking into account is it DELETE or INSERT or UPDATE. I struggle to come up with such logic at this time.

mganss commented 2 months ago

Thanks for bringing this up. Can you describe the issue in more detail where the current logic is insufficient? And perhaps give an example?

ValtsS commented 2 months ago

Thanks for getting back to me. So here is the artificial simplest example:

  1. create a database with change tracking enabled
  2. create tables, add one foreign key:
CREATE TABLE Product (id INT PRIMARY KEY)

CREATE TABLE Orders (
    id INT identity(1, 1) PRIMARY KEY
    ,productId INT NOT NULL
    ,CONSTRAINT FK_ProductId FOREIGN KEY (productId) REFERENCES Product(Id)
    )
GO

ALTER TABLE Product ENABLE CHANGE_TRACKING
    WITH (TRACK_COLUMNS_UPDATED = OFF);

ALTER TABLE Orders ENABLE CHANGE_TRACKING
    WITH (TRACK_COLUMNS_UPDATED = OFF);
GO
  1. Make two inserts, wrap them into transaction to make it atomic:
    
    BEGIN TRANSACTION

INSERT INTO Product (id) VALUES (1)

INSERT INTO Orders (productId) VALUES (1)

COMMIT


4. Query the changes for both tables - same query as program uses:

SELECT c.SYS_CHANGE_OPERATION ,c.SYS_CHANGE_VERSION ,c.SYS_CHANGE_CREATION_VERSION ,c.[Id] FROM CHANGETABLE(CHANGES [dbo].Product, 0) c LEFT OUTER JOIN [dbo].Product t ON c.[Id] = t.[Id] ORDER BY coalesce(c.SYS_CHANGE_CREATION_VERSION, c.SYS_CHANGE_VERSION) ,c.SYS_CHANGE_OPERATION

SELECT c.SYS_CHANGE_OPERATION ,c.SYS_CHANGE_VERSION ,c.SYS_CHANGE_CREATION_VERSION ,c.[Id] ,t.productId FROM CHANGETABLE(CHANGES [dbo].Orders, 0) c LEFT OUTER JOIN [dbo].Orders t ON c.[Id] = t.[Id] ORDER BY coalesce(c.SYS_CHANGE_CREATION_VERSION, c.SYS_CHANGE_VERSION) ,c.SYS_CHANGE_OPERATION


Results:
![image](https://github.com/user-attachments/assets/31b221cb-80ab-4e68-b3d9-0ea568600f93)

5. Notice the create versions are the same and equal 1, same for the change version IDs.

When we go into https://github.com/mganss/SyncChanges/blob/334c79173372483e039a87b5c89b9bd89661527c/SyncChanges/Synchronizer.cs#L513 it will sort by the creation version and will do nothing about it since change version and creation version are equal. 
And since https://github.com/mganss/SyncChanges/blob/334c79173372483e039a87b5c89b9bd89661527c/SyncChanges/Synchronizer.cs#L504C17-L504C100 sorts by Table.Name - then only table names that determines will the inserts work or fail with

System.Data.SqlClient.SqlException (0x80131904): The INSERT statement conflicted with the FOREIGN KEY constraint "....".



It will get more complex when there are UPDATES or DELETES involved.
mganss commented 1 month ago

Thanks a lot for providing the example.

I think the fix you propose wouldn't work, unfortunately, because for some cases there does not exist an order of changes that would satisfy the foreign key constraints at every step. This is because some information about the original insert value is lost, see the explanation here: https://github.com/mganss/SyncChanges?tab=readme-ov-file#foreign-key-constraints

My current idea is to disable all foreign key constraints that are involved in changes that occur during the same transaction, i.e. where creation version and version are equal. Do you think there are cases where this would not suffice?

Also, my understanding is that the issue is only with inserts and subsequent updates to the same row. The updates do not show up as updates per se but are merged into the initial insert but with the value of the last update. Do you have an example where deletes or updates would also create issues?

It's a shame that SQL Server doesn't support deferred constraints, esp. considering SQL Server's own change tracking feature creates this issue that would require them: https://stackoverflow.com/a/5974847/1970064

ValtsS commented 1 month ago

Thank you, I don't know this deep enough to see all the consequences.

My current idea is to disable all foreign key constraints that are involved in changes that occur during the same transaction, i.e. where creation version and version are equal. Do you think there are cases where this would not suffice?

Sounds like it could work. I am dealing with very active databases with sometimes a lot of foreign keys - so I went for simpler solution - calculate involved keys for all the changes (but don't check actual values). Then disable all of them and re-enable at the end.

I don't have any other examples I am afraid, since I needed to solve the issue asap, so I took shortcuts. https://github.com/ValtsS/SyncChanges/blob/64c2934fd227adb943d78a17109e522e5c169bb0/SyncChanges/Synchronizer.cs#L643

P.S. Also, my fork requires snapshot support, because I think without it there are chances of data inconsistency otherwise. P.P.S. also I recently added distributed sql lock for peace of mind.

mganss commented 1 month ago

OK thanks for your input. I'm gonna go with the idea of disabling foreign key constraints for all changes in a transaction for now.