kagkarlsson / db-scheduler

Persistent cluster-friendly scheduler for Java
Apache License 2.0
1.23k stars 188 forks source link

Table schema primary key #527

Open m7yJpNXY opened 3 weeks ago

m7yJpNXY commented 3 weeks ago

Some input from a DBA (not me) regarding the table schema (mssql) that could perhaps be of interest.

Recommendation based on table analysis

Based on an analysis of the cardinality of the table columns and index definitions for the table dbo.scheduled_tasks I can see the following from the DBA perspective:

JonasNjopOlsson commented 3 weeks ago

The DBA in question would be me so if anything seems unclear regarding the recommendation, please let me know. We are seeing a high frequency of deadlocks in the Microsoft SQL Server environment due to the above issue with the table and index definitions.

kagkarlsson commented 3 weeks ago

Thank you for posting your observation. The SQL Server deadlock-issue has popped up before, and I think what is needed is this type of effort by actual users of the database. Surprising behavior to me, but not sure on where the actual deadlock occurs since I am more familiar with Postgres, where this is not a problem.

I agree with you that it would be slightly better to reverse the order of the fields in the primary key. Possibly significantly better for Sql Server (depending on how index-locking actually works there)?? Did you try creating an alternative index and see if that affected the number of deadlocks?

JonasNjopOlsson commented 3 weeks ago

@kagkarlsson: The problem we are seeing is when several concurrent SQL sessions are trying to update the same table with similar code. Examples below:

-- Session #1
--------------------------------------------------------------------------------------------------
-- Arguments: @P0 bit,@P1 nvarchar(4000),@P2 datetime2,@P3 bit,@P4 nvarchar(4000),@P5 nvarchar(4000),@P6 bigint
UPDATE scheduled_tasks
SET
    picked = @P0
    ,picked_by = @P1
    ,last_heartbeat = @P2
    ,version = version + 1
WHERE
    picked            = @P3
    AND task_name     = @P4
    AND task_instance = @P5
    AND version       = @P6;

-- Session #2
--------------------------------------------------------------------------------------------------
-- Arguments: @P0 bit,@P1 nvarchar(4000),@P2 datetime2,@P3 datetime2,@P4 datetime2,@P5 int,@P6 datetime2,@P7 nvarchar(4000),@P8 nvarchar(4000),@P9 bigint
UPDATE scheduled_tasks
SET
    picked = @P0
    ,picked_by = @P1
    ,last_heartbeat = @P2
    ,last_success = @P3
    ,last_failure = @P4
    ,consecutive_failures = @P5
    ,execution_time = @P6
    ,version = version + 1
WHERE
    task_name         = @P7
    AND task_instance = @P8
    AND version       = @P9;

Sessions which are running the two T-SQL statements above typically deadlock very frequently with each other in our SQL Server environment on the dbo.scheduled_tasks table.

I did not try changing the key column order in the index to alleviate the problem so far, since it is based on the PRIMARY KEY constraint of the table, but based on previous experience with similar issues it seems likely that the recommended approach would lead to much quicker access paths in the associated index to get to the right row for each UPDATE statement (fewer reads, shorter execution time, locks more likely to be taken on the specific row).

JonasNjopOlsson commented 2 weeks ago

@kagkarlsson @m7yJpNXY : We have now conducted internal tests using the same application code but with a changed table schema according to our suggestion above. We have not encountered a single deadlock in the database after the change was made and the jobs on the application side seem to be working better now. Our conclusion is that the recommendation works as expected and should be introduced into the main code base.