hail-is / hail

Cloud-native genomic dataframes and batch computing
https://hail.is
MIT License
965 stars 241 forks source link

[batch] Three-transaction deadlock suspected when deactivating instances #14380

Open daniel-goldstein opened 6 months ago

daniel-goldstein commented 6 months ago

What happened?

Most stored procedures take either a shared or exclusive lock on a relevant row of the jobs table near the start of the procedure, but not all. This appears to interact poorly with the attempts_after_update trigger as it attempts to take an exclusive lock on rows in the jobs table in the below join with the attempt resources tables. It's not clear exactly what the right fix is. It should be simple enough not to join on the jobs table in the FOR UPDATE, but we should also evaluate when in our various transactions a lock should be taken on the jobs table and whether it should be an X or S lock.

Version

0.2.128

Relevant log output

------------------------
LATEST DETECTED DEADLOCK
------------------------
2024-02-29 15:07:05 140331971655424
*** (1) TRANSACTION:
TRANSACTION 2486515, ACTIVE 0 sec inserting
mysql tables in use 27, locked 27
LOCK WAIT 16 lock struct(s), heap size 1128, 9 row lock(s), undo log entries 1
MySQL thread id 703, OS thread handle 140330830395136, query id 4745489 10.32.3.39 dgoldste-batch-user executing
INSERT INTO aggregated_billing_project_user_resources_v3 (billing_project, user, resource_id, token, `usage`)
    SELECT cur_billing_project, cur_user,
      attempt_resources.deduped_resource_id,
      rand_token,
      msec_diff_rollup * quantity
    FROM attempt_resources
    WHERE attempt_resources.batch_id = NEW.batch_id AND attempt_resources.job_id = NEW.job_id AND attempt_id = NEW.attempt_id
    FOR UPDATE
    ON DUPLICATE KEY UPDATE `usage` = aggregated_billing_project_user_resources_v3.`usage` + msec_diff_rollup * quantity

*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 351 page no 4 n bits 248 index PRIMARY of table `dgoldste-batch`.`instances_free_cores_mcpu` trx id 2486515 lock_mode X locks rec but not gap
Record lock, heap no 176 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 30; hex 62617463682d776f726b65722d64676f6c647374652d7374616e64617264; asc batch-worker-dgoldste-standard; (total 36 bytes);
 1: len 6; hex 00000025f0ee; asc    %  ;;
 2: len 7; hex 01000000a90a9b; asc        ;;
 3: len 4; hex 80000fa0; asc     ;;

*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 376 page no 8 n bits 408 index PRIMARY of table `dgoldste-batch`.`aggregated_billing_project_user_resources_v3` trx id 2486515 lock_mode X locks rec but not gap waiting
Record lock, heap no 228 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 74657374; asc test;;
 1: len 8; hex 64676f6c64737465; asc dgoldste;;
 2: len 4; hex 80000009; asc     ;;
 3: len 4; hex 80000034; asc    4;;
 4: len 6; hex 00000025f0cd; asc    %  ;;
 5: len 7; hex 810000021b01cd; asc        ;;
 6: len 8; hex 80000000001b09e0; asc         ;;

*** (2) TRANSACTION:
TRANSACTION 2486477, ACTIVE 0 sec starting index read
mysql tables in use 27, locked 27
LOCK WAIT 47 lock struct(s), heap size 8312, 215 row lock(s), undo log entries 211
MySQL thread id 682, OS thread handle 140330866251520, query id 4746389 10.32.3.39 dgoldste-batch-user executing
INSERT INTO aggregated_job_group_resources_v3 (batch_id, job_group_id, resource_id, token, `usage`)
    SELECT attempt_resources.batch_id,
      job_group_self_and_ancestors.ancestor_id,
      attempt_resources.deduped_resource_id,
       NAME_CONST('rand_token',189),
       NAME_CONST('msec_diff_rollup',1671) * quantity
    FROM attempt_resources
    LEFT JOIN jobs ON attempt_resources.batch_id = jobs.batch_id AND attempt_resources.job_id = jobs.job_id
    LEFT JOIN job_group_self_and_ancestors ON jobs.batch_id = job_group_self_and_ancestors.batch_id AND jobs.job_group_id = job_group_self_and_ancestors.job_group_id
    WHERE attempt_resources.batch_id = NEW.batch_id AND attempt_resources.job_id = NEW.job_id AND attempt_resources.attempt_id = NEW.attempt_id
    FOR UPDATE
    ON DUPLICATE KEY UPDATE `usage` = aggregated_job_group_resources_v3.`usage` +  NAME_CONST('msec_diff_rollup',1671) * quanti

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 376 page no 8 n bits 408 index PRIMARY of table `dgoldste-batch`.`aggregated_billing_project_user_resources_v3` trx id 2486477 lock_mode X locks rec but not gap
Record lock, heap no 228 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 74657374; asc test;;
 1: len 8; hex 64676f6c64737465; asc dgoldste;;
 2: len 4; hex 80000009; asc     ;;
 3: len 4; hex 80000034; asc    4;;
 4: len 6; hex 00000025f0cd; asc    %  ;;
 5: len 7; hex 810000021b01cd; asc        ;;
 6: len 8; hex 80000000001b09e0; asc         ;;

Record lock, heap no 249 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 74657374; asc test;;
 1: len 8; hex 64676f6c64737465; asc dgoldste;;
 2: len 4; hex 80000009; asc     ;;
 3: len 4; hex 800000ad; asc     ;;
 4: len 6; hex 00000025f0cd; asc    %  ;;
 5: len 7; hex 010000008c050d; asc        ;;
 6: len 8; hex 8000000000471350; asc      G P;;

Record lock, heap no 266 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 74657374; asc test;;
 1: len 8; hex 64676f6c64737465; asc dgoldste;;
 2: len 4; hex 80000009; asc     ;;
 3: len 4; hex 8000008b; asc     ;;
 4: len 6; hex 00000025f0cd; asc    %  ;;
 5: len 7; hex 010000008c0bd1; asc        ;;
 6: len 8; hex 80000000004c0108; asc      L  ;;

Record lock, heap no 273 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 74657374; asc test;;
 1: len 8; hex 64676f6c64737465; asc dgoldste;;
 2: len 4; hex 8000000a; asc     ;;
 3: len 4; hex 80000038; asc    8;;
 4: len 6; hex 00000025f0cd; asc    %  ;;
 5: len 7; hex 010000008c17e1; asc        ;;
 6: len 8; hex 8000000000370dc0; asc      7  ;;

Record lock, heap no 284 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 74657374; asc test;;
 1: len 8; hex 64676f6c64737465; asc dgoldste;;
 2: len 4; hex 80000009; asc     ;;
 3: len 4; hex 800000bd; asc     ;;
 4: len 6; hex 00000025f0cd; asc    %  ;;
 5: len 7; hex 010000008c204f; asc       O;;
 6: len 8; hex 8000000000427f70; asc      B p;;

Record lock, heap no 288 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 74657374; asc test;;
 1: len 8; hex 64676f6c64737465; asc dgoldste;;
 2: len 4; hex 80000009; asc     ;;
 3: len 4; hex 80000038; asc    8;;
 4: len 6; hex 00000025f0cd; asc    %  ;;
 5: len 7; hex 010000008c17a6; asc        ;;
 6: len 8; hex 8000000000370dc0; asc      7  ;;

Record lock, heap no 316 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 74657374; asc test;;
 1: len 8; hex 64676f6c64737465; asc dgoldste;;
 2: len 4; hex 80000009; asc     ;;
 3: len 4; hex 8000008f; asc     ;;
 4: len 6; hex 00000025f0cd; asc    %  ;;
 5: len 7; hex 010000008c10d0; asc        ;;
 6: len 8; hex 800000000030c078; asc      0 x;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 312 page no 896 n bits 168 index PRIMARY of table `dgoldste-batch`.`jobs` trx id 2486477 lock_mode X locks rec but not gap waiting
Record lock, heap no 27 PHYSICAL RECORD: n_fields 18; compact format; info bits 128
 0: len 8; hex 8000000000000095; asc         ;;
 1: len 4; hex 80002156; asc   !V;;
 2: len 6; hex 00000025eff8; asc    %  ;;
 3: len 7; hex 020000018e1add; asc        ;;
 4: len 7; hex 52756e6e696e67; asc Running;;
 5: len 30; hex 5b5b5b2264676f6c64737465222c202264676f6c647374652d6773612d6b; asc [[["dgoldste", "dgoldste-gsa-k; (total 69 bytes);
 6: len 1; hex 80; asc  ;;
 7: len 4; hex 800003e8; asc     ;;
 8: SQL NULL;
 9: len 4; hex 80000000; asc     ;;
 10: len 1; hex 80; asc  ;;
 11: len 8; hex 8000000000000000; asc         ;;
 12: len 6; hex 395664676a4c; asc 9VdgjL;;
 13: len 8; hex 7374616e64617264; asc standard;;
 14: len 4; hex 80000001; asc     ;;
 15: len 4; hex 80000001; asc     ;;
 16: len 8; hex 8000000000000004; asc         ;;
 17: SQL DEFAULT;

*** (3) TRANSACTION:
TRANSACTION 2486516, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 6 lock struct(s), heap size 1128, 3 row lock(s)
MySQL thread id 519, OS thread handle 140330967815936, query id 4742236 10.32.3.39 dgoldste-batch-user statistics
SELECT 1 INTO dummy_lock FROM instances_free_cores_mcpu
    WHERE instances_free_cores_mcpu.name = in_instance_name
    FOR UPDATE

*** (3) HOLDS THE LOCK(S):
RECORD LOCKS space id 312 page no 896 n bits 168 index PRIMARY of table `dgoldste-batch`.`jobs` trx id 2486516 lock_mode X locks rec but not gap
Record lock, heap no 27 PHYSICAL RECORD: n_fields 18; compact format; info bits 128
 0: len 8; hex 8000000000000095; asc         ;;
 1: len 4; hex 80002156; asc   !V;;
 2: len 6; hex 00000025eff8; asc    %  ;;
 3: len 7; hex 020000018e1add; asc        ;;
 4: len 7; hex 52756e6e696e67; asc Running;;
 5: len 30; hex 5b5b5b2264676f6c64737465222c202264676f6c647374652d6773612d6b; asc [[["dgoldste", "dgoldste-gsa-k; (total 69 bytes);
 6: len 1; hex 80; asc  ;;
 7: len 4; hex 800003e8; asc     ;;
 8: SQL NULL;
 9: len 4; hex 80000000; asc     ;;
 10: len 1; hex 80; asc  ;;
 11: len 8; hex 8000000000000000; asc         ;;
 12: len 6; hex 395664676a4c; asc 9VdgjL;;
 13: len 8; hex 7374616e64617264; asc standard;;
 14: len 4; hex 80000001; asc     ;;
 15: len 4; hex 80000001; asc     ;;
 16: len 8; hex 8000000000000004; asc         ;;
 17: SQL DEFAULT;

*** (3) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 351 page no 4 n bits 248 index PRIMARY of table `dgoldste-batch`.`instances_free_cores_mcpu` trx id 2486516 lock_mode X locks rec but not gap waiting
Record lock, heap no 176 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 30; hex 62617463682d776f726b65722d64676f6c647374652d7374616e64617264; asc batch-worker-dgoldste-standard; (total 36 bytes);
 1: len 6; hex 00000025f0ee; asc    %  ;;
 2: len 7; hex 01000000a90a9b; asc        ;;
 3: len 4; hex 80000fa0; asc     ;;

*** WE ROLL BACK TRANSACTION (3)
chrisvittal commented 6 months ago

From triage discussion.

What this means: Instance destruction is slower than it needs to be, this can impact throughput but does not really impact reliability, and is uncommon enough to not be high priority at this time.