lobsters / lobsters-ansible

Ansible playbook for lobste.rs
ISC License
78 stars 25 forks source link

mariadb deadlocked occasionally #39

Open pushcx opened 5 years ago

pushcx commented 5 years ago

In the last couple weeks, every couple days I get an exception from a query failing with a complaint about a deadlock:

An ActiveRecord::Deadlocked occurred in comments#upvote:

  Mysql2::Error: Deadlock found when trying to get lock; try restarting transaction: UPDATE `stories` SET `hotness` = -19572.7194911 WHERE `stories`.`id` = 51235
  app/models/story.rb:608:in `recalculate_hotness!'

-------------------------------
Request:
-------------------------------

  * URL        : https://lobste.rs/comments/evgzzm/upvote
  * HTTP Method: POST
  * IP address : [elided]
  * Parameters : {"controller"=>"comments", "action"=>"upvote", "id"=>"evgzzm"}
  * Timestamp  : 2019-02-13 08:23:33 -0600
  * Server : lobsters
  * Rails root : /srv/lobste.rs/http
  * Process: 3547

(Headers, env vars, and traceback follow, but there's nothing interesting there.) This almost always happens on an story or comment upvote (the most common kind of write on this table), but I have seen a couple on creating a comment.

I poked around a bit and saw someone suggest running SHOW ENGINE INNODB STATUS;, so here's that output:

MariaDB [(none)]> SHOW ENGINE INNODB STATUS;
[trimmed wide header line at top and bottom]
| InnoDB |      | 
=====================================
2019-02-14 02:50:38 7fafde560700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 56 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 64483 srv_active, 0 srv_shutdown, 7042 srv_idle
srv_master_thread log flush and writes: 71519
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 661850
OS WAIT ARRAY INFO: signal count 2360986
Mutex spin waits 2387954, rounds 8308531, OS waits 149051
RW-shared spins 3107665, rounds 26266071, OS waits 444386
RW-excl spins 1617387, rounds 9021246, OS waits 54266
Spin rounds per wait: 3.48 mutex, 8.45 RW-shared, 5.58 RW-excl
------------------------
LATEST DETECTED DEADLOCK
------------------------
2019-02-13 14:23:33 7fafdd5c8700
*** (1) TRANSACTION:
TRANSACTION 425363216, ACTIVE 0 sec starting index read
mysql tables in use 11, locked 11
LOCK WAIT 13 lock struct(s), heap size 2936, 6 row lock(s), undo log entries 3
MySQL thread id 259, OS thread handle 0x7fafdff0d700, query id 2274118 localhost lobsters updating
UPDATE `stories` SET `hotness` = -19572.7194911 WHERE `stories`.`id` = 51235
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 530 page no 41135 n bits 72 index `PRIMARY` of table `lobsters`.`stories` trx table locks 7 total table locks 5  trx id 425363216 lock_mode X locks rec but not gap waiting lock hold time 0 wait time before grant 0 
*** (2) TRANSACTION:
TRANSACTION 425363226, ACTIVE 0 sec starting index read
mysql tables in use 11, locked 11
13 lock struct(s), heap size 2936, 6 row lock(s), undo log entries 3
MySQL thread id 331, OS thread handle 0x7fafdd5c8700, query id 2274152 localhost lobsters updating
UPDATE `stories` SET `hotness` = -19572.7194911 WHERE `stories`.`id` = 51235
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 530 page no 41135 n bits 72 index `PRIMARY` of table `lobsters`.`stories` trx table locks 7 total table locks 5  trx id 425363226 lock mode S locks rec but not gap lock hold time 0 wait time before grant 0 
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 530 page no 41135 n bits 72 index `PRIMARY` of table `lobsters`.`stories` trx table locks 7 total table locks 5  trx id 425363226 lock_mode X locks rec but not gap waiting lock hold time 0 wait time before grant 0 
*** WE ROLL BACK TRANSACTION (2)
------------
TRANSACTIONS
------------
Trx id counter 426863168
Purge done for trx's n:o < 426863163 undo n:o < 0 state: running but idle
History list length 3265
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 1198, OS thread handle 0x7fafde560700, query id 4494786 localhost root init
SHOW ENGINE INNODB STATUS
---TRANSACTION 426863130, not started
MySQL thread id 1046, OS thread handle 0x7fafde485700, query id 4494731 localhost lobsters 
---TRANSACTION 426862942, not started
MySQL thread id 1045, OS thread handle 0x7fafe0031700, query id 4494456 localhost lobsters 
---TRANSACTION 426863073, not started
MySQL thread id 1040, OS thread handle 0x7fafdffe8700, query id 4494658 localhost lobsters 
---TRANSACTION 426863001, not started
MySQL thread id 1039, OS thread handle 0x7fafdd5c8700, query id 4494556 localhost lobsters 
---TRANSACTION 426863160, not started
MySQL thread id 1038, OS thread handle 0x7fafdd57f700, query id 4494775 localhost lobsters 
---TRANSACTION 426863057, not started
MySQL thread id 1037, OS thread handle 0x7fafdff9f700, query id 4494638 localhost lobsters 
---TRANSACTION 426863154, not started
MySQL thread id 1036, OS thread handle 0x7fafde5a9700, query id 4494766 localhost lobsters 
---TRANSACTION 426863148, not started
MySQL thread id 1035, OS thread handle 0x7fafdff56700, query id 4494763 localhost lobsters 
---TRANSACTION 426863070, not started
MySQL thread id 1034, OS thread handle 0x7fafde517700, query id 4494655 localhost lobsters 
---TRANSACTION 426863146, not started
MySQL thread id 1033, OS thread handle 0x7fafde4ce700, query id 4494750 localhost lobsters 
---TRANSACTION 426863013, not started
MySQL thread id 1032, OS thread handle 0x7fafdd536700, query id 4494574 localhost lobsters 
---TRANSACTION 426863167, not started
MySQL thread id 1031, OS thread handle 0x7fafdff0d700, query id 4494785 localhost lobsters 
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] ,
 ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
112895527 OS file reads, 356460 OS file writes, 205409 OS fsyncs
936.29 reads/s, 16384 avg bytes/read, 5.84 writes/s, 3.57 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 236, seg size 238, 2024 merges
merged operations:
 insert 1447, delete mark 830, delete 129
discarded operations:
 insert 0, delete mark 0, delete 0
7026.09 hash searches/s, 5144.98 non-hash searches/s
---
LOG
---
Log sequence number 29636994503
Log flushed up to   29636994503
Pages flushed up to 29636994037
Last checkpoint at  29636992778
Max checkpoint age    80826164
Checkpoint age target 78300347
Modified age          466
Checkpoint age        1725
0 pending log writes, 0 pending chkp writes
96495 log i/o's done, 1.54 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 139722752; in additional pool allocated 0
Total memory allocated by read views 1952
Internal hash tables (constant factor + variable factor)
    Adaptive hash index 4265568     (2213368 + 2052200)
    Page hash           139112 (buffer pool 0 only)
    Dictionary cache    1109166     (554768 + 554398)
    File system         863856  (812272 + 51584)
    Lock system         337552  (332872 + 4680)
    Recovery system     0   (0 + 0)
Dictionary memory allocated 554398
Buffer pool size        8191
Buffer pool size, bytes 134201344
Free buffers            1024
Database pages          7042
Old database pages      2579
Modified db pages       5
Percent of dirty pages(LRU & free pages): 0.062
Max dirty pages percent: 75.000
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 6226125, not young 1217390191
90.44 youngs/s, 15012.66 non-youngs/s
Pages read 112895763, created 1725, written 212916
936.29 reads/s, 0.00 creates/s, 3.52 writes/s
Buffer pool hit rate 976 / 1000, young-making rate 2 / 1000 not 376 / 1000
Pages read ahead 5.64/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 7042, unzip_LRU len: 0
I/O sum[52262]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
0 RW transactions active inside InnoDB
0 RO transactions active inside InnoDB
0 out of 1000 descriptors used
Main thread process no. 19654, id 140392863352576, state: sleeping
Number of rows inserted 3971, updated 92170, deleted 15, read 976473763
0.07 inserts/s, 1.62 updates/s, 0.00 deletes/s, 12822.77 reads/s
Number of system rows inserted 0, updated 0, deleted 0, read 2
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
 |

Given our generally low db usage we probably should never see errors like this. Can anyone read this to guess at what's misconfigured?

pushcx commented 5 years ago

Attempt at addressing this: https://github.com/lobsters/lobsters/commit/f0549673b447bba97553c346efdaf71f5194c4e0

pushcx commented 5 years ago

This is still present, unfortunately. We've seen three of them on comment creation since the previous commit.

pushcx commented 4 years ago

FWIW, they've entirely stopped happening in voting code since that tweak. Strong suggestion this only happens inside of transactions.

pushcx commented 3 years ago

Haven't seen this in a while. Or the lobsters/lobsters#783, now that I think about it.

pushcx commented 11 months ago

Dammit. We're immediately back to this.

An ActiveRecord::Deadlocked occurred in comments#create:

  Mysql2::Error: Deadlock found when trying to get lock; try restarting transaction
  app/models/story.rb:819:in `update_comments_count!'

-------------------------------
Request:
-------------------------------

  * URL        : https://lobste.rs/comments
  * HTTP Method: POST
  * IP address : [elided]
  * Parameters : {"authenticity_token"=>"[FILTERED]", "story_id"=>"eprtpo", "_method"=>"post", "parent_comment_short_id"=>"[elided]", "comment"=>"[elided] ", "hat_id"=>"", "show_tree_lines"=>"true", "controller"=>"comments", "action"=>"create"}
  * Timestamp  : 2023-09-14 20:22:55 -0500
  * Server : lobste.rs
    * Rails root : /srv/lobste.rs/http
  * Process: 1072491
pushcx commented 11 months ago

I've been rerunning SHOW ENGINE INNODB STATUS; as I get these 500s about once a week. They're always related to comment creation, and all but one have included a story or comment. My current best guess is that https://github.com/lobsters/lobsters/pull/899 introduced this. I don't remember what problem we had at the time to add the transaction. (@hmadison, do you?)

Comment creation triggers a lot of callbacks and writes to stories because comment scores influence story scores. Here's a quick log from dev:

Started POST "/comments" for 127.0.0.1 at 2023-10-02 18:33:14 -0500                                                                                                                                                                                                                                                                                                                                                              [8/42018]
Processing by CommentsController#create as */*
  Parameters: {"authenticity_token"=>"[FILTERED]", "story_id"=>"5jpbgl", "comment"=>"asdfasdf", "hat_id"=>"", "show_tree_lines"=>"true"}
  User Load (0.5ms)  SELECT `users`.* FROM `users` WHERE `users`.`session_token` = 'MICkvIaRlgk01ZZtjdHh8DWGaJGZ94VfZcdGddBeDBrIK5FKR3U6uYK8ZhIt' LIMIT 1 /*controller:comments,action:create*/
  ↳ app/controllers/application_controller.rb:46:in `authenticate_user'
  Request 127.0.0.1 POST /comments user: 78 pushcx
  Keystore Pluck (0.3ms)  SELECT `keystores`.`value` FROM `keystores` WHERE `keystores`.`key` = 'traffic:intensity' LIMIT 1 /*controller:comments,action:create*/
  ↳ app/models/keystore.rb:15:in `value_for'
  Story Load (0.2ms)  SELECT `stories`.* FROM `stories` WHERE `stories`.`short_id` = '5jpbgl' ORDER BY `stories`.`id` ASC LIMIT 1 /*controller:comments,action:create*/
  ↳ app/controllers/comments_controller.rb:18:in `create'
  User Load (0.2ms)  SELECT `users`.* FROM `users` WHERE `users`.`id` = 16063 LIMIT 1 /*controller:comments,action:create*/
  ↳ app/models/story.rb:520:in `is_gone?'
  Comment Load (1.6ms)  SELECT `comments`.* FROM `comments` WHERE `comments`.`user_id` = 78 AND `comments`.`story_id` = 98554 AND `comments`.`parent_comment_id` IS NULL AND `comments`.`comment` = 'asdfasdf' LIMIT 1 /*controller:comments,action:create*/
  ↳ app/controllers/comments_controller.rb:41:in `create'
  Comment Exists? (0.8ms)  SELECT 1 AS one FROM `comments` WHERE `comments`.`short_id` = 'gtwfqv' LIMIT 1 /*controller:comments,action:create*/
  ↳ app/models/short_id.rb:40:in `valid?'
  TRANSACTION (0.3ms)  BEGIN /*controller:comments,action:create*/
  ↳ app/models/keystore.rb:50:in `block in incremented_value_for'
   (2.2ms)  INSERT INTO keystores (`key`, `value`) VALUES ('thread_id', 1) ON DUPLICATE KEY UPDATE `value` = `value` + 1 /*controller:comments,action:create*/
  ↳ app/models/keystore.rb:50:in `block in incremented_value_for'
  Keystore Pluck (0.3ms)  SELECT `keystores`.`value` FROM `keystores` WHERE `keystores`.`key` = 'thread_id' LIMIT 1 /*controller:comments,action:create*/
  ↳ app/models/keystore.rb:15:in `value_for'
  TRANSACTION (6.7ms)  COMMIT /*controller:comments,action:create*/
  ↳ app/models/keystore.rb:42:in `incremented_value_for'
  TRANSACTION (0.3ms)  BEGIN /*controller:comments,action:create*/
  ↳ app/models/short_id.rb:40:in `valid?'
  Comment Exists? (0.6ms)  SELECT 1 AS one FROM `comments` WHERE `comments`.`short_id` = 'w03bow' LIMIT 1 /*controller:comments,action:create*/
  ↳ app/models/short_id.rb:40:in `valid?'
   (0.6ms)  INSERT INTO keystores (`key`, `value`) VALUES ('thread_id', 1) ON DUPLICATE KEY UPDATE `value` = `value` + 1 /*controller:comments,action:create*/
  ↳ app/models/keystore.rb:50:in `block in incremented_value_for'
  CACHE Keystore Pluck (0.0ms)  SELECT `keystores`.`value` FROM `keystores` WHERE `keystores`.`key` = 'thread_id' LIMIT 1
  ↳ app/models/keystore.rb:15:in `value_for'
  Comment Create (4.5ms)  INSERT INTO `comments` (`created_at`, `updated_at`, `short_id`, `story_id`, `confidence_order`, `user_id`, `thread_id`, `comment`, `confidence`, `markeddown_comment`) VALUES ('2023-10-02 23:33:14', '2023-10-02 23:33:14', 'w03bow', 98554, x'000000', 78, 524562, 'asdfasdf', 0.1828847834138887, '<p>asdfasdf</p>\n') /*controller:comments,action:create*/
  ↳ app/controllers/comments_controller.rb:54:in `block in create'
  Vote Load (2.7ms)  SELECT `votes`.* FROM `votes` WHERE `votes`.`user_id` = 78 AND `votes`.`story_id` = 98554 AND `votes`.`comment_id` = 450516 ORDER BY `votes`.`id` ASC LIMIT 1 /*controller:comments,action:create*/
  ↳ app/models/vote.rb:99:in `vote_thusly_on_story_or_comment_for_user_because'
  User Load (0.3ms)  SELECT `users`.* FROM `users` WHERE `users`.`id` = 78 LIMIT 1 /*controller:comments,action:create*/
  ↳ app/models/vote.rb:120:in `vote_thusly_on_story_or_comment_for_user_because'
  Story Load (0.3ms)  SELECT `stories`.* FROM `stories` WHERE `stories`.`id` = 98554 LIMIT 1 /*controller:comments,action:create*/
  ↳ app/models/vote.rb:120:in `vote_thusly_on_story_or_comment_for_user_because'
  Vote Create (0.3ms)  INSERT INTO `votes` (`user_id`, `story_id`, `comment_id`, `vote`, `updated_at`) VALUES (78, 98554, 450516, 1, '2023-10-02 23:33:14') /*controller:comments,action:create*/
  ↳ app/models/vote.rb:120:in `vote_thusly_on_story_or_comment_for_user_because'
  Comment Load (0.4ms)  SELECT `comments`.* FROM `comments` INNER JOIN `stories` `story` ON `story`.`id` = `comments`.`story_id` WHERE (`story`.`merged_story_id` = 98554 OR `comments`.`story_id` = 98554) /*controller:comments,action:create*/
  ↳ app/models/story.rb:819:in `update_comments_count!'
  Story Update (0.3ms)  UPDATE `stories` SET `stories`.`comments_count` = 9 WHERE `stories`.`id` = 98554 /*controller:comments,action:create*/
  ↳ app/models/story.rb:819:in `update_comments_count!'
  Tag Sum (0.3ms)  SELECT SUM(`tags`.`hotness_mod`) FROM `tags` INNER JOIN `taggings` ON `tags`.`id` = `taggings`.`tag_id` WHERE `taggings`.`story_id` = 98554 /*controller:comments,action:create*/
  ↳ app/models/story.rb:341:in `calculated_hotness'
  Comment Sum (0.6ms)  SELECT SUM(comments.score + 1) FROM `comments` INNER JOIN `stories` `story` ON `story`.`id` = `comments`.`story_id` WHERE (`story`.`merged_story_id` = 98554 OR `comments`.`story_id` = 98554) AND `comments`.`user_id` != 16063 /*controller:comments,action:create*/
  ↳ app/models/story.rb:345:in `calculated_hotness'
  Story Load (0.3ms)  SELECT `stories`.* FROM `stories` WHERE `stories`.`merged_story_id` = 98554 /*controller:comments,action:create*/
  ↳ app/models/story.rb:348:in `map'
  Story Update (0.9ms)  UPDATE `stories` SET `stories`.`hotness` = -21402.4652598 WHERE `stories`.`id` = 98554 /*controller:comments,action:create*/
  ↳ app/models/story.rb:624:in `recalculate_hotness!'
   (0.5ms)  INSERT INTO keystores (`key`, `value`) VALUES ('user:78:comments_posted', 1) ON DUPLICATE KEY UPDATE `value` = `value` + 1 /*controller:comments,action:create*/
  ↳ app/models/keystore.rb:50:in `block in incremented_value_for'
  Keystore Pluck (0.3ms)  SELECT `keystores`.`value` FROM `keystores` WHERE `keystores`.`key` = 'user:78:comments_posted' LIMIT 1 /*controller:comments,action:create*/
  ↳ app/models/keystore.rb:15:in `value_for'
   (0.5ms)  UPDATE comments SET
  score = (select coalesce(sum(vote), 0) from votes where comment_id = comments.id),
  flags = (select count(*) from votes where comment_id = comments.id and vote = -1),
  confidence = 0.18288478341388867,
  confidence_order = concat(lpad(char(65536 - floor(((confidence - -0.2) * 65535) / 1.2) using binary), 2, '0'), char(id & 0xff using binary))
WHERE id = 450516
 /*controller:comments,action:create*/
  ↳ app/models/comment.rb:366:in `update_score_and_recalculate!'
  Tag Sum (0.3ms)  SELECT SUM(`tags`.`hotness_mod`) FROM `tags` INNER JOIN `taggings` ON `tags`.`id` = `taggings`.`tag_id` WHERE `taggings`.`story_id` = 98554 /*controller:comments,action:create*/
  ↳ app/models/story.rb:341:in `calculated_hotness'
  Comment Sum (0.3ms)  SELECT SUM(comments.score + 1) FROM `comments` INNER JOIN `stories` `story` ON `story`.`id` = `comments`.`story_id` WHERE (`story`.`merged_story_id` = 98554 OR `comments`.`story_id` = 98554) AND `comments`.`user_id` != 16063 /*controller:comments,action:create*/
  ↳ app/models/story.rb:345:in `calculated_hotness'
  Story Update (0.2ms)  UPDATE `stories` SET `stories`.`hotness` = -21402.4652598 WHERE `stories`.`id` = 98554 /*controller:comments,action:create*/
  ↳ app/models/story.rb:624:in `recalculate_hotness!'
  TRANSACTION (10.5ms)  COMMIT /*controller:comments,action:create*/
  ↳ app/controllers/comments_controller.rb:54:in `create'
  Rendered comments/_comment.html.erb (Duration: 1.5ms | Allocations: 2366)
  Rendered comments/_postedreply.html.erb (Duration: 1.7ms | Allocations: 2599)
Completed 200 OK in 56ms (Views: 2.2ms | ActiveRecord: 38.2ms | Allocations: 32424)

So the deadlock is happening because Comment creation (and upvoting, or Story creation) are locking the same tables in different orders. Probably the right fix is to remove those transactions. I really don't want to try to reorganize that rat's nest of callbacks to hit tables in the same order because there's no way to enforce it, we'd surely reintroduce the deadlock before long.

hmadison commented 11 months ago

My understanding is that we needed the transactions to ensure that all of the calculations worked properly.