Open daniel-goldstein opened 6 months ago
From triage discussion.
What this means: Instance destruction is slower than it needs to be. This can impact throughput but does not really impact reliability, and it is uncommon enough not to be high priority at this time.
What happened?
Most stored procedures take either a shared or exclusive lock on a relevant row of the jobs table near the start of the procedure, but not all. This appears to interact poorly with the attempts_after_update trigger as it attempts to take an exclusive lock on rows in the jobs table in the below join with the attempt resources tables. It's not clear exactly what the right fix is. It should be simple enough not to join on the jobs table in the FOR UPDATE, but we should also evaluate when in our various transactions a lock should be taken on the jobs table and whether it should be an X or S lock.
Version
0.2.128
Relevant log output