Open MonarchNing opened 3 weeks ago
"The current transaction has been rolled back because of a deadlock or timeout. Reason code "2"." -- you're almost certainly hitting a timeout, not a deadlock. If there was a possible deadlock there would be lots of users reporting it (since there are millions of apps using Quartz).
If you've already added indexes, I'm not sure what else you can do to speed up your DB. Sounds like it's pretty overwhelmed. How many nodes in your cluster?
Thanks a lot for your comments. There are just 2 nodes, in hot standby. Yes, maybe it is a timeout issue; no user has reported it. But at 23:00, when some jobs run, the issue arises: the number of DB connections increases and users report that functions run slowly. The deadlock or timeout usually occurs on the hour (when jobs start to run), and sometimes every 20 or 30 minutes. Our quartz.properties file has no configuration for txIsolationLevelSerializable or acquireTriggersWithinLock.
So would setting org.quartz.jobStore.txIsolationLevelSerializable=true or org.quartz.jobStore.acquireTriggersWithinLock=true help with this? Thanks for your help. By the way, this issue appeared after we upgraded Quartz from 2.0.2 to 2.3.2. Maybe some new configuration is needed in the quartz.properties file? I do not know what causes this issue.
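For reference, here is a minimal sketch of how the two properties from the question could be set programmatically. The property keys are the ones named above; the class and method names are hypothetical, and whether either setting actually helps with this DB2 error is untested here.

```java
import java.util.Properties;

// Hypothetical helper: builds only the two lock-related JobStore
// properties discussed in this thread. All other Quartz settings
// (data source, delegate, clustering) are assumed to live elsewhere.
public class QuartzLockSettingsSketch {

    public static Properties lockRelatedProperties(boolean acquireWithinLock,
                                                   boolean serializable) {
        Properties props = new Properties();
        // Acquire the next triggers inside an explicit database lock.
        props.setProperty("org.quartz.jobStore.acquireTriggersWithinLock",
                String.valueOf(acquireWithinLock));
        // Ask Quartz to use SERIALIZABLE isolation on its JDBC connections.
        props.setProperty("org.quartz.jobStore.txIsolationLevelSerializable",
                String.valueOf(serializable));
        return props;
    }

    public static void main(String[] args) {
        // Try one setting at a time so its effect can be observed in isolation.
        Properties props = lockRelatedProperties(true, false);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The printed lines can be pasted into quartz.properties, or the Properties object can be merged with the existing settings before creating the scheduler factory. Changing one property at a time makes it easier to tell which one, if any, affects the -911 errors.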
We have the Quartz scheduler running in production in cluster mode, and we noticed that the row-level lock acquired on the TRIGGERS table is causing a deadlock.
Setup
DB2 v11.1, Java 8, Quartz 2.3.2
Exception:
[10/26/24 23:10:05:946 HKT] 0000010d SystemOut O 2024-10-26 23:10:05 ERROR [org.quartz.core.ErrorLogger]: An error occurred while scanning for the next triggers to fire.
org.quartz.JobPersistenceException: Couldn't acquire next trigger: The current transaction has been rolled back because of a deadlock or timeout. Reason code "2".. SQLCODE=-911, SQLSTATE=40001, DRIVER=4.25.1301 [See nested exception: com.ibm.db2.jcc.am.SqlTransactionRollbackException: The current transaction has been rolled back because of a deadlock or timeout. Reason code "2".. SQLCODE=-911, SQLSTATE=40001, DRIVER=4.25.1301]
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTrigger(JobStoreSupport.java:2923)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$41.execute(JobStoreSupport.java:2805)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$41.execute(JobStoreSupport.java:2803)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3864)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2802)
    at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:287)
    at hk.com.mtrc.etms.batch.scheduler.DelegatingWork.run(WorkManagerThreadExecutor.java:83)
    at com.ibm.ws.asynchbeans.J2EEContext$RunProxy.run(J2EEContext.java:277)
    at java.security.AccessController.doPrivileged(AccessController.java:716)
    at javax.security.auth.Subject.doAs(Subject.java:490)
    at com.ibm.websphere.security.auth.WSSubject.doAs(WSSubject.java:133)
    at com.ibm.websphere.security.auth.WSSubject.doAs(WSSubject.java:91)
    at com.ibm.ws.asynchbeans.J2EEContext$DoAsProxy.run(J2EEContext.java:348)
    at java.security.AccessController.doPrivileged(AccessController.java:746)
    at com.ibm.ws.asynchbeans.J2EEContext.run(J2EEContext.java:1042)
    at com.ibm.ws.asynchbeans.WorkWithExecutionContextImpl.go(WorkWithExecutionContextImpl.java:199)
    at com.ibm.ws.asynchbeans.CJWorkItemImpl.run(CJWorkItemImpl.java:237)
    at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1909)
Caused by: com.ibm.db2.jcc.am.SqlTransactionRollbackException: The current transaction has been rolled back because of a deadlock or timeout. Reason code "2".. SQLCODE=-911, SQLSTATE=40001, DRIVER=4.25.1301
    at com.ibm.db2.jcc.am.b6.a(b6.java:797)
    at com.ibm.db2.jcc.am.b6.a(b6.java:66)
    at com.ibm.db2.jcc.am.b6.a(b6.java:140)
    at com.ibm.db2.jcc.am.k3.c(k3.java:2824)
    at com.ibm.db2.jcc.t4.ab.x(ab.java:1827)
    at com.ibm.db2.jcc.t4.ab.n(ab.java:950)
    at com.ibm.db2.jcc.t4.ab.a(ab.java:120)
    at com.ibm.db2.jcc.t4.p.a(p.java:50)
    at com.ibm.db2.jcc.t4.aw.b(aw.java:220)
    at com.ibm.db2.jcc.am.k4.bm(k4.java:3599)
    at com.ibm.db2.jcc.am.k4.a(k4.java:4644)
    at com.ibm.db2.jcc.am.k4.b(k4.java:4182)
    at com.ibm.db2.jcc.am.k4.be(k4.java:827)
    at com.ibm.db2.jcc.am.k4.executeUpdate(k4.java:801)
    at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.pmiExecuteUpdate(WSJdbcPreparedStatement.java:1304)
    at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.executeUpdate(WSJdbcPreparedStatement.java:845)
    at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.updateTriggerStateFromOtherState(StdJDBCDelegate.java:1439)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTrigger(JobStoreSupport.java:2901)
    ... 17 more

Based on the updateTriggerStateFromOtherState function, I found the UPDATE SQL statement and added an index on the QRTZ_TRIGGERS table (SCHED_NAME, TRIGGER_NAME, TRIGGER_GROUP, TRIGGER_STATE), but this did not fix the issue.
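One way to check whether a given -911 is a deadlock or a lock timeout is to read the reason code out of the exception message. The sketch below assumes the mapping I understand IBM's SQL0911N message documentation to give (reason code 2 for a deadlock, 68 for a lock timeout); please verify it against the DB2 11.1 documentation before relying on it. The class and method names are hypothetical, and the message format is assumed to match the log above.

```java
import java.sql.SQLException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for triaging DB2 SQLCODE -911 rollbacks.
public class Db2RollbackReason {

    // Matches the fragment: Reason code "2"  (quotes optional).
    private static final Pattern REASON =
            Pattern.compile("Reason code \"?(\\d+)\"?");

    public static String classify(SQLException e) {
        // Only SQLCODE -911 carries the rollback reason codes handled here.
        if (e.getErrorCode() != -911) {
            return "NOT_SQL0911";
        }
        String message = e.getMessage() == null ? "" : e.getMessage();
        Matcher m = REASON.matcher(message);
        if (!m.find()) {
            return "UNKNOWN";
        }
        // ASSUMPTION: mapping taken from IBM's SQL0911N description.
        switch (m.group(1)) {
            case "2":
                return "DEADLOCK";
            case "68":
                return "LOCK_TIMEOUT";
            default:
                return "OTHER";
        }
    }

    public static void main(String[] args) {
        SQLException sample = new SQLException(
                "The current transaction has been rolled back because of a "
                + "deadlock or timeout. Reason code \"2\".. SQLCODE=-911, "
                + "SQLSTATE=40001, DRIVER=4.25.1301",
                "40001", -911);
        System.out.println(classify(sample));
    }
}
```

Logging this classification next to each failure would show whether the errors on the hour are all the same kind, which narrows down whether to look at lock ordering or at the DB2 lock timeout setting.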