Open akshaylad26 opened 2 months ago
@akshaylad26 Is this happening in a Batch Task application that is launched by Spring Cloud Data Flow or using a Schedule?
@akshaylad26 Can you provide more information on your cleanup job?
@akshaylad26 Please provide the version of MySql and the platform you are launching the applications (Kubernetes, local, Cloud Foundry). Thanks!
We are launching using Kubernetes & MySQL version is 8.0.33 Both cleanup job & batch task application are scheduled operations. Cleanup job executes every 6 hours whereas task is scheduled every hour. In Cleanup job we get executions using /tasks/executions endpoint in a paginated way sorted by task execution id in ascending order. Then we use /task/executions/{list of executionIds}?action=REMOVE_DATA,CLEANUP&schemaTarget=boot3 endpoint for cleaning up older executions. We follow this way because normal cleanup API provided by SCDF doesn't clean executions without any start/end date.
Let me know if this helps.
What transaction isolation level are you using for your batch application(s)?
We use default isolation level. We use hikari datasource & dont override it.
You may wish to consider setting the transaction isolation level of your apps to something like ISOLATION_REPEATABLE_READ
as discussed in Spring Batch's Javadoc.
sure will give it a try. Thanks!
Creating new transaction with name [org.springframework.cloud.dataflow.server.service.LauncherInitializationService.initialize]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT
I see this line in SCDF server DEBUG logs as well. Does this also need to be updated to ISOLATION_REPEATABLE_READ?
Thank you for your update! For now SCDF needs to keep its current transaction isolation level.
I just checked in our Database, its default isolation level itself is REPEATABLE_READ.
Understood. Not sure that necessarily affects the Batch apps in that it sets its own default.
Hi @cppwfs ,
We tried adding transaction isolation level = ISOLATION_REPEATABLE_READ in our batch application. But today again got the same deadlock issue.
Can you share a small sample application that reproduces the issue? Thanks
Hi @cppwfs ,
We were not able to reproduce issue. However, attaching innodb status file from mysql which shows details about deadlock. deadlock.txt
Hope it helps to investigate further.
DELETE FROM BOOT3_BATCH_JOB_INSTANCE WHERE NOT EXISTS ( SELECT JOB_INSTANCE_ID FROM BOOT3_BATCH_JOB_EXECUTION WHERE BOOT3_BATCH_JOB_INSTANCE.JOB_INSTANCE_ID = JOB_INSTANCE_ID)
I believe this query holds the lock of entire table rows which causes the problem.
Description: We are getting intermittent deadlock issue while executing tasks. We observed that this usually happens when our execution cleanup job is running in parallel. Below is the stacktrace. We are unable to reproduce this issue in lower environments, but it is frequently happening in client environments. Can you please check? If any details required, please let me know, I will extract it.
Release versions: We are using 2.11.4 release version.
org.springframework.dao.CannotAcquireLockException: PreparedStatementCallback; SQL [INSERT INTO BOOT3_BATCH_JOB_EXECUTION(JOB_EXECUTION_ID, JOB_INSTANCE_ID, START_TIME, END_TIME, STATUS, EXIT_CODE, EXIT_MESSAGE, VERSION, CREATE_TIME, LAST_UPDATED) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?) ]; Deadlock found when trying to get lock; try restarting transaction at org.springframework.jdbc.support.SQLExceptionSubclassTranslator.doTranslate(SQLExceptionSubclassTranslator.java:78) at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73) at org.springframework.jdbc.core.JdbcTemplate.translateException(JdbcTemplate.java:1572) at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:667) at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:960) at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:1015) at org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:1020) at org.springframework.batch.core.repository.dao.JdbcJobExecutionDao.saveJobExecution(JdbcJobExecutionDao.java:243) at org.springframework.batch.core.repository.support.SimpleJobRepository.createJobExecution(SimpleJobRepository.java:178) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343) at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:196) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:123) at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:391) at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:119) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184) at org.springframework.batch.core.repository.support.AbstractJobRepositoryFactoryBean.lambda$getObject$0(AbstractJobRepositoryFactoryBean.java:204) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184) at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:244) at jdk.proxy2/jdk.proxy2.$Proxy94.createJobExecution(Unknown Source) at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:145) at org.springframework.batch.core.launch.support.TaskExecutorJobLauncher.run(TaskExecutorJobLauncher.java:59) at org.springframework.cloud.task.batch.handler.TaskJobLauncherApplicationRunner.execute(TaskJobLauncherApplicationRunner.java:147) at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.executeLocalJobs(JobLauncherApplicationRunner.java:195) at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.launchJobFromProperties(JobLauncherApplicationRunner.java:175) at org.springframework.cloud.task.batch.handler.TaskJobLauncherApplicationRunner.run(TaskJobLauncherApplicationRunner.java:112) at org.springframework.boot.autoconfigure.batch.JobLauncherApplicationRunner.run(JobLauncherApplicationRunner.java:165) at org.springframework.boot.SpringApplication.lambda$callRunner$4(SpringApplication.java:771) at org.springframework.util.function.ThrowingConsumer$1.acceptWithException(ThrowingConsumer.java:83) at org.springframework.util.function.ThrowingConsumer.accept(ThrowingConsumer.java:60) at org.springframework.util.function.ThrowingConsumer$1.accept(ThrowingConsumer.java:88) at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:783) at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:771) at org.springframework.boot.SpringApplication.lambda$callRunners$3(SpringApplication.java:759) at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) at java.base/java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357) at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:510) at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:759) at org.springframework.boot.SpringApplication.run(SpringApplication.java:332) at org.springframework.boot.SpringApplication.run(SpringApplication.java:1327) at org.springframework.boot.SpringApplication.run(SpringApplication.java:1316) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:568) at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49) at org.springframework.boot.loader.Launcher.launch(Launcher.java:95) at org.springframework.boot.loader.Launcher.launch(Launcher.java:58) at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:65) Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:124) at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122) at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:916) at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1061) at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1009) at com.mysql.cj.jdbc.ClientPreparedStatement.executeLargeUpdate(ClientPreparedStatement.java:1320) at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdate(ClientPreparedStatement.java:994) at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeUpdate(ProxyPreparedStatement.java:61) at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeUpdate(HikariProxyPreparedStatement.java) at org.springframework.jdbc.core.JdbcTemplate.lambda$update$2(JdbcTemplate.java:965) at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:648) ... 55 common frames omitted