Closed Hakunata closed 2 years ago
I suggest checking the logs to see whether any exception was caught during the prepare phase.
Thanks for the quick reply!
```java
@Around("execution(@org.apache.servicecomb.pack.omega.transaction.annotations.Participate * *(..)) && @annotation(participate)")
Object advise(ProceedingJoinPoint joinPoint, Participate participate) throws Throwable {
    Method method = ((MethodSignature) joinPoint.getSignature()).getMethod();
    TransactionContext transactionContext = this.extractTransactionContext(joinPoint.getArgs());
    if (transactionContext != null) {
        this.populateOmegaContext(this.context, transactionContext);
    }
    String localTxId = this.context.localTxId();
    String cancelMethod = this.callbackMethodSignature(joinPoint, participate.cancelMethod(), method);
    String confirmMethod = this.callbackMethodSignature(joinPoint, participate.confirmMethod(), method);
    this.context.newLocalTxId();
    LOG.debug("Updated context {} for participate method {} ", this.context, method.toString());
    Object var10;
    try {
        AlphaResponse response = this.tccMessageSender.participationStart(new ParticipationStartedEvent(this.context.globalTxId(), this.context.localTxId(), localTxId, confirmMethod, cancelMethod));
        if (response.aborted()) {
            throw new OmegaException("transcation has aborted: " + this.context.globalTxId());
        }
        Object result = joinPoint.proceed();
        this.tccMessageSender.participationEnd(new ParticipationEndedEvent(this.context.globalTxId(), this.context.localTxId(), localTxId, confirmMethod, cancelMethod, TransactionStatus.Succeed));
        this.parametersContext.putParameters(this.context.localTxId(), joinPoint.getArgs());
        LOG.debug("Participate Transaction with context {} has finished.", this.context);
        var10 = result;
    } catch (Throwable var14) {
        if (!(var14 instanceof OmegaException)) {
            // **********************************************
            // A debug breakpoint here is hit: the exception is caught
            // **********************************************
            this.tccMessageSender.participationEnd(new ParticipationEndedEvent(this.context.globalTxId(), this.context.localTxId(), localTxId, confirmMethod, cancelMethod, TransactionStatus.Failed));
        }
        LOG.error("Participate Transaction with context {} failed.", this.context, var14);
        throw var14;
    } finally {
        this.context.setLocalTxId(localTxId);
    }
    return var10;
}
```
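I set a breakpoint in the TccParticipatorAspect class and can see that the exception is caught. I am currently doing a POC of ServiceComb-Pack and have not had time to read through the framework code yet, so I would appreciate help with this.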
My service client (Omega) and Alpha are both deployed on the same machine. That should not cause any conflict in event dispatching, should it?
org.springframework.transaction.TransactionSystemException: Could not commit JPA transaction; nested exception is javax.persistence.RollbackException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.7.1.v20171221-bd47e8f): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: java.sql.SQLIntegrityConstraintViolationException: Duplicate entry 'be2bdf3f-f421-4292-9234-ef2d57e7bfa4-be2bdf3f-f421-4292-9234-ef2' for key 'tcc_global_tx_event_index'
Error Code: 1062
Call: INSERT INTO tcc_global_tx_event (CREATIONTIME, GLOBALTXID, INSTANCEID, LASTMODIFIED, LOCALTXID, PARENTTXID, SERVICENAME, STATUS, TXTYPE) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
bind => [9 parameters bound]
Query: InsertObjectQuery(GlobalTxEvent{surrogateId=null, globalTxId='be2bdf3f-f421-4292-9234-ef2d57e7bfa4', localTxId='be2bdf3f-f421-4292-9234-ef2d57e7bfa4', parentTxId='', serviceName='bussiness', instanceId='bussiness-127.0.0.1', txType='END_TIMEOUT', status='Failed', creationTime=Thu Feb 24 09:18:14 CST 2022, lastModified=Thu Feb 24 09:18:14 CST 2022})
at org.springframework.orm.jpa.JpaTransactionManager.doCommit(JpaTransactionManager.java:541)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:746)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:714)
at org.springframework.transaction.interceptor.TransactionAspectSupport.commitTransactionAfterReturning(TransactionAspectSupport.java:534)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:305)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688)
at org.apache.servicecomb.pack.alpha.server.tcc.service.RDBTxEventRepository$$EnhancerBySpringCGLIB$$7b4be68d.saveGlobalTxEvent(<generated>)
at org.apache.servicecomb.pack.alpha.server.tcc.service.TccTxEventService.onTccEndedEvent(TccTxEventService.java:108)
at org.apache.servicecomb.pack.alpha.server.tcc.service.TccTxEventService.lambda$null$0(TccTxEventService.java:144)
at java.util.Vector.forEach(Vector.java:1275)
at org.apache.servicecomb.pack.alpha.server.tcc.service.TccTxEventService.lambda$handleTimeoutTx$1(TccTxEventService.java:135)
at java.util.Optional.ifPresent(Optional.java:159)
at org.apache.servicecomb.pack.alpha.server.tcc.service.TccTxEventService.handleTimeoutTx(TccTxEventService.java:135)
at org.apache.servicecomb.pack.alpha.server.tcc.service.TccEventScanner.lambda$start$0(TccEventScanner.java:50)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.persistence.RollbackException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.7.1.v20171221-bd47e8f): org.eclipse.persistence.exceptions.DatabaseException
If I leave the system idle for a while, the database error above also appears.
Running on the same machine does not affect anything.
Because we use Spring's @Transactional, a problem during the TCC prepare phase causes the database transaction to roll back. Please check on the Alpha side whether the TCC prepared events were received.
| localTxId | parentTxId | serviceName | instanceId | methodInfo | txType | status | creationTime |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 54c3cc43-e59b-4657-9117-3be4c6b9618c | | bussiness | bussiness-127.0.0.1 | | STARTED | Succeed | 2022-02-23 04:32:51 |
| eb887004-daeb-44a7-870d-bb1c9c0df6e6 | 54c3cc43-e59b-4657-9117-3be4c6b9618c | bank_alpha | bank_alpha-127.0.0.1 | confirm=public void cn.huolala.bme.alpha.service.UserAccountService.confirmOut(cn.huolala.bme.alpha.controller.bo.TransferEntity),cancel=public void cn.huolala.bme.alpha.service.UserAccountService.cancelOut(cn.huolala.bme.alpha.controller.bo.TransferEntity) | P_TX_STATED | | 2022-02-23 04:32:51 |
| eb887004-daeb-44a7-870d-bb1c9c0df6e6 | 54c3cc43-e59b-4657-9117-3be4c6b9618c | bank_alpha | bank_alpha-127.0.0.1 | confirm=public void cn.huolala.bme.alpha.service.UserAccountService.confirmOut(cn.huolala.bme.alpha.controller.bo.TransferEntity),cancel=public void cn.huolala.bme.alpha.service.UserAccountService.cancelOut(cn.huolala.bme.alpha.controller.bo.TransferEntity) | P_TX_ENDED | Succeed | 2022-02-23 04:32:53 |
| c88bd40b-87cb-41ac-afb5-9fdb5f46301c | 54c3cc43-e59b-4657-9117-3be4c6b9618c | bank_beta | bank_beta-127.0.0.1 | confirm=public void cn.huolala.bme.beta.service.UserAccountService.confirmIn(cn.huolala.bme.beta.controller.bo.TransferEntity),cancel=public void cn.huolala.bme.beta.service.UserAccountService.cancelIn(cn.huolala.bme.beta.controller.bo.TransferEntity) | P_TX_STATED | | 2022-02-23 04:32:54 |
| c88bd40b-87cb-41ac-afb5-9fdb5f46301c | 54c3cc43-e59b-4657-9117-3be4c6b9618c | bank_beta | bank_beta-127.0.0.1 | confirm=public void cn.huolala.bme.beta.service.UserAccountService.confirmIn(cn.huolala.bme.beta.controller.bo.TransferEntity),cancel=public void cn.huolala.bme.beta.service.UserAccountService.cancelIn(cn.huolala.bme.beta.controller.bo.TransferEntity) | P_TX_ENDED | Failed | 2022-02-23 04:32:55 |
| 54c3cc43-e59b-4657-9117-3be4c6b9618c | | bussiness | bussiness-127.0.0.1 | | ENDED | Succeed | 2022-02-23 04:32:55 |
| eb887004-daeb-44a7-870d-bb1c9c0df6e6 | 54c3cc43-e59b-4657-9117-3be4c6b9618c | bank_alpha | bank_alpha-127.0.0.1 | public void cn.huolala.bme.alpha.service.UserAccountService.confirmOut(cn.huolala.bme.alpha.controller.bo.TransferEntity) | COORDINATED | Succeed | 2022-02-23 04:32:59 |
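If you mean the event log persisted to the database, I did not see that event there. You have given me a lead, though, and I will take another look. Could you point me to a specific place to analyze, such as which class, which table, or which key log label in Alpha?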
How should this kind of problem be handled?
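You can also refer to the TCC acceptance tests, https://github.com/apache/servicecomb-pack/tree/master/acceptance-tests/acceptance-pack-tcc-spring-demo/src/test , and write up the scenario you ran into as a test. The existing acceptance tests cover only a limited set of scenarios, not all of them.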
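The scenario is simple. The transfer-out service (bank_alpha) moves an amount out: prepare (the transferOut method) sets the frozen amount, and confirm deducts the frozen amount while the balance on the receiving side increases by the same (frozen) amount, which completes the transfer. Otherwise the transfer aborts on an exception. This is a POC of distributed transaction frameworks requested by our company's trading business. Pack looked fairly simple, so I chose it, and I need to report back to the business side. Could I email you the test case so you can take a look?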
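The freeze-then-deduct bookkeeping described above can be sketched as follows. This is a minimal illustration under assumptions, not the poster's `UserAccountService`: the class `FrozenAccountSketch` and its method names are hypothetical, and here prepare moves the amount out of the available balance into a frozen bucket, which is only one of several possible ways to model the reservation.

```java
// Hypothetical sketch of the transfer-out side of a TCC transfer.
final class FrozenAccountSketch {
    private long balance; // funds available for new transfers
    private long frozen;  // funds reserved by a prepared, unfinished transfer

    FrozenAccountSketch(long balance) {
        this.balance = balance;
    }

    // Try/prepare: reserve the amount, failing fast if it cannot be covered.
    void prepareOut(long amount) {
        if (amount <= 0 || amount > balance) {
            throw new IllegalStateException("cannot freeze " + amount);
        }
        balance -= amount;
        frozen += amount;
    }

    // Confirm: the reserved amount leaves the account for good.
    void confirmOut(long amount) {
        requireFrozen(amount);
        frozen -= amount;
    }

    // Cancel: release the reservation back to the available balance.
    void cancelOut(long amount) {
        requireFrozen(amount);
        frozen -= amount;
        balance += amount;
    }

    private void requireFrozen(long amount) {
        if (amount <= 0 || amount > frozen) {
            throw new IllegalStateException("nothing frozen for " + amount);
        }
    }

    long balance() { return balance; }

    long frozen() { return frozen; }
}
```

With this shape, confirm and cancel only ever touch the frozen bucket, so neither can fail for lack of funds once prepare has succeeded.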
Does the exception occur when the TCC event starts?
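I looked at the log. The event flow itself is OK. The issue is at the final coordination step, where the cancel method of both sub-services should have been called.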
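That is my understanding too. The prepare step of the transfer-in service checks whether the account exists and throws an exception if it does not. Alpha should then dispatch cancel calls to both services, but it did not. While debugging, I saw that the confirm method of the transfer-out service was called instead.
My plan for now is to import the Alpha service code directly into IntelliJ IDEA and debug it locally to see whether I can pin the problem down. If that fails, I will have to give up and try Alibaba's Seata instead.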
You can debug the Alpha code. The code where Alpha calls the Omega agent's confirm or cancel is in the invoke method of org.apache.servicecomb.pack.alpha.server.tcc.callback.GrpcOmegaTccCallback. It compares the transaction status, and my guess is that the transaction status is what went wrong.
The Alpha service is now running in debug mode...
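Found the problem (A: transfer-out service; B: transfer-in service):
In the parent transaction I caught the exception from service B and did not rethrow it from the parent transaction's method (the business method annotated with @TccStart), so the parent transaction's status was reported as Succeed.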
```java
@TccStart
public Result transfer(TransferEntity transferEntity) {
    String errMsg = "";
    Integer errCode = 0;
    try {
        ResponseEntity<UserAccount> outResp = alphaClient.getForEntity(getBankAlphaURL(transferEntity), UserAccount.class, transferEntity.getFromId(), transferEntity.getMoney());
        System.out.println(outResp.getBody().toString());
        ResponseEntity<UserAccount> inResp = betaClient.getForEntity(getBankBetaURL(transferEntity), UserAccount.class, transferEntity.getToId(), transferEntity.getMoney());
        System.out.println(inResp.getBody().toString());
        errMsg = "user[id:" + transferEntity.getFromId() + "] transfer money(" + transferEntity.getMoney() + ") to user[id:" + transferEntity.getToId() + "] success";
    } catch (Exception e) {
        e.printStackTrace();
        errMsg = e.getMessage();
        errCode = -1;
        // ***********************************************
        // Rethrow here. Alternatively, if the sub-transaction is known to throw
        // on failure, the parent transaction should not use try...catch at all.
        // ***********************************************
        throw e;
    }
    return new Result(errCode, errMsg);
}
```
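Why the swallowed exception matters can be shown without any framework. The sketch below is a simplified stand-in, not the ServiceComb Pack aspect: `TccStartSketch` and `runAsTccStart` are hypothetical names, and the wrapper returns the status it would report instead of sending an event and rethrowing. The point it illustrates is that a wrapper around the business method can only see a failure if the exception escapes that method.

```java
import java.util.function.Supplier;

final class TccStartSketch {
    // Stand-in for an interceptor around a @TccStart method (assumption):
    // it infers the outcome purely from whether the call returns or throws.
    static String runAsTccStart(Supplier<String> business) {
        try {
            business.get();
            return "Succeed"; // normal return: the parent looks successful
        } catch (RuntimeException e) {
            return "Failed";  // exception escaped: the parent is marked failed
        }
    }

    // Simulates a participant whose prepare step fails.
    static String failingParticipant() {
        throw new IllegalStateException("account not found");
    }

    // Parent method that catches the failure and returns an error result.
    static String swallowing() {
        try {
            return failingParticipant();
        } catch (RuntimeException e) {
            return "error: " + e.getMessage();
        }
    }

    // Parent method that records the failure and lets it propagate.
    static String rethrowing() {
        try {
            return failingParticipant();
        } catch (RuntimeException e) {
            throw e;
        }
    }
}
```

Calling `TccStartSketch.runAsTccStart(TccStartSketch::swallowing)` yields `"Succeed"`, while the same call with `TccStartSketch::rethrowing` yields `"Failed"`, which mirrors the difference between the original and the corrected `transfer` method.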
I think the documentation should mention this: you have to throw an exception to trigger the TCC or Saga rollback.
@Hakunata Would you mind submitting a PR to improve the Javadoc and documentation? Also, if a sub-transaction fails, we need to reset the status of the parent transaction: the transaction status should be set by looking up the status of its sub-transactions.
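The suggestion above, deriving the parent status from its sub-transactions, could look roughly like this. It is a hypothetical sketch and not Alpha's actual code: `ParentStatusSketch`, `effectiveStatus`, and `callbackFor` are made-up names, and the confirm-on-Succeed, cancel-otherwise rule is an assumption based on the event table in this thread. Only the `Succeed` and `Failed` values come from that table.

```java
import java.util.List;

final class ParentStatusSketch {
    enum Status { Succeed, Failed }

    // The parent counts as Failed if it reported Failed itself
    // or if any participant ended with Failed.
    static Status effectiveStatus(Status reportedByParent, List<Status> participantEnds) {
        if (reportedByParent == Status.Failed || participantEnds.contains(Status.Failed)) {
            return Status.Failed;
        }
        return Status.Succeed;
    }

    // Assumed coordination rule: confirm on success, cancel otherwise.
    static String callbackFor(Status status) {
        return status == Status.Succeed ? "confirm" : "cancel";
    }
}
```

For the case in this thread, the parent reported `Succeed` while bank_beta ended with `Failed`. Using the reported status alone selects `confirm`, whereas `effectiveStatus` turns it into `Failed` and selects `cancel`.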
OK
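A few other points: 1. I could not find any documentation on load balancing for Alpha. 2. I could not find cluster performance figures (QPS) either. 3. Like Seata, could Alpha consider supporting Redis as its data store? Akka feels too heavyweight.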
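Alpha clustering is implemented through service discovery, so load balancing can be configured on the service discovery side. Cluster performance figures need to be evaluated against your own business workload. We are currently discussing splitting up Alpha's basic functionality. TCC currently has two implementations: one kept in memory and one backed by a database. If you want to use Redis, you would need to write the corresponding extension yourself.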
Well, it looks like we need to set up a meeting to discuss the roadmap?
Parent transaction logic:

```java
@TccStart
public Result transfer(TransferEntity transferEntity) {
    String errMsg = "";
    Integer errCode = 0;
    try {
        ResponseEntity<UserAccount> outResp = alphaClient.getForEntity(getBankAlphaURL(transferEntity), UserAccount.class, transferEntity.getFromId(), transferEntity.getMoney());
        System.out.println(outResp.getBody().toString());
        ResponseEntity<UserAccount> inResp = betaClient.getForEntity(getBankBetaURL(transferEntity), UserAccount.class, transferEntity.getToId(), transferEntity.getMoney());
        System.out.println(inResp.getBody().toString());
        errMsg = "user[id:" + transferEntity.getFromId() + "] transfer money(" + transferEntity.getMoney() + ") to user[id:" + transferEntity.getToId() + "] success";
    } catch (Exception e) {
        // The exception is caught here and not rethrown.
        e.printStackTrace();
        errMsg = e.getMessage();
        errCode = -1;
    }
    return new Result(errCode, errMsg);
}
```
Sub-transaction logic (service: bank_alpha):
Sub-transaction logic (service: bank_beta):
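At first I thought the problem was caused by the Alpha version I had compiled myself, but the result is the same with the Apache binary package.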