alibaba / transmittable-thread-local

📌 a missing Java std lib(simple & 0-dependency) for framework/middleware, provide an enhanced InheritableThreadLocal that transmits values between threads even using thread pooling components.
https://github.com/alibaba/transmittable-thread-local
Apache License 2.0
7.35k stars 1.67k forks

Intermittent concurrency problem with Nacos / Spring Cloud LoadBalancer when running transmittable-thread-local-2.12.4.jar in agent mode #648

Open KIDOSTAS opened 1 month ago

KIDOSTAS commented 1 month ago

The exception is as follows:

[2024-05-20 16:20:35.293] ERROR tcsl (DiscoveryClientServiceInstanceListSupplier.java:119) SCM-BILL-SERVER 192.168.27.6 8686 1 - - [b1a2ad3b771f4d47b56d9ced1c22ecf8][0]Exception occurred while retrieving instances for service SCM-ARCHIVE-SERVER java.lang.IllegalStateException: §2.12 violated: onSubscribe must be called at most once
    at reactor.core.publisher.StrictSubscriber.onSubscribe(StrictSubscriber.java:82)
    at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onSubscribe(FluxOnErrorResume.java:66)
    at reactor.core.publisher.Operators.reportThrowInSubscribe(Operators.java:224)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4255)
    at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.onSubscribe(FluxTimeout.java:148)
    at reactor.core.publisher.MonoCallable.subscribe(MonoCallable.java:49)
    at reactor.core.publisher.FluxFromMonoOperator.subscribe(FluxFromMonoOperator.java:83)
    at reactor.core.publisher.FluxDefer.subscribe(FluxDefer.java:54)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4252)
    at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onComplete(FluxSwitchIfEmpty.java:75)
    at reactor.core.publisher.Operators.complete(Operators.java:135)
    at reactor.core.publisher.MonoEmpty.subscribe(MonoEmpty.java:45)
    at reactor.core.publisher.InternalFluxOperator.subscribe(InternalFluxOperator.java:62)
    at reactor.core.publisher.FluxDefer.subscribe(FluxDefer.java:54)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4252)
    at reactor.core.publisher.FluxFlatMap.trySubscribeScalarMap(FluxFlatMap.java:203)
    at reactor.core.publisher.MonoFlatMap.subscribeOrReturn(MonoFlatMap.java:53)
    at reactor.core.publisher.Mono.subscribe(Mono.java:4237)
    at reactor.core.publisher.Mono.block(Mono.java:1684)
    at org.springframework.cloud.loadbalancer.blocking.client.BlockingLoadBalancerClient.lambda$choose$0(BlockingLoadBalancerClient.java:69)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at com.alibaba.ttl.TtlRunnable.run(TtlRunnable.java:59)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

With the TTL agent removed, everything works normally again.
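For context on the message in the trace: Reactive Streams rule 2.12 says that onSubscribe must be called at most once for a given Subscriber. The block below is a minimal sketch of that kind of guard, not Reactor's actual StrictSubscriber code. The class name `OnSubscribeGuard`, the stand-in `Subscription` interface, and the `error()` accessor are all hypothetical and exist only for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

public class OnSubscribeGuard {
    /** Minimal stand-in for org.reactivestreams.Subscription (illustration only). */
    public interface Subscription {
        void cancel();
    }

    // Holds the first subscription that was accepted; null until then.
    private final AtomicReference<Subscription> upstream = new AtomicReference<>();
    private volatile Throwable error;

    /** Accepts the first subscription; any later one is cancelled and recorded as an error. */
    public void onSubscribe(Subscription s) {
        if (!upstream.compareAndSet(null, s)) {
            s.cancel();
            error = new IllegalStateException(
                    "rule 2.12 violated: onSubscribe must be called at most once");
        }
    }

    /** Returns the recorded rule violation, or null if none occurred. */
    public Throwable error() {
        return error;
    }
}
```

If two threads ever hand a subscription to the same subscriber instance, the second call takes the error branch. That makes the message a symptom of a subscriber being shared or subscribed twice, whatever the underlying cause turns out to be.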

oldratlee commented 1 month ago

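@KIDOSTAS Please provide a minimal, runnable demo project that reproduces the problem.

Ideally, provide it as a standalone project (a GitHub repo). That way:

  • everyone can investigate and analyze it; code snippets and a summary of the runtime problem alone do not give enough information to debug
  • unrelated business logic can be separated out and possible usage problems ruled out, such as misconfiguration or unexpected side effects from other business code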


[2024-05-20 16:20:35.293] ERROR tcsl (DiscoveryClientServiceInstanceListSupplier.java:119) SCM-BILL-SERVER 192.168.27.6 8686 1 - - [b1a2ad3b771f4d47b56d9ced1c22ecf8][0]Exception occurred while retrieving instances for service SCM-ARCHIVE-SERVER java.lang.IllegalStateException: §2.12 violated: onSubscribe must be called at most once

java.lang.IllegalStateException: §2.12 violated: onSubscribe must be called at most once

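An exception/problem like this one, a Reactive Streams specification violation, should:

  • be easy to reproduce
  • have little to do with TTL

    • If TTL's logic were at fault, the problem would usually not be "intermittent".
    • Because it is currently "intermittent", the problem may well exist without TTL too, only the run was not long enough to hit it. @KIDOSTAS How long did you run without TTL, compared with how long you ran with TTL?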

zavakid commented 1 month ago

+1 for a demo project, or for gradually building that demo project up into a reusable unit-test capability.

KIDOSTAS commented 1 month ago

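@oldratlee @zavakid I wrote a simple demo. Started with the TTL agent attached and run with 5 concurrent requests, it reproduces the problem after about ten minutes; without the agent it runs normally.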

demottl.zip

KIDOSTAS commented 1 month ago

企业微信截图_1e993fc0-07e3-46fd-802f-33b0ce7ebad1 Screenshot of the error. How to call it: localhost:8080/testCall
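The reproduction described in this thread (5 concurrent callers for about ten minutes against localhost:8080/testCall) could be driven by something like the following. This is a minimal sketch and not part of the attached demo: the class `LoadDriver` and its methods `run` and `get` are hypothetical names. Counting any non-200 status as a failure is also an assumption; the server log remains the place to look for the IllegalStateException itself.

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class LoadDriver {
    /** Request counters collected during one run. */
    public static final class Result {
        public final long total;
        public final long failures;

        Result(long total, long failures) {
            this.total = total;
            this.failures = failures;
        }
    }

    /** Calls request in a loop from concurrency threads until durationMillis has elapsed. */
    public static Result run(int concurrency, long durationMillis, Callable<Integer> request)
            throws InterruptedException {
        AtomicLong total = new AtomicLong();
        AtomicLong failures = new AtomicLong();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        for (int i = 0; i < concurrency; i++) {
            pool.execute(() -> {
                while (System.nanoTime() - deadline < 0) {
                    total.incrementAndGet();
                    try {
                        // Assumption: anything other than HTTP 200 counts as a failure.
                        if (request.call() != 200) {
                            failures.incrementAndGet();
                        }
                    } catch (Exception e) {
                        failures.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        // Allow in-flight requests some extra time to finish after the deadline.
        pool.awaitTermination(durationMillis + 60_000, TimeUnit.MILLISECONDS);
        return new Result(total.get(), failures.get());
    }

    /** Issues one GET request and returns the HTTP status code. */
    public static int get(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5_000);
        conn.setReadTimeout(30_000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

A run matching the description would be `LoadDriver.run(5, TimeUnit.MINUTES.toMillis(10), () -> LoadDriver.get("http://localhost:8080/testCall"))`. Executing it once with the agent and once without, for the same duration, gives the comparison asked for earlier in the thread.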

KIDOSTAS commented 1 month ago

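@KIDOSTAS Please provide a minimal, runnable demo project that reproduces the problem.

Ideally, provide it as a standalone project (a GitHub repo). That way:

  • everyone can investigate and analyze it; code snippets and a summary of the runtime problem alone do not give enough information to debug
  • unrelated business logic can be separated out and possible usage problems ruled out, such as misconfiguration or unexpected side effects from other business code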

[2024-05-20 16:20:35.293] ERROR tcsl (DiscoveryClientServiceInstanceListSupplier.java:119) SCM-BILL-SERVER 192.168.27.6 8686 1 - - [b1a2ad3b771f4d47b56d9ced1c22ecf8][0]Exception occurred while retrieving instances for service SCM-ARCHIVE-SERVER java.lang.IllegalStateException: §2.12 violated: onSubscribe must be called at most once

java.lang.IllegalStateException: §2.12 violated: onSubscribe must be called at most once

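An exception/problem like this one, a Reactive Streams specification violation, should:

  • be easy to reproduce
  • have little to do with TTL

    • If TTL's logic were at fault, the problem would usually not be "intermittent".
    • Because it is currently "intermittent", the problem may well exist without TTL too, only the run was not long enough to hit it. @KIDOSTAS How long did you run without TTL, compared with how long you ran with TTL?

@oldratlee There is a demo now. Could the label be removed?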