Closed gitmewai closed 2 years ago
Please note that the issue persists in the latest v22.7.6 release.
Thanks for reporting @gitmewai. The metric displayed by Collectd is the one in /proc/stat, which can also be displayed with vmstat -f. I was able to reproduce the issue, and I'm looking into what causes this metric to grow so fast.
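For anyone who wants to sample the same counter without Collectd, here is a minimal sketch that reads the "processes" line of /proc/stat (the cumulative fork count since boot that vmstat -f reports) twice and derives a rate. The class name ForkCounter, its method names and the 5-second interval are illustrative assumptions, not part of Besu or Collectd.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalLong;

// Hypothetical helper for illustration only; Linux-specific because it reads /proc/stat.
final class ForkCounter {

  /** Returns the value of the "processes" line from /proc/stat content, if present. */
  static OptionalLong parseForkCount(String procStat) {
    for (String line : procStat.split("\n")) {
      String[] fields = line.trim().split("\\s+");
      if (fields.length >= 2 && fields[0].equals("processes")) {
        return OptionalLong.of(Long.parseLong(fields[1]));
      }
    }
    return OptionalLong.empty();
  }

  /** Forks per second between two samples taken intervalSeconds apart. */
  static double forkRate(long before, long after, double intervalSeconds) {
    return (after - before) / intervalSeconds;
  }

  public static void main(String[] args) throws IOException, InterruptedException {
    Path stat = Path.of("/proc/stat");
    long before = parseForkCount(Files.readString(stat)).orElseThrow();
    Thread.sleep(5_000); // sampling interval chosen arbitrarily for this sketch
    long after = parseForkCount(Files.readString(stat)).orElseThrow();
    System.out.printf("forks/s: %.1f%n", forkRate(before, after, 5.0));
  }
}
```

The counter is system-wide, so the rate includes forks from every process on the host, not only besu.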
This is what we observed from the system log @ahamlat :
Oct 11 18:15:09 besu-node besu[847]: 2022-10-11 18:15:09.656+00:00 | vertx-blocked-thread-checker | WARN | BlockedThreadChecker | Thread Thread[vert.x-worker-thread-14,5,main] has been blocked for 20195242 ms, time limit is 60000 ms
Oct 11 18:15:09 besu-node besu[847]: io.vertx.core.VertxException: Thread blocked
Oct 11 18:15:09 besu-node besu[847]: at java.base@11.0.16/jdk.internal.misc.Unsafe.park(Native Method)
Oct 11 18:15:09 besu-node besu[847]: at java.base@11.0.16/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
Oct 11 18:15:09 besu-node besu[847]: at java.base@11.0.16/java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1796)
Oct 11 18:15:09 besu-node besu[847]: at java.base@11.0.16/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3128)
Oct 11 18:15:09 besu-node besu[847]: at java.base@11.0.16/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1823)
Oct 11 18:15:09 besu-node besu[847]: at java.base@11.0.16/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
Oct 11 18:15:09 besu-node besu[847]: at app//org.hyperledger.besu.ethereum.api.jsonrpc.internal.methods.ExecutionEngineJsonRpcMethod.response(ExecutionEngineJsonRpcMethod.java:89)
Oct 11 18:15:09 besu-node besu[847]: at app//org.hyperledger.besu.ethereum.api.jsonrpc.execution.BaseJsonRpcProcessor.process(BaseJsonRpcProcessor.java:42)
Oct 11 18:15:09 besu-node besu[847]: at app//org.hyperledger.besu.ethereum.api.jsonrpc.execution.TracedJsonRpcProcessor.process(TracedJsonRpcProcessor.java:41)
Oct 11 18:15:09 besu-node besu[847]: at app//org.hyperledger.besu.ethereum.api.jsonrpc.execution.TimedJsonRpcProcessor.process(TimedJsonRpcProcessor.java:45)
Oct 11 18:15:09 besu-node besu[847]: at app//org.hyperledger.besu.ethereum.api.jsonrpc.execution.AuthenticatedJsonRpcProcessor.process(AuthenticatedJsonRpcProcessor.java:51)
Oct 11 18:15:09 besu-node besu[847]: at app//org.hyperledger.besu.ethereum.api.jsonrpc.execution.JsonRpcExecutor.execute(JsonRpcExecutor.java:91)
Oct 11 18:15:09 besu-node besu[847]: at app//org.hyperledger.besu.ethereum.api.handlers.JsonRpcExecutorHandler.lambda$handler$8(JsonRpcExecutorHandler.java:79)
Oct 11 18:15:09 besu-node besu[847]: at app//org.hyperledger.besu.ethereum.api.handlers.JsonRpcExecutorHandler$$Lambda$760/0x0000000840509c40.handle(Unknown Source)
Oct 11 18:15:09 besu-node besu[847]: at app//io.vertx.ext.web.impl.BlockingHandlerDecorator.lambda$handle$0(BlockingHandlerDecorator.java:48)
Oct 11 18:15:09 besu-node besu[847]: at app//io.vertx.ext.web.impl.BlockingHandlerDecorator$$Lambda$1098/0x000000084069c040.handle(Unknown Source)
Oct 11 18:15:09 besu-node besu[847]: at app//io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:159)
Oct 11 18:15:09 besu-node besu[847]: at app//io.vertx.core.impl.ContextImpl$$Lambda$975/0x0000000840612c40.handle(Unknown Source)
Oct 11 18:15:09 besu-node besu[847]: at app//io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:100)
Oct 11 18:15:09 besu-node besu[847]: at app//io.vertx.core.impl.ContextImpl.lambda$executeBlocking$1(ContextImpl.java:157)
Oct 11 18:15:09 besu-node besu[847]: at app//io.vertx.core.impl.ContextImpl$$Lambda$972/0x0000000840613840.run(Unknown Source)
Oct 11 18:15:09 besu-node besu[847]: at java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
Oct 11 18:15:09 besu-node besu[847]: at java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
Oct 11 18:15:09 besu-node besu[847]: at app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
Oct 11 18:15:09 besu-node besu[847]: at java.base@11.0.16/java.lang.Thread.run(Thread.java:829)
I don't think it is related, @gitmewai. The log you showed just tells us that an engine RPC call is taking a very long time, so the worker thread is blocked in CompletableFuture.get, waiting for the async call to complete. I think we should improve that and add a timeout to avoid blocking for a very long time. The only impact of these forks that I was able to see was on context switches and interrupts, which makes sense.
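To illustrate the timeout idea, here is a minimal sketch of a bounded wait on a CompletableFuture using the standard get(long, TimeUnit) overload. The class BoundedWait and the method awaitWithTimeout are hypothetical names for this example; this is not the change that was made in Besu's ExecutionEngineJsonRpcMethod.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only.
final class BoundedWait {

  /** Waits at most timeoutMillis for the result; returns empty on timeout or interruption. */
  static <T> Optional<T> awaitWithTimeout(CompletableFuture<T> future, long timeoutMillis) {
    try {
      return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
    } catch (TimeoutException e) {
      // The async call did not finish in time; the caller decides how to report it.
      return Optional.empty();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt flag
      return Optional.empty();
    } catch (ExecutionException e) {
      throw new IllegalStateException("async call failed", e.getCause());
    }
  }

  public static void main(String[] args) {
    CompletableFuture<String> neverCompletes = new CompletableFuture<>();
    System.out.println(awaitWithTimeout(neverCompletes, 100).isPresent());
    System.out.println(awaitWithTimeout(CompletableFuture.completedFuture("ok"), 100).get());
  }
}
```

With a bound like this, a stuck call would release the worker thread after the timeout instead of holding it for hours as in the log above.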
@gitmewai We found the root cause of these fork processes: they're actually very short-lived threads created in our transactions and workers ThreadPoolExecutors. By doing wall-clock profiling with Async Profiler, I was able to see very short-lived threads with this name pattern: EthScheduler-Transactions-id.
In this code line, we're creating a ThreadPoolExecutor with coreSize=0 and keepAliveTime=0, which means that a new thread is created for each execution. By changing the coreSize parameter, we can see fewer fork processes (threads in this case).
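The effect of these two parameters can be reproduced outside Besu. The sketch below counts how many threads a ThreadPoolExecutor creates while running tasks one at a time. The SynchronousQueue, the unbounded maximum pool size and the names PoolThreadChurn and threadsCreated are assumptions made for this example; they are not taken from Besu's EthScheduler code.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; queue type and max pool size are assumptions, not Besu's settings.
final class PoolThreadChurn {

  /** Runs `tasks` no-op tasks sequentially and returns how many threads the pool created. */
  static int threadsCreated(int corePoolSize, int tasks) {
    AtomicInteger created = new AtomicInteger();
    ThreadFactory countingFactory = runnable -> {
      int id = created.incrementAndGet(); // one increment per thread the pool asks for
      return new Thread(runnable, "sketch-worker-" + id);
    };
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
        corePoolSize, Integer.MAX_VALUE, 0L, TimeUnit.SECONDS,
        new SynchronousQueue<>(), countingFactory);
    try {
      for (int i = 0; i < tasks; i++) {
        executor.submit(() -> { }).get(); // wait for each task before submitting the next
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      executor.shutdown();
    }
    return created.get();
  }

  public static void main(String[] args) {
    System.out.println("coreSize=0: " + threadsCreated(0, 100) + " threads for 100 tasks");
    System.out.println("coreSize=1: " + threadsCreated(1, 100) + " threads for 100 tasks");
  }
}
```

With coreSize=0 and keepAliveTime=0, an idle worker never waits for the next task, so every submission starts a fresh thread. With a core thread kept alive, the count is usually far lower, although the exact number depends on timing.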
Description
High fork rate detected on besu since v22.7.3 (and the same at v22.7.5)
Acceptance Criteria
High fork rate is not detected before besu v22.7.2 or earlier
Steps to Reproduce (Bug)
v22.7.3, v22.7.4 or v22.7.5
v22.7.2 and high fork rate will not be observed
Expected behavior: Fork rate should not be high
Actual behavior: Fork rate is high with the recent releases
Goerli testnet running v22.7.3 since 26-Sep-2022:
Mainnet just upgraded to v22.7.3 on 27-Sep-2022, then fallback to v22.7.2 on the same day, upgraded to v22.7.5 since 7-Oct-2022:
Frequency: Occurs on all releases since v22.7.3
Versions (Add all that apply)
[besu --version] besu/v22.7.5/linux-x86_64/openjdk-java-11
[java -version]
[cat /etc/*release]
[uname -a]
[vmware -v] N/A
[docker version] N/A
AWS r5.xlarge for goerli testnet
AWS c5.4xlarge for mainnet
Smart contract information (If you're reporting an issue arising from deploying or calling a smart contract, please supply related information)
[solc --version] N/A
Additional Information (Add any of the following or anything else that may be relevant)
Besu config:
Teku config:
Teku version: teku/v22.9.1/linux-x86_64/-debian-openjdk64bitservervm-java-11
pstree -glpt
### for besu and teku processes