Open OlegMazurov opened 3 weeks ago
Stale events cannot be avoided. What we can do is provide these events to the app which can choose to resubmit them.
I agree, this is something we should fix. It is covered in the design for consensus nodes proposal. I don't know the timeline for the fix, but we definitely should do this.
@rbair23, I have marked this ticket as high priority in the Platform Backlog project
Description
With additional logging added to DefaultStaleEventDetector.addConsensusRound() to report stale events, I observe:
This means that the event went stale 9 seconds after it was created. There were 109 transactions in the event, mostly user transactions. All of those transactions were silently dropped (only system transactions are resubmitted). However, the transaction records remained cached, and TransactionReceiptQueries would return OK until transaction expiration, for another 171 seconds. Finally, the client gets RECEIPT_NOT_FOUND. It has to check the status of the transaction with the mirror node, only to find out that it has not been executed, so the client needs to resubmit the transaction. All of this creates a poor user experience. It also affects performance testing, as pending transactions decrease throughput.
Steps to reproduce
Stale events and their effect were observed in a performance network (engnet1) when running the NftTransferLoadTest benchmark at ~10K TPS.
Additional context
No response
Hedera network
other
Version
v0.54.0-SNAPSHOT
Operating system
Linux
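The suggestion from the discussion, handing stale events to the app so that it can choose to resubmit them, could look roughly like the sketch below. It is only an illustration and not the platform's actual API: Tx, StaleEventListener, and selectForResubmission are hypothetical names, the timestamp is arbitrary, and the 180-second validity window is an assumption derived from the 9 s and 171 s figures in the description.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

public class StaleEventResubmitSketch {

    /** Hypothetical stand-in for a transaction carried by an event. */
    record Tx(String id, boolean isSystem, Instant validStart, Duration validDuration) {
        /** True once the validity window has fully elapsed. */
        boolean isExpiredAt(Instant now) {
            return !now.isBefore(validStart.plus(validDuration));
        }
    }

    /** Hypothetical callback through which the platform would report a stale event to the app. */
    interface StaleEventListener {
        void onStaleEvent(List<Tx> transactions, Instant detectedAt);
    }

    /**
     * Selects the transactions of a stale event that the app may want to resubmit:
     * user transactions that have not yet expired. System transactions are skipped
     * because, per the description, they are already resubmitted.
     */
    static List<Tx> selectForResubmission(List<Tx> staleTransactions, Instant now) {
        List<Tx> result = new ArrayList<>();
        for (Tx tx : staleTransactions) {
            if (tx.isSystem() || tx.isExpiredAt(now)) {
                continue;
            }
            result.add(tx);
        }
        return result;
    }

    public static void main(String[] args) {
        Instant created = Instant.parse("2024-01-01T00:00:00Z"); // arbitrary example time
        Duration window = Duration.ofSeconds(180);               // assumed: 9 s + 171 s
        List<Tx> staleEvent = List.of(
                new Tx("user-1", false, created, window),
                new Tx("system-1", true, created, window));

        // The app decides what to do with the stale event; here it just prints its choice.
        StaleEventListener listener = (txs, at) ->
                selectForResubmission(txs, at)
                        .forEach(tx -> System.out.println("resubmit " + tx.id()));

        // Simulate the platform reporting staleness 9 seconds after creation.
        listener.onStaleEvent(staleEvent, created.plusSeconds(9));
    }
}
```

With such a callback, a user transaction from a stale event would be picked up again shortly after staleness is detected, instead of the client waiting for the receipt to expire and then resubmitting on its own.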