Closed egonspace closed 5 months ago
The issue could be solved by calling refreshPending() right before or after the node tries to acquire the mining token.
Currently, refreshPending() is executed only when the node fails to acquire the mining token. The node that does acquire the token updates its pending state only after the block is made, which includes a certain amount of sleep time (to adjust blockCreationTime).
So if refreshPending() is executed whether or not the mining token is acquired, the pending state will reflect the current block as soon as it is propagated.
// before
func (w *worker) commitWork(interrupt *int32, noempty bool, timestamp int64) {
	...
	if !wemixminer.IsPoW() {
		parent := w.chain.CurrentBlock()
		height := new(big.Int).Add(parent.Number(), common.Big1)
		ok, err := wemixminer.AcquireMiningToken(height, parent.Hash())
		if ok {
			log.Debug("Mining Token, successful", "height", height, "parent-hash", parent.Hash())
		} else {
			log.Debug("Mining Token, failure", "height", height, "parent-hash", parent.Hash(), "error", err)
		}
		if !ok {
			w.refreshPending(true)
			return
		}
	}
	...
// after
func (w *worker) commitWork(interrupt *int32, noempty bool, timestamp int64) {
	...
	if !wemixminer.IsPoW() {
		parent := w.chain.CurrentBlock()
		height := new(big.Int).Add(parent.Number(), common.Big1)
		ok, err := wemixminer.AcquireMiningToken(height, parent.Hash())
		w.refreshPending(true)
		if ok {
			log.Debug("Mining Token, successful", "height", height, "parent-hash", parent.Hash())
		} else {
			log.Debug("Mining Token, failure", "height", height, "parent-hash", parent.Hash(), "error", err)
		}
		if !ok {
			return
		}
	}
	...
This still needs to be checked for side effects, though.
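To make the difference between the two flows easy to check, here is a minimal toy model. It is only a sketch: `toyWorker`, `commitWorkBefore`, `commitWorkAfter`, and `refreshCount` are hypothetical names for illustration, not go-wemix code, and token acquisition is reduced to a boolean.

```go
package main

import "fmt"

// toyWorker is a hypothetical stand-in for the miner's worker. It only
// counts how often the pending state is refreshed.
type toyWorker struct {
	refreshes int
}

func (w *toyWorker) refreshPending() { w.refreshes++ }

// commitWorkBefore mirrors the "before" flow: the pending state is refreshed
// only when the mining token was not acquired. It reports whether the node
// would go on to build a block.
func (w *toyWorker) commitWorkBefore(tokenAcquired bool) bool {
	if !tokenAcquired {
		w.refreshPending()
		return false
	}
	return true
}

// commitWorkAfter mirrors the "after" flow: the pending state is refreshed
// whether or not the mining token was acquired.
func (w *toyWorker) commitWorkAfter(tokenAcquired bool) bool {
	w.refreshPending()
	return tokenAcquired
}

// refreshCount runs one round on a fresh worker and returns the number of
// pending-state refreshes that happened.
func refreshCount(commit func(*toyWorker, bool) bool, tokenAcquired bool) int {
	w := &toyWorker{}
	commit(w, tokenAcquired)
	return w.refreshes
}

func main() {
	for _, acquired := range []bool{false, true} {
		fmt.Printf("token acquired=%v before=%d after=%d\n",
			acquired,
			refreshCount((*toyWorker).commitWorkBefore, acquired),
			refreshCount((*toyWorker).commitWorkAfter, acquired))
	}
}
```

In this model the only case that changes is the one where the token is acquired: the "before" flow performs no refresh there, while the "after" flow performs one.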
In Ethereum v1.10.15 using PoW, Engine.FinalizeAndAssemble() (called in w.commit()) comes after w.updateSnapshot().
In Ethereum v1.10.16, Engine.FinalizeAndAssemble() comes before w.updateSnapshot().
The change was introduced in commit https://github.com/ethereum/go-ethereum/commit/78636ee56856ef50299183dd04d02a3e7f555cbc.
I could not find a commit message or discussion explaining why the order was changed.
It seems that the publicly exposed snapshot is not considered critical for synchronization.
As a result, there is a short period during which it is out of sync.
In Wemix, however, (*worker).commitTransactionsEx() sleeps until env.till, so the out-of-sync period becomes noticeable.
The waiting must therefore be done either before or after both the commit and the snapshot update.
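The ordering constraint can be illustrated with a logical timeline. This is a sketch under stated assumptions: the step names, the tick durations, and the idea that the sleep happens right after the transactions are executed are all simplifications chosen for illustration, not measurements from go-wemix.

```go
package main

import "fmt"

// step is one phase of block production in this toy model.
type step struct {
	name  string
	ticks int // logical duration, not real time
}

// snapshotLag returns the number of ticks between the end of transaction
// execution and the snapshot update, or -1 if the sequence does not contain
// both steps in that order.
func snapshotLag(steps []step) int {
	now, executedAt := 0, -1
	for _, s := range steps {
		now += s.ticks
		switch s.name {
		case "executeTxs":
			executedAt = now
		case "updateSnapshot":
			if executedAt < 0 {
				return -1
			}
			return now - executedAt
		}
	}
	return -1
}

// sleepBetween models sleeping after the transactions are executed but
// before the commit and the snapshot update.
func sleepBetween(sleep int) []step {
	return []step{{"executeTxs", 1}, {"sleep", sleep}, {"finalize", 1}, {"updateSnapshot", 0}}
}

// sleepAfter models sleeping only after both the commit and the snapshot
// update are done.
func sleepAfter(sleep int) []step {
	return []step{{"executeTxs", 1}, {"finalize", 1}, {"updateSnapshot", 0}, {"sleep", sleep}}
}

func main() {
	fmt.Println("lag with sleep before snapshot update:", snapshotLag(sleepBetween(10)))
	fmt.Println("lag with sleep after snapshot update: ", snapshotLag(sleepAfter(10)))
}
```

With the sleep placed between execution and the snapshot update, the lag grows with the sleep duration. With the sleep moved after both steps, the lag is just the cost of finalizing.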
For reference, this issue does not occur on the mainnet. This is because miners on the mainnet do not receive transactions directly.
There are two ways to resolve this issue:
1. Avoid sleeping in worker.commitTransactionEx() and instead perform updateSnapshot() in worker.commitEx(), followed by sleep(blockInterval).
2. Call refreshPending() before sleeping in worker.commitTransactionEx().
Option 1 is more desirable because it reflects the transactions of the currently created block in the pending state. In option 2, since the transaction receipts cannot yet be obtained, it is not possible to reflect the currently processed transactions in the pending state, and the only option is to refresh with the state of the latest block.
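The difference between the two options can be sketched with a toy state that tracks only account nonces. Everything here is hypothetical: `nonces`, `pendingOption1`, and `pendingOption2` are illustrative names, and real pending state contains far more than nonces.

```go
package main

import "fmt"

// nonces is a toy world state: account name -> next expected nonce.
type nonces map[string]uint64

// clone copies the state so the caller's map is never mutated.
func (n nonces) clone() nonces {
	out := make(nonces, len(n))
	for k, v := range n {
		out[k] = v
	}
	return out
}

// pendingOption1 models option 1: the snapshot is taken after the
// transactions of the block being created have been applied, so each
// sender's nonce has already advanced.
func pendingOption1(latest nonces, blockSenders []string) nonces {
	p := latest.clone()
	for _, sender := range blockSenders {
		p[sender]++
	}
	return p
}

// pendingOption2 models option 2: receipts are not available yet, so the
// pending state can only be refreshed from the latest block.
func pendingOption2(latest nonces, _ []string) nonces {
	return latest.clone()
}

func main() {
	latest := nonces{"alice": 5}
	senders := []string{"alice"} // one tx from alice in the block being created

	fmt.Println("option 1 pending nonce:", pendingOption1(latest, senders)["alice"])
	fmt.Println("option 2 pending nonce:", pendingOption2(latest, senders)["alice"])
}
```

Under option 1 the pending nonce already accounts for the transaction in the block being built. Under option 2 it still matches the latest block until that block is finished.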
After investigating this issue, I hope to solve the following three problems.
1. updateSnapshot() before going to sleep during block interval
2. latestBlock, not a pendingBlock, in eth.estimateGas and eth.createAccessList
3. newWork in newWorkLoopEx() if it is a non-mining node
The issue is fixed in #111. Closing the issue.
System information
All go-wemix versions.
Expected behaviour
Some transactions sent to a mining node fail because the tx processor does not refer to the latest state, while other call requests that only query information are processed successfully.
The difference between the tx request and the information inquiry request is that the tx request gets the state through the pending block, and the information inquiry request gets the state through the latest block.
When we analyzed the cause, we found that the pending block was being updated slightly late. In Wemix, the logic for updating the pending block differs slightly between the EN node and the mining node.
Here is the debugging log.
Based on the above log, the following logic sequence can be found.
mining token (not mining in this turn)
mining token.
Expected behavior
A mining node should update its pending block as soon as possible when it receives a new block.
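The reported symptom and the expected behavior can be reproduced in a very small model. This is a sketch only: the `node` type and its methods are hypothetical, and the state is reduced to a single account balance seen through the latest and pending views.

```go
package main

import "fmt"

// node is a hypothetical, heavily simplified mining node. Each field holds
// the balance of one account as seen through that block tag.
type node struct {
	latest  uint64 // view used by read-only calls
	pending uint64 // view used when processing incoming transactions
}

// importBlock applies a newly received block. When refresh is true, the
// pending view is updated right away, which is the expected behavior.
func (n *node) importBlock(newBalance uint64, refresh bool) {
	n.latest = newBalance
	if refresh {
		n.pending = n.latest
	}
}

// call answers a read-only balance query against the latest view.
func (n *node) call() uint64 { return n.latest }

// sendTx validates a transfer of amount against the pending view.
func (n *node) sendTx(amount uint64) error {
	if amount > n.pending {
		return fmt.Errorf("insufficient funds: have %d, want %d", n.pending, amount)
	}
	return nil
}

func main() {
	stale := &node{}
	stale.importBlock(100, false) // pending view lags behind the new block
	fmt.Println("stale: call =", stale.call(), "tx error =", stale.sendTx(50))

	fresh := &node{}
	fresh.importBlock(100, true) // pending view refreshed on import
	fmt.Println("fresh: call =", fresh.call(), "tx error =", fresh.sendTx(50))
}
```

In the stale case the read-only call already sees the funded account while the transaction is rejected. Refreshing the pending view on import removes that mismatch.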
Steps to reproduce the behaviour
reproduction conditions:
Backtrace
When submitting logs: please submit them as text and not screenshots.