Closed tclemos closed 3 years ago
Currently, the synchronization module starts from the blocks after the creation of the smart contract and then goes block by block
looking for smart contract logs in order to synchronize the data from L1 to L2.
The idea is to explore better the geth
available resources like GraphQL
or even filter a bunch of blocks at once instead of one by one in order to decrease the time needed to synchronize.
In order to evaluate speed improvements and compare the different executions, we will use these fixed parameters:
Machine size: AWS c4.xlarge OS: Ubuntu 20.04 Hermez node version: 1.7.0-rc1 Geth version: 1.10.7 running at 127.0.0.1 Ethereum Network: Goerli - Internal Testnet Start Block: 4417197 Final Block: 5332486 Config:
[Synchronizer]
SyncLoopInterval = "100ms"
StatsUpdateBlockNumDiffThreshold = 1000
StatsUpdateFrequencyDivider = 1000
We will perform at least 3 full synchronizations and see how long it takes from the first block until the final block specified above.
In order to wipe the data in the node we will use the wipedbs
command, like so:
heznode wipedbs --mode coord --cfg cfg.buidler.toml
in order to execute the node we will use the run
along with sync
mode, like so:
nohup heznode run --mode sync --cfg cfg.buidler.toml > hermez-node.log 2>&1 &
1st Attempt: 5 hours, 34 minutes and 44 seconds
...
2021-08-18T12:21:05Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:554 Syncing... {"block": 4417197, "ethLastBlock": {"Num":5341209,"Timestamp":"2021-08-18T12:20:55Z","Hash":"0xf105712e0ebc266c4e1d662d2e845dee127db7ea8951f01df5eaa44a62af031b"}}
2021-08-18T12:21:05Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:674 Synced block {"syncLastBlockNum": 4417197, "syncBlocksPerc": 0.0001082235855989039, "ethLastBlockNum": 5341209}
...
2021-08-18T17:55:49Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:554 Syncing... {"block": 5332486, "ethLastBlock": {"Num":5342547,"Timestamp":"2021-08-18T17:55:41Z","Hash":"0xf08d47d62d5fc297a065d40b568c0faf822ab55d28c5e89a261dfee33dfd0bf5"}}
2021-08-18T17:55:49Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:674 Synced block {"syncLastBlockNum": 5332486, "syncBlocksPerc": 98.91273689659383, "ethLastBlockNum": 5342547}
...
2nd Attempt: 5 hours, 44 minutes and 47 seconds
...
2021-08-18T18:23:05Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:554 Syncing... {"block": 4417197, "ethLastBlock": {"Num":5342656,"Timestamp":"2021-08-18T18:22:57Z","Hash":"0x1fe6bf009878310a0ba881b4aaaa538e31f4271b738f6ad66131ccbf90be7025"}}
2021-08-18T18:23:05Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:674 Synced block {"syncLastBlockNum": 4417197, "syncBlocksPerc": 0.00010805437296047371, "ethLastBlockNum": 5342656}
...
2021-08-19T00:07:52Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:554 Syncing... {"block": 5332486, "ethLastBlock": {"Num":5344035,"Timestamp":"2021-08-19T00:07:42Z","Hash":"0x0b3923149c5dd3bcf92781d547ab324e6c25d33fd81fc2cab02fbc87bc396904"}}
2021-08-19T00:07:52Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:674 Synced block {"syncLastBlockNum": 5332486, "syncBlocksPerc": 98.75393676787446, "ethLastBlockNum": 5344035}
...
Geth
exposes a GraphQL
interface to provide informations from the Ethereum network
You can access the GraphQL
interface of your node via browser accessing the /graphql/ui
url of your geth node.
if it is not available, make sure you executed Geth
with the flags to enable the GraphQL
interface, like so:
--graphql --graphql.corsdomain '*' --graphql.vhosts '*'
This is the GraphQL
query in order to get information from multiple blocks at once:
{
blocks(from: 4417197, to: 4427197) {
number
hash
parent {
hash
}
timestamp
}
}
If you allow me to give a suggestion to help you @tclemos , this example below shows you can also refine the query results by contract rollup address and the event:
{
blocks(from: 4417196, to: 5346878) {
transactions {
hash
from {
address
}
logs {
account {
address
}
topics
}
}
}
logs(filter: {fromBlock: 4417196, addresses: ["0xf08a226b67a8a9f99ccfcf51c50867bc18a54f53"], topics: [["0x7cae662d4cfa9d9c5575c65f0cc41a858c51ca14ebcbd02a802a62376c3ad238"]]}) {
transaction {
hash
from {
address
}
block {
number
timestamp
}
}
}
}
For sure, also Geth Ethereum Client
has a method to query the logs
by range, like this:
q := ethereum.FilterQuery{
FromBlock: big.NewInt(fromBlock),
ToBlock: big.NewInt(toBlock),
Addresses: []common.Address{
rollUpAddress,
actionAddress,
withdrawalDelayerAddress,
},
}
logs, err := client.FilterLogs(ctx, q)
But I couldn't find a method to query blocks
by range, this is the reason I'm exploring the GraphQL
interface.
I see... we are talking about only GraphQL. BTW, I've made some adjustments. This is a query it's working perfectly and filtering logs of NewForge and ForgeBatch events:
{
logs (filter: {fromBlock: 5307649, topics: [["0xe00040c8a3b0bf905636c26924e90520eafc5003324138236fddee2d34588618", "0x7cae662d4cfa9d9c5575c65f0cc41a858c51ca14ebcbd02a802a62376c3ad238"]]}) {
account {
address
}
transaction {
hash
from {
address
}
}
topics
}
}
You can try it against your GETH node in Goerli and it should work. I've tried it successfully using @mfcastellani 's one.
After our conversation, this is a query to filter the ForgeBatch event:
{
logs(filter: {fromBlock: 4417196, topics:
[
["0x7cae662d4cfa9d9c5575c65f0cc41a858c51ca14ebcbd02a802a62376c3ad238"]
]}) {
account {
address
}
transaction {
hash
from {
address
}
inputData
}
topics
}
}
3rd Attempt: 5 hours, 40 minutes and 9 seconds
...
2021-08-19T11:15:10Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:554 Syncing... {"block": 4417197, "ethLastBlock": {"Num":5346704,"Timestamp":"2021-08-19T11:15:03Z","Hash":"0x5322e08a12457b480b0bfc06ed0e9c4c9283e793b5af1c83fa37f813b1398dce"}}
2021-08-19T11:15:10Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:674 Synced block {"syncLastBlockNum": 4417197, "syncBlocksPerc": 0.00010758379701949848, "ethLastBlockNum": 5346704}
...
2021-08-19T16:55:19Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:554 Syncing... {"block": 5332486, "ethLastBlock": {"Num":5348064,"Timestamp":"2021-08-19T16:55:04Z","Hash":"0x1dcc45166e8e8b64651c8a58b551cd31bd18c8f8c5c1572d548b5f2d75cf4b63"}}
2021-08-19T16:55:19Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:674 Synced block {"syncLastBlockNum": 5332486, "syncBlocksPerc": 98.32650816227435, "ethLastBlockNum": 5348064}
...
Summary:
1st Attempt: 20084 seconds (5 hours, 34 minutes, and 44 seconds) 2nd Attempt: 20687 seconds (5 hours, 44 minutes, and 47 seconds) 3rd Attempt: 20409 seconds (5 hours, 40 minutes, and 9 seconds)
Average: 20393 seconds (5 hours, 39 minutes, and 53 seconds)
IMPORTANT: pending data validation, we just synchronized the blocks but we didn't checked the integrity of the data synchronized
Since the block
5332486
does not have Hermez Smart Contract events, we will be using the closest blocks before and after with events.
Last block before with events: 5308649
Ref block: 5332486
First block after with events: 5348449
Instead of running through all the blocks one by one, we try to find in the next 10000 blocks, the next block in the sequence with events in the RollUp, Auction, and Withdrawal Delayer contracts in order to sync only blocks that contain significant data to be synchronized. Once blocks with events are found, they are stored in memory until we synchronize them all and then start over the next cycle to find the next ones until the current block of the network chain. Only blocks with events in the contracts are synchronized, all the other blocks are ignored since there is no data to be synchronized.
It's possible to see a slight increase in the speed of the synchronization process. We downed the amount of time from an average of 20393 seconds to 17351 seconds, which in short means we increased the speed in 15%.
Since Hermez network is not running at its full capacity, we have a lot of ethereum blocks we can ignore, the blocks ignored tend to reduce over time with the adoption of the network.
...
2021-08-20T16:48:38Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:561 Syncing... {"block": 4417197, "ethLastBlock": {"Num":5353797,"Timestamp":"2021-08-20T16:48:21Z","Hash":"0x76a2044a3315a9f7be3f5df0c5d80e3a2892d2518b237e6362c4e730652eac6c"}}
2021-08-20T16:48:38Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:681 Synced block {"syncLastBlockNum": 4417197, "syncBlocksPerc": 0.00010676905106870481, "ethLastBlockNum": 5353797}
...
2021-08-20T21:37:49Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:561 Syncing... {"block": 5308649, "ethLastBlock": {"Num":5354712,"Timestamp":"2021-08-20T20:37:06Z","Hash":"0xbf34567c33deb71428b76f71c1b936f8e943ba00fd68a8f5f649cb021d906cac"}}
2021-08-20T21:37:49Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:681 Synced block {"syncLastBlockNum": 5308649, "syncBlocksPerc": 95.08669718703467, "ethLastBlockNum": 5354712}
...
2021-08-20T21:37:49Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:561 Syncing... {"block": 5348449, "ethLastBlock": {"Num":5354712,"Timestamp":"2021-08-20T20:37:06Z","Hash":"0xbf34567c33deb71428b76f71c1b936f8e943ba00fd68a8f5f649cb021d906cac"}}
2021-08-20T21:37:49Z ^[[35mDEBUG^[[0m synchronizer/synchronizer.go:681 Synced block {"syncLastBlockNum": 5348449, "syncBlocksPerc": 99.33195806791564, "ethLastBlockNum": 5354712}
...
Continuing to explore the new strategy to sync blocks, we decided to move to the next step in order to guarantee the data synchronized is valid, which is forge a new batch
.
The first attempt to forge a batch with this new version has failed, the node was not "triggering" the forging process. The reason behind this problem is that the node was developed to sync the whole chain without skipping any block, considering this, there is a check to define whether the node is synced
or out of sync
and then execute different actions accordingly to this check. Synced
means, the last saved block
in the node is equal to the last eth block
, and since we are skipping the blocks without events if the last block in the chain does not contain an event, it was causing the node to be out of sync
even though it was synced
Also, when a new block is synchronized, the synchronizer
sends a message to the coordinator
in order to update the node state
, since we are skipping all the blocks without SC events in L1
, if the node was restarted and no event happened since it was started, the state
will not be updated with the next forgers
data, because there is no block being synchronized
.
In order to fix this, we must synchronize the last block in the chain every time we synchronize and there is no event between the last block saved and the last eth block in the chain, it works like this:
In summary, when synchronizing a lot of blocks, like starting a new node or starting the node after a period without sync, it is smart enough to skip all the unnecessary blocks in order to save time, after the whole chain is synchronized, it will synchronize all the next blocks to keep the node in Synced
state and allow batches to be forged.
This is the batch I have forged with the version implementing the explanation above:
PR merged #1113
Rationale
https://github.com/hermeznetwork/hermez-node/blob/3fc5b4c531394e9d6a9975e2fa798bbd5029e6a4/synchronizer/synchronizer.go#L97-L100
https://github.com/hermeznetwork/hermez-node/blob/3fc5b4c531394e9d6a9975e2fa798bbd5029e6a4/synchronizer/synchronizer.go#L516
Implementation
Improve the way
StatsHolder
andSynchronizer
handle concurrent read and write operations