status-im / nimbus-eth2

Nim implementation of the Ethereum Beacon Chain
https://nimbus.guide
Other
543 stars 233 forks source link

BN: Disable genesis sync via long-range-sync argument. #6361

Closed cheatfate closed 4 months ago

tersec commented 5 months ago

One quality-of-implementation thing here is that while it's true that to catch the general case, it has to live in the sync manager, because of the limitation of TNS not to work with existing databases, this means that this currently will, in a new --data-dir:

From a UX flow perspective this is fine in isolation for now, but the trouble is that the next step is made harder by requiring the user to manually delete the database it pointlessly created which now does nothing but hinder the usage of TNS. Once light client sync exists, and if/when we can get an Altair or newer finalized state always available (already true for Sepolia and Holesky, not true for mainnet yet), this won't be a concern, but it's currently sort of UX jank.

It already kind of exists now, and this PR doesn't really make it worse, but it's also probably the right place to fix it, because it's already the PR adding the weak subjectivity period detection. The additional check would be at the beginning, checking as a special case if slot 0 (genesis) is within the WSP of the wall slot, the same way as it currently checks whether the head state is. The difference is that the head state conceptually can't exist until the database does, while the check can be performed specially for GENESIS_SLOT even before the database is created.

github-actions[bot] commented 5 months ago

Unit Test Results

         9 files  ±0    1 328 suites  ±0   26m 46s :stopwatch: - 6m 26s   4 998 tests ±0    4 650 :heavy_check_mark: ±0  348 :zzz: ±0  0 :x: ±0  20 850 runs  ±0  20 446 :heavy_check_mark: ±0  404 :zzz: ±0  0 :x: ±0 

Results for commit 3877e990. ± Comparison against base commit 597f4731.

:recycle: This comment has been updated with latest results.

tersec commented 5 months ago

References https://github.com/status-im/nimbus-eth2/issues/6306

tersec commented 5 months ago

It already kind of exists now, and this PR doesn't really make it worse, but it's also probably the right place to fix it, because it's already the PR adding the weak subjectivity period detection. The additional check would be at the beginning, checking as a special case if slot 0 (genesis) is within the WSP of the wall slot, the same way as it currently checks whether the head state is. The difference is that the head state conceptually can't exist until the database does, while the check can be performed specially for GENESIS_SLOT even before the database is created.

Done in https://github.com/status-im/nimbus-eth2/pull/6361/commits/228a3b2ea46fa0c3c2b5e80bf8b0bbbe74b3db8a

arnetheduck commented 4 months ago

So with long-range-sync=light, the default mode, we should be performing a light client sync automatically if we have sync committees available - does that happen now?

tersec commented 4 months ago

So with long-range-sync=light, the default mode, we should be performing a light client sync automatically if we have sync committees available - does that happen now?

No. Light client sync does not actually exist in Nimbus right now in a way integrated with the rest of sync.

There are some options of how to approach, including but not limited to

I don't think it makes sense to remove it.

Part of the point from my perspective is, right now, regardless of light client sync, we repeatedly have people come to help channels having inadvertently triggered genesis sync because it's default. Putting WSP concerns aside even, just a end-user UX it's bad, because people mostly don't want this, but they show up a day or two after genesis sync has started. It's ok to allow genesis sync, and I don't actually think the trusted node sync idea of separate commands is that bad, but the combination of:

predictably, repeatedly, reliably creates scenarios where someone tries to do a TNS, falls into a genesis sync, and just thinks things are a bit slower than they should be.

So, it's better regardless to make sure that genesis sync is what people want. The people who want archive modes, and therefore currently need genesis sync (though in theory one could adjust backfill to get archive mode and 100% deprecate this genesis sync, this discussion has been had, so, sure) to create archive nodes already have to specify custom settings for this.