w3f / Grants-Program

Web3 Foundation Grants Program
https://grants.web3.foundation/
Apache License 2.0

Amendment for Subcoin milestone3 #2376

Closed liuchengxu closed 2 months ago

liuchengxu commented 2 months ago

In light of recent developments, it has become evident that fully syncing to the tip of the Bitcoin network, and enabling new nodes to fast sync to the latest Bitcoin state, is more challenging than initially anticipated. The cause is the huge state of the UTXO set (over 12GiB, much larger than that of any existing Substrate chain, to my knowledge). We'll also discuss the fast sync challenge in the article delivered at this milestone. As a result, I would like to propose adjusting the delivery goal for this milestone.

The most significant known blocker is https://github.com/paritytech/polkadot-sdk/issues/4. Other underlying issues may also contribute to the difficulty. Recent experiments have shown that fast sync from around block height 580,000 is currently impractical: it succeeds only on machines with 128GiB of memory (https://github.com/paritytech/polkadot-sdk/issues/5053#issuecomment-2296043492), which is out of reach for most users. Nevertheless, we have successfully demonstrated in a prototype implementation that decentralized fast sync is possible.

While syncing to the Bitcoin network's tip remains a future target, addressing the existing technical challenges will require substantial R&D efforts. We remain committed to exploring potential solutions, including architectural changes and contributing to resolving issue https://github.com/paritytech/polkadot-sdk/issues/4,

PieWol commented 2 months ago

Hey @liuchengxu, thanks for reaching out about this issue. Simply out of interest, have you tried to allow for a huge swap file so that this might be successful even on machines without 128gb of RAM?

liuchengxu commented 2 months ago

have you tried to allow for a huge swap file so that this might be successful even on machines without 128gb of RAM?

This is interesting. After increasing the swap to 128GiB locally, I successfully completed a fast sync from a node at height 612,272. Peak memory usage was around 128GiB of RAM plus 50GiB of swap, and importing the state took about 50 minutes on my machine. However, after importing the synced state, the node took an unreasonably long time to restart. It appears that Substrate struggles to handle state at this scale.

To conclude, a fast sync from height ~580,000 might still be possible with 128GiB of RAM and just 2GiB of swap. But when syncing from height 612,272, memory consumption rose to 128GiB of RAM plus 50GiB of swap, which is unsustainable considering there are still over 240,000 blocks left to sync. Extrapolating from these numbers, syncing the latest Bitcoin state might require 128GiB of RAM plus roughly 400GiB of swap, which is simply unmanageable.
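
For readers who want to check the extrapolation, here is a rough sketch of the arithmetic. It is not necessarily how the estimate above was derived. It assumes swap usage grows linearly with block height, that RAM stays pinned at 128GiB, that "~580,000" is exactly 580,000, and that "over 240,000" remaining blocks is exactly 240,000. The function name and parameters are hypothetical, chosen only for illustration.

```python
def extrapolate_swap_gib(h1, swap1_gib, h2, swap2_gib, remaining_blocks):
    """Linearly extrapolate swap usage from two (height, swap) observations.

    Assumption: swap grows at a constant rate per block, which is a
    simplification; real UTXO set growth is not linear in block height.
    """
    growth_per_block = (swap2_gib - swap1_gib) / (h2 - h1)
    return swap2_gib + growth_per_block * remaining_blocks


# Observations reported in this thread (heights rounded as noted above).
estimate = extrapolate_swap_gib(
    h1=580_000, swap1_gib=2,      # ~580,000: 128GiB RAM + 2GiB swap
    h2=612_272, swap2_gib=50,     # 612,272: 128GiB RAM + 50GiB swap
    remaining_blocks=240_000,     # "over 240,000 blocks left to sync"
)
print(f"estimated swap at tip: ~{estimate:.0f} GiB")
```

Under these assumptions the result lands slightly above 400GiB of swap, in line with the rough figure quoted above. With only two data points it is an order-of-magnitude check, not a prediction.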