deso-protocol / dips

DeSo Improvement Proposals for improving the DeSo network
MIT License

Hyper Sync #216

Closed AeonSw4n closed 2 years ago

AeonSw4n commented 2 years ago

Growth is inevitable.

tijno commented 2 years ago

Thanks for posting this @AeonSw4n - huge amount of work clearly.

Will need some time to digest and review - and most of it is beyond my skill set to properly assess.

Initial key points that I think are not addressed in the DIP that should be:

  1. This change has significant potential security and stability impacts, as you highlight. The DESO dev/node community is small and I suspect not many people will have the experience and skills to review and assess these risks. What level of independent and expert review outside of DESO will be involved?

  2. There are several other DIPs and potential changes still in progress besides this one. How are these impacted? Specifically thinking about:

     A. Balance model DIP
     B. Proof of Stake
     C. Potential Coin Split

  3. A week of Testnet QA and 2 weeks' notice before go-live could be very tight, based on our experience in the last HF.

  4. I think it's essential this PR includes clear and up-to-date documentation and examples before QA on Testnet so the community can help and assist with testing.

arhebbar commented 2 years ago

Thank you for all your work team. I've posted my comments on chain as well. Huge respect for all your work.

https://diamondapp.com/posts/c29748640a0058f4fdb3aa62601d7ac9f4667720cd4ee6a940e3f347d7e16a14?feedTab=Following

Quick summary here:

A few suggestions which hopefully you already have on your radar:

AeonSw4n commented 2 years ago

Thank you @tijno @arhebbar for the great feedback! I appreciate the time you took to review my DIP and your insightful comments, especially considering the complexity of this change. Now onto your points:

  1. This change has significant potential security and stability impacts as you highlight. The DESO dev/node community is small and i suspect not many people will have the experience and skills to review & assess these risks. What level of independent and expert review outside of DESO will be involved ?

Identify Breaking Test-Cases (e.g. What happens if a Node changes its parameter before / after / during the first sync process, What happens if the Node doesn't complete the HyperSync and decides to change back to No-Hyper-Sync, etc.)

The main exploit I'm concerned about with this change is that a Hyper Sync node could somehow arrive at an incorrect state. This could be caused either by malicious nodes sending wrong messages or by a software bug. EllipticSum computation prevents this in theory: if the state we've received is incorrect, then the state hash would differ. Software bugs are obviously harder to spot, but I think the new node testing framework that I'm designing will eliminate all high-severity bugs and most, if not all, low-severity bugs. We're not planning on auditing the code externally, but I have discussed EllipticSum with some of my academia contacts to confirm its security.

Re other exploits that would cause downtime or unexpected behavior: I wrote the code so that we take "health checks" of the snapshot, which automatically detect if something went wrong and stop Hyper Sync computation. Tweaking parameters mid-execution is generally not good practice, but it's a good edge case and I'll add checks for this. The solution to prevent DB corruption is to clear the database once this behavior is detected.
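To illustrate the state-hash idea for readers following along, here is a minimal Python sketch of an order-independent checksum over key-value state. It is not the EllipticSum implementation: it swaps in plain modular addition of SHA-512 digests as a stand-in for the elliptic-curve construction, which makes it unsuitable as a secure checksum, and the names `StateChecksum`, `entry_digest`, `checksum_of` and the constant `MODULUS` are all hypothetical. It only shows why a node that receives a wrong entry ends up with a different checksum regardless of the order in which entries arrive.

```python
import hashlib
from typing import Iterable, Tuple

# Hypothetical modulus for this sketch only (the Mersenne prime 2**521 - 1).
# The real checksum is assumed to use elliptic-curve arithmetic instead.
MODULUS = 2**521 - 1


def entry_digest(key: bytes, value: bytes) -> int:
    """Map one key-value pair to an integer in [0, MODULUS)."""
    # Length-prefix the key so (b"ab", b"c") and (b"a", b"bc") differ.
    data = len(key).to_bytes(8, "big") + key + value
    return int.from_bytes(hashlib.sha512(data).digest(), "big") % MODULUS


class StateChecksum:
    """Running checksum that does not depend on insertion order."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, key: bytes, value: bytes) -> None:
        self.total = (self.total + entry_digest(key, value)) % MODULUS

    def remove(self, key: bytes, value: bytes) -> None:
        self.total = (self.total - entry_digest(key, value)) % MODULUS

    def digest(self) -> str:
        return format(self.total, "x")


def checksum_of(entries: Iterable[Tuple[bytes, bytes]]) -> str:
    """Checksum of a whole collection of key-value entries."""
    checksum = StateChecksum()
    for key, value in entries:
        checksum.add(key, value)
    return checksum.digest()
```

A syncing node in this sketch would compare `checksum_of(received_entries)` against the checksum it expects for the snapshot and discard the downloaded state on a mismatch. Because entries can be added and removed incrementally, the checksum can be kept up to date as the state changes without rehashing everything.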

  2. There are several other DIPs and potential changes still in progress besides this one - how are these impacted? Specifically thinking about:

     A. Balance model DIP
     B. Proof of Stake
     C. Potential Coin Split

Blockchain transactions are not impacted, so it won't have any effect on A and C. I consider Proof of Stake an umbrella of core changes that primarily focus on moving away from PoW but also improve the speed and scalability of our protocol, with changes like Hyper Sync and other exciting updates to come. To tease some of these incoming changes: you can expect some blazing fast TPS as we'll have parallel transaction validation :).

  3. A week of Testnet QA and 2 weeks notice before golive - based on our experience in the last HF - could be very tight

I'm optimistic we'll do better this time!

  4. I think its essential this PR includes clear and uptodate documentation and examples before QA on Testnet so the community can help and assist with testing.

Agree with Tijn on need for more testing and involving community in it explicitly even if you have to clone their current servers and work with them to do it to build greater confidence in the dev community on DESO. Documentation might of course help reduce and maybe even avoid it so that the community can test on their own.

Yup, great point! We'll update the docs. Launching a Hyper Sync node is as simple as adding a --hyper-sync=true flag to the node configuration; the node will take care of the rest. I expect that the dev community will generate a good amount of entropy to test the edge cases.

Will there be enough nodes that will sync (I am assuming these will be all core managed for now). So, may not be an issue. But, is the code set to handle it.

Yup! If there are no Hyper Sync peers we'll do a traditional block sync.
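The fallback described above can be sketched as a small decision function. This is an illustrative Python sketch under stated assumptions, not the node's actual code: `Peer`, its `supports_hyper_sync` field, `SyncMode` and `choose_sync_mode` are hypothetical names, and the only behavior taken from the thread is that a node with Hyper Sync enabled falls back to a traditional block sync when no Hyper Sync peers are available.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class SyncMode(Enum):
    HYPER_SYNC = "hyper-sync"
    BLOCK_SYNC = "block-sync"


@dataclass
class Peer:
    address: str
    supports_hyper_sync: bool  # hypothetical capability field


def choose_sync_mode(hyper_sync_enabled: bool, peers: List[Peer]) -> SyncMode:
    """Pick Hyper Sync only if it is enabled and some peer can serve it."""
    if hyper_sync_enabled and any(p.supports_hyper_sync for p in peers):
        return SyncMode.HYPER_SYNC
    # No capable peers (or the flag is off): traditional block sync.
    return SyncMode.BLOCK_SYNC
```

For example, a node started with Hyper Sync enabled but connected only to peers that cannot serve a snapshot would get `SyncMode.BLOCK_SYNC` from this function.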
