filecoin-project / devgrants


RFP Application - chain-co #770

Closed Fatman13 closed 1 year ago

Fatman13 commented 2 years ago

RFP Proposal: A high availability solution for Filecoin Daemon

Name of Project: Chain-co

Proposal Category: core-dev

Proposer: Force community Team

(Optional) Technical Sponsor: N/A

Do you agree to open source all work you do on behalf of this RFP and dual-license under MIT and APACHE2 licenses?: Yes

Project Description

At the time of writing this proposal, the Filecoin network has reached ~17 EiB of storage power, growing by ~20 to ~30 PiB per day. Given that more than 400 nodes have 10+ PiB of storage power, with the top nodes having 100+ PiB, a high availability solution for the Filecoin daemon would be crucial to the stability and overall health of the network. Imagine that the Filecoin daemon node for a 100 PiB storage system fails to stay in sync with the network's chain height, or suffers a severe hardware failure for an extended period of time: its storage power would be wiped out, causing turbulence across the network.

To mitigate the risk of Filecoin daemon failure and keep the network sane and coherent, we propose a chain coordinator, chain-co for short, that coordinates a collection of redundant daemon nodes and lets the storage system use the daemon with the heaviest chain (the greatest chain weight). This paves the way for a stronger and more resilient Filecoin network.
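The selection rule described above can be sketched as follows. This is a minimal illustration, not chain-co's actual code: the `DaemonHead` record, its fields, and the `pick_heaviest` helper are hypothetical names, and the sketch assumes each daemon's chain weight and health status have already been collected by some other means.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DaemonHead:
    """Hypothetical snapshot of one redundant daemon's chain head."""
    name: str
    height: int           # epoch of the daemon's current head
    weight: int           # chain weight reported for that head
    healthy: bool = True  # False if the daemon is unreachable or failing


def pick_heaviest(heads: Iterable[DaemonHead]) -> Optional[DaemonHead]:
    """Return the healthy daemon whose head has the greatest chain weight.

    Ties on weight are broken by height. Returns None when no daemon
    is healthy, so the caller can decide how to fail.
    """
    candidates = [h for h in heads if h.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda h: (h.weight, h.height))


if __name__ == "__main__":
    pool = [
        DaemonHead("daemon-a", height=100, weight=5000),
        DaemonHead("daemon-b", height=101, weight=5200),
        DaemonHead("daemon-c", height=102, weight=9999, healthy=False),
    ]
    best = pick_heaviest(pool)
    print(best.name if best else "no healthy daemon")
```

Filtering on health before comparing weights keeps an unreachable daemon from being selected even if its last reported weight was the highest.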

Value

The benefits of chain-co to the long-term health of Filecoin infrastructure are profound. Storage providers can rest assured that their storage systems no longer have a single point of failure. Ecosystem partners can build their applications on a more reliable infrastructure. End storage users can enjoy a more consistent storage service without denials. Imagine an ideal world in which each of the four Filecoin implementations represents 25% of the network and chain-co manages a collection of redundant daemons drawn from all four implementations. If one implementation fails to sync, chain-co could help make sure that the Filecoin network does not suffer a denial of service.

On the other hand, not having chain-co infrastructure would deprive storage providers, ecosystem partners and end storage users of the exact benefits described above. It also represents a systemic risk: the whole Filecoin network could come to a halt if one implementation of Filecoin fails to sync.

Deliverables

The chain-co solution will run between a pool of redundant Filecoin daemons of different flavors and a proxy agent. Essentially, the complete chain-co solution replaces the single-daemon setup to make sure that your storage systems have access to all the critical RPC APIs you need for PoRep and PoSt.
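One way to picture the proxy side of this arrangement is a call that falls through a preference-ordered list of daemons until one answers. The sketch below is only an assumption about how such forwarding could work: `call_with_failover`, `AllBackendsFailed`, and the backend callables are hypothetical, and real daemons would be reached over their RPC endpoints rather than as in-process functions.

```python
from typing import Any, Callable, Sequence, Tuple

# A backend is modelled as a callable taking an RPC method name and params.
# In a real deployment this would wrap a request to a daemon's RPC endpoint.
Backend = Callable[[str, tuple], Any]


class AllBackendsFailed(Exception):
    """Raised when every daemon in the pool failed to serve the request."""


def call_with_failover(
    backends: Sequence[Tuple[str, Backend]],
    method: str,
    params: tuple = (),
) -> Tuple[str, Any]:
    """Try each (name, backend) in order; return the first successful result.

    The list is assumed to be ordered by preference already, for example
    with the heaviest-chain daemon first.
    """
    errors = []
    for name, backend in backends:
        try:
            return name, backend(method, params)
        except Exception as exc:  # any failure moves on to the next daemon
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


if __name__ == "__main__":
    def broken(method: str, params: tuple) -> Any:
        raise ConnectionError("daemon unreachable")

    def working(method: str, params: tuple) -> Any:
        return {"method": method, "params": params}

    served_by, result = call_with_failover(
        [("daemon-a", broken), ("daemon-b", working)], "ExampleMethod"
    )
    print(served_by, result)
```

From the storage system's point of view there is still a single endpoint; which daemon actually served the call is an internal detail of the pool.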

[Figure: chain-co architecture diagram]

Development Roadmap

Milestone 1 - MVP

Develop a minimum viable POC

Technical scope:

Deliverables:

Pull request merged on the repository with a release tagged at m1.1

Funding for m1.1:

Total = $19,520

Estimated Milestone 1 Delivery:

Estimated to be around a month and a half (End of Nov 22)

Milestone 2 - Test

Milestone 2.1 - Test on Calib

In this phase, the chain-co solution will be tested on the calibration network (calib) for iterations and fixes. Documentation will also be improved.

Technical scope:

Deliverables:

Pull request merged on the repository with a stable release tagged at m2.1

Funding for m2.1:

Total = $9,760

Milestone 2.2 - Test on Mainnet

In this phase, the chain-co solution will be tested on mainnet for iterations and fixes. Documentation will also be improved.

Technical scope:

Deliverables:

Pull request merged on the repository with a stable release tagged at m2.2

Funding for m2.2:

Total = $9,760

Estimated Milestone 2 Delivery:

Estimated to be around a month and a half (End of Dec 22)

Total Budget Requested

Total = $39,040

Maintenance and Upgrade Plans

Chain-co is committed to integrating the daemons of all four Filecoin implementations to offer the community a complete high availability solution.

Team Members

Force community engineering team

Team Website

https://forcecommunity.io/

Relevant Experience

Force community has been an active contributor to the Web3 ecosystem and the Filecoin ecosystem in general. The Force community engineering team has a track record of contributing code to Lotus as far back as Testnet and Space Race.

Team code repositories

https://github.com/ipfs-force-community

Additional information

Force community is committed to becoming a major contributor to Web3 infrastructure, and we see Filecoin at the core of the big Web3 migration. We hope to fast-track Web3 adoption by contributing our software development capacity to the cause and joining hands with all other ecosystem developers around the globe on this historic journey!

References

ErinOCon commented 2 years ago

Hi @Fatman13, thank you for your proposal. We will not be proceeding with this grant at this time as it does not align with the priorities of our current research & development roadmap. Wishing you the best of luck with your project work!

Fatman13 commented 2 years ago

Hello, @ErinOCon! Thank you for the reply! Would you please kindly share a list of current priorities? Would love to align our goals with Filecoin's research & development roadmap. Thank you!

Fatman13 commented 2 years ago

@ErinOCon btw do you have an account on Slack? Maybe we can have a quick chat on Slack too.

Fatman13 commented 2 years ago

Found it!

Fatman13 commented 2 years ago

There are currently 1200+ nodes on the network with more than 3 PiB of power each, which means each of these nodes likely has at least one partition to prove in every deadline of the day. Falling out of sync and failing a proof would mean a devastating loss for SPs.

With the introduction of https://github.com/filecoin-project/FIPs/discussions/386, the stakes for a node would be up to 100x higher, which in turn makes the potential loss/penalty up to 100x higher as well. We therefore think that chain-co could be a crucial piece of infrastructure to keep the network moving forward.

ErinOCon commented 2 years ago

Hi @Fatman13, thanks for reaching out! I have provided resources on the grants-help channel in Slack!

jennijuju commented 2 years ago

Fuhon is no longer being maintained and does not sync mainnet.

jennijuju commented 2 years ago

I have a couple of questions on this proposal (not sure what the status is on this one given it's closed)

Fatman13 commented 2 years ago

Hi, @jennijuju, thanks for the reply!

Fuhon is no longer being maintained and does not sync mainnet.

Sorry to hear that. 😢

who runs the chain-co? SP themselves or some service providers?

For Lotus, it will be SP themselves. For Venus, it can be a chain service operator (a trusted third party).

whats the main difference between Venus node & lotus node today? AFAIK, majority of the protocol changes were made in lotus and back ported to Venus, does Venus normally then do a round of audit or optimization,

There are many subtle differences but the general consensus protocol should be followed by both. For example, Venus experienced this consensus issue earlier this year.

There is currently one audit report conducted for Venus.

so that if something bad occurs with lotus node -> the same wont happen to Venus nodes?

Right now given the market share, even if "something bad occurs with lotus node", the consensus of the network will still be of what Lotus considers to be true as Lotus is the overwhelming majority.

A pool of chain-co daemon nodes can be deployed across different geographic regions to get the best chain synchronization from block propagation.

im highly interested in interchangeability between go impls & forest nodes!!

Maybe we can reorganize our proposal around this if that's what you consider to be most helpful to the network? We can cut couple things that may not be as important and revise the proposal to align with the long term goals of the network.

Fatman13 commented 2 years ago

Hello, @realChainLife, glad chatting with you yesterday!

We have made modifications to our proposal for your review.

Changelog

Motivation updated

Fatman13 commented 2 years ago

This proposal solves the exact problem pointed out here.

jennijuju commented 2 years ago

lotus is planning to implement it natively for our users FYI - so not sure what an external service will bring.

Fatman13 commented 2 years ago

lotus is planning to implement it natively for our users FYI - so not sure what an external service will bring.

There are a lot of things lotus could be planning, but for features like the PoSt worker, the community waited close to a year for lotus to implement them. I don't think this proposal necessarily conflicts with what lotus is planning. Besides, our approach is fundamentally different: our solution opens up collaboration with all potential Filecoin implementations. Overall, I think the two approaches can complement each other to make the network more secure.

jennijuju commented 2 years ago

Sure - I’m just sharing information, as an ongoing WIP lotus issue is being referenced. And I wanna make sure lotus users who read this message know that we are implementing a native solution, so they can decide whether they should plan for an external service in their architecture or not.

Looking forward to Venus implementing a solution for Venus users and the broader community even sooner.

jennijuju commented 2 years ago

I think this proposal doesn't necessarily conflicts

to be clear, I have never implied there is a conflict. in fact - I reached out to the grant team when I noticed this RFP was closed, cuz I do think it needs better scope but has interesting points.

ErinOCon commented 2 years ago

Hi @Fatman13, thank you for your patience with our review. We expect to have an update available next week.

ErinOCon commented 2 years ago

Hi @Fatman13, Thank you again for your patience with our review process. After additional evaluation, we will not award funding for the scope of work outlined here. We would be happy to consider a new proposal with value for the broader Filecoin ecosystem.

Fatman13 commented 2 years ago

Got it! Thanks for taking the time to re-evaluate the application!