filecoin-project / notary-governance

Discussion - Onboarding projects with large DataCap requirements #94

Closed: jnthnvctr closed this issue 2 years ago

jnthnvctr commented 3 years ago

There are a few ongoing projects that have substantial DataCap requirements - over and above what exists in the ecosystem today.

1) The Shoah Foundation will require petabyte-scale allocations to archive their entire data asset.
2) Filecoin Discover will require petabyte-scale allocations to onboard their data asset.

...and inevitably there will be more.

This issue is to kick off discussion about the ways in which, as a community, we can plan for and support early use cases.

To disentangle two issues that I believe arise here:

1) Early in this program we have limited amounts of DataCap in the ecosystem - though in a slightly more mature state this may not be a limitation. I believe there are two approaches here:

a) In the rubric today, Notaries increase their DataCap allocation over subsequent allocations and elections - so it is possible that we simply run the process as it exists today and hold many successive rounds of elections.

b) An alternate approach is to define a process (while Notaries have less DataCap than these projects require) where these projects can apply to the community to receive an allocation that is purpose-built for this use case (and administered by a set of the existing Notaries). The benefit here is that other use cases applying for DataCap in this timeframe would not be blocked.

2) No single Notary would be able to service either of these projects properly (or other large-scale ones). My proposal here would be that this is actually fine - Notaries should collaboratively support large-scale efforts (which will also require additional scrutiny to make sure the Client is using the DataCap appropriately).

s0nik42 commented 3 years ago

My preference tends to be for 1.b. It has the advantage of avoiding a bottleneck when onboarding a large client without impacting the notaries' day-to-day allocations. Notaries can jointly approve a DataCap allocation plan for that specific client. Then a notary can be selected and given a special allocation to provide the DataCap to that client according to the plan. Advantages over 2:

dkkapur commented 3 years ago

Hi folks - proposing the following for getting the discussion going on potential implementation paths:

(Let's define "large client" as a project/use case/Client needing > 500 TiB of DataCap.)

This specifically allows for:

@jnthnvctr @s0nik42 thoughts?
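
To make the numbers concrete, here is a minimal sketch (purely illustrative, not part of the proposal) of the two ideas above: treating requests over 500 TiB as "large client" requests, and covering such a request jointly across several notaries, since no single notary can service these projects alone. All notary names and allowance figures are placeholder assumptions.

```python
TIB = 1 << 40  # bytes per tebibyte
LARGE_CLIENT_THRESHOLD = 500 * TIB

def is_large_client(requested_bytes: int) -> bool:
    """Requests above 500 TiB fall under the large-client path discussed here."""
    return requested_bytes > LARGE_CLIENT_THRESHOLD

def split_across_notaries(requested_bytes: int, allowances: dict[str, int]) -> dict[str, int]:
    """Greedily cover the request from each notary's remaining allowance.

    Raises ValueError if the participating notaries cannot jointly cover the
    request, which is exactly the gap the large-client path is meant to address.
    """
    remaining = requested_bytes
    plan: dict[str, int] = {}
    for notary, allowance in sorted(allowances.items(), key=lambda kv: -kv[1]):
        if remaining == 0:
            break
        grant = min(allowance, remaining)
        if grant > 0:
            plan[notary] = grant
            remaining -= grant
    if remaining > 0:
        raise ValueError(f"notaries are short by {remaining / TIB:.1f} TiB")
    return plan

# Example with made-up numbers: a 750 TiB request covered by three notaries.
if __name__ == "__main__":
    request = 750 * TIB
    allowances = {"notary-a": 400 * TIB, "notary-b": 300 * TIB, "notary-c": 200 * TIB}
    print(is_large_client(request))  # True
    print({n: round(v / TIB) for n, v in split_across_notaries(request, allowances).items()})
```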

s0nik42 commented 3 years ago

@dkkapur, I like the proposition. I think that type of client will need a single point of contact in any case to deal with Fil+. I would recommend that we identify one of the 7 notaries to pick up that role when the project starts.

Actions could be:

dkkapur commented 3 years ago

@s0nik42 thanks - agreed, we should have a single notary-lead chosen from the set as well. For an initial version of this type of faucet, I would suggest we scope this to the following to ensure that we are on a safe path to test and unblock projects such as Starling without creating too much of a risk for the Fil+ program:

What do you think of this? IMO erring on the side of caution early on to ensure we build safe practices for scaling this up with confidence in the future is a good way to proceed.

s0nik42 commented 3 years ago

Hi @dkkapur, I think this is very good to start with.

dkkapur commented 3 years ago

@s0nik42 thanks! We've had various conversations in Slack and offline with interested Clients and Notaries on this one in the last two weeks, so in tomorrow's Notary Governance call - let's finalize the approach for the initial proposal!

Recommending that we move forward based on the following (updating the bullets I shared above):

These updates enable a more nuanced approach to what is deemed "fair" or "reasonable", as judged by a select set of Notaries who are then comfortable tracking and enforcing it. We should focus efforts on building tooling that brings transparency into the system and ensures DataCap is being used to make Filecoin more useful!
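
As one example of what such transparency tooling could look like, here is a minimal sketch that asks a Lotus node how much DataCap a verified client address has left. The node URL is a placeholder, auth-token handling is omitted, and the call shown is the Lotus `StateVerifiedClientStatus` JSON-RPC method as I understand it; treat the details as assumptions rather than a definitive implementation.

```python
import json
import urllib.request

LOTUS_RPC = "http://127.0.0.1:1234/rpc/v0"  # assumed local Lotus node; token handling omitted

def remaining_datacap(address: str) -> int | None:
    """Return the client's remaining DataCap in bytes, or None if not a verified client."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "Filecoin.StateVerifiedClientStatus",
        "params": [address, None],  # None tipset key = current chain head
    }
    req = urllib.request.Request(
        LOTUS_RPC,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp).get("result")
    return int(result) if result is not None else None

# Usage (hypothetical client address):
# print(remaining_datacap("f1exampleclientaddress"))
```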

Tactical next steps include:

dkkapur commented 3 years ago

Draft of some of the questions that need to be included in the client application. Current plan is to manage these applications in a separate repo, i.e., github.com/filecoin-plus-large-clients. Would appreciate feedback on this!

Client Application

Core Information

  • Organization name:
  • Website / social media:
  • Total amount of DataCap being requested:
  • On-chain address to be notarized:

Project details

  • Share a brief history of your project and organization
  • What is the primary source of funding for this work?
  • What other projects/ecosystem stakeholders is this project associated with?

Use-case details

  • Describe the data being stored onto Filecoin
  • Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data)
  • What is the expected retrieval frequency for this data?
  • For how long do you plan to keep this dataset stored on Filecoin? Will this be a permanent archival or a one-time storage deal?

DataCap allocation plan

  • In which geographies do you plan on making storage deals?
  • What is your expected data onboarding rate? How many deals can you make in a day, in a week? How much DataCap do you plan on using per day, per week?
  • How will you be distributing your data to miners? Is there an offline data transfer process?
  • How do you plan on choosing the miners with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.
  • How will you be distributing data and DataCap across miners storing data?
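
As a rough illustration of the onboarding-rate questions in the plan above, here is a minimal sketch of the arithmetic a reviewer might run to sanity check an application; the deal counts and sizes are placeholder assumptions, not numbers from any real application.

```python
TIB = 1 << 40
GIB = 1 << 30

def weekly_datacap(deals_per_day: int, avg_deal_size_bytes: int) -> int:
    """DataCap consumed per week at the stated onboarding rate."""
    return deals_per_day * avg_deal_size_bytes * 7

def weeks_to_exhaust(requested_bytes: int, deals_per_day: int, avg_deal_size_bytes: int) -> float:
    """How many weeks the requested DataCap would last at that rate."""
    return requested_bytes / weekly_datacap(deals_per_day, avg_deal_size_bytes)

# Hypothetical example: 20 deals/day at 32 GiB each, against a 500 TiB request.
if __name__ == "__main__":
    print(f"{weekly_datacap(20, 32 * GIB) / TIB:.1f} TiB per week")
    print(f"{weeks_to_exhaust(500 * TIB, 20, 32 * GIB):.0f} weeks to use 500 TiB")
```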

Fenbushi-Filecoin commented 3 years ago

Hey Deep, for the required materials in the application, you can use our guidelines for reference; we believe they are better suited to handling big clients. https://github.com/filecoin-project/filecoin-plus-client-onboarding/blob/main/Fenbushi%20Capital/Filecoin%20Plus%20Client%20Onboarding%20Guidellines%20-%20Fenbushi.pdf

dkkapur commented 3 years ago

@Fenbushi-Filecoin - this is great, thank you for sharing! I will look through and propose some changes to this application structure. If there are any specific questions that have proven valuable in your experience, please let me know.

dkkapur commented 3 years ago

Per the call this week (2021-04-27 governance call), we're working on getting a v1 implementation of this up and running in the next few weeks! I will keep this issue updated with progress.

dkkapur commented 3 years ago

@Fenbushi-Filecoin - thanks again for sharing your comprehensive DataCap allocation writeup! Here are some things I think we should consider incorporating into the application:

dkkapur commented 3 years ago

Took a deeper dive today into potential sources of issues in a system of this sort, and would like to propose the following in addition to all the points listed above. This is largely in an effort to serve an initial set of datasets that we can use to prove out the process as we start to move to a larger scale of DataCap allocation and distribution.

dkkapur commented 3 years ago

Update: per the conversation in the last notary governance call, https://github.com/filecoin-project/filecoin-plus-large-datasets has been set up to start testing this process out!
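
For anyone tracking applications in the new repo, here is a minimal sketch that lists open issues through the public GitHub REST API; unauthenticated calls are rate-limited, and real tooling would supply an API token, which is omitted here for brevity.

```python
import json
import urllib.request

REPO = "filecoin-project/filecoin-plus-large-datasets"

def open_applications(repo: str = REPO) -> list[dict]:
    """Fetch open issues (note: this endpoint also returns pull requests)."""
    url = f"https://api.github.com/repos/{repo}/issues?state=open&per_page=100"
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    for issue in open_applications():
        print(f"#{issue['number']}: {issue['title']}")
```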

XnMatrixSV commented 3 years ago

@dkkapur Hi, there might be some problems with the project details information when submitting a new issue for a large-datasets application. Please check these:

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/9
  • Describe the data being stored onto Filecoin.
  • Confirm that this is a public data set that can be retrieved by anyone on the Network.
  • What is the expected retrieval frequency for this data?
  • For how long do you plan to keep this dataset stored on Filecoin? Is this a permanent archival or a temporary storage deal?

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/7
  • Share a brief history of your project and organization.
  • What is the primary source of funding for this project?
  • What other projects/ecosystem stakeholders is this project associated with?

dkkapur commented 2 years ago

This topic continues to evolve as part of the broader LDN theme / path to DataCap. As such, closing out this issue for now.