filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

Archived July 2024

With the deprecation of the large dataset notary program, we are archiving this repo. For Fil+ Allocator Governance discussions, see the Governance repo. For the registry of active allocators, see the Allocator Registry repo.

To view other updated reference links for the Fil+ program, please see:

Filecoin Plus for large datasets

Filecoin Plus is a community program that aims to grow Filecoin's network by making it the decentralized storage network for humanity's most important information. As the network continues to grow, we as a community need to maintain civil online engagement. To learn more about acceptable community behavior, please check the Filecoin Community Code of Conduct.

This repo serves as the hub for client applications for DataCap at a large scale - currently defined as > 100 TiB of DataCap. If you wish to generally learn more about Filecoin Plus or apply for less than 500 TiB of DataCap, check out the following resources:

The process outlined below for clients looking to apply for a large amount of DataCap was initially proposed in Issue #94 - Onboarding projects with large DataCap requirements. Through an initial pilot phase and learnings/feedback collected over time, we are currently on the third iteration of the Large Dataset Notary (LDN) process. See #217, #328, #509 for additional details.

The main differences between this process and applying for DataCap directly from a notary (via filplus.storage) are that (1) this process is significantly more public, (2) DataCap is allocated from a large multisig Notary address, and (3) DataCap is allocated in tranches.

Current Scope

Based on conversations in various issues and governance calls, here is the current scope of the Large Dataset Notary (LDN) program. You can find relevant issues, as well as links to governance call recordings in the Notary Governance repo. Please note that this is still an evolving conversation, so the scope is subject to change. If you would like to participate in this conversation or have feedback, please let us know! You can start a discussion topic in the Notary Governance repo, in the fil-plus public Slack channel, or in an upcoming Governance call.

Clients can currently apply for a Large Dataset Notary which can grant them between 100 TiB and 5 PiB of total DataCap per application.
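As a quick illustration of the range above, the following sketch checks whether a requested total falls within it. It assumes binary units (1 PiB = 1024 TiB) and treats both bounds as inclusive, which the text does not state explicitly; the function name is hypothetical.

```python
# Hypothetical helper: checks a requested total against the LDN range.
# Assumptions: 1 PiB = 1024 TiB, and both bounds are inclusive.
TIB_PER_PIB = 1024
MIN_TIB = 100                 # lower bound from the text: 100 TiB
MAX_TIB = 5 * TIB_PER_PIB     # upper bound from the text: 5 PiB


def in_ldn_range(requested_tib: float) -> bool:
    """Return True if the requested total DataCap (in TiB) is within range."""
    return MIN_TIB <= requested_tib <= MAX_TIB
```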

In order for a client and their dataset to be eligible:

If you are a client who is interested in applying for a large DataCap allocation via an LDN, please see the steps outlined below.

Applying for a large DataCap allocation

Application flow:

When clients use up > 75% of the prior DataCap allocation, a request for additional DataCap in the form of the next tranche is automatically kicked off (via the 'subsequent allocation bot'). Notaries have access to the on-chain data required to verify that the client is operating in good faith, in accordance with the principles of the program, and in line with the allocation strategy outlined in the original application. Two notaries need to approve the next tranche of DataCap to be allocated to the client. The same notary cannot sign off on immediately subsequent allocations of DataCap, i.e., you need at minimum 4 unique notaries to support your application on an ongoing basis to receive multiple tranches of DataCap.
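The rules in this paragraph can be sketched as two small checks. This is an illustrative model only, not the bot's actual implementation: the `Tranche` type and the function names are hypothetical, and it assumes the 75% trigger is a strict "greater than" comparison, as the wording suggests.

```python
from dataclasses import dataclass

TRIGGER_FRACTION = 0.75   # from the text: more than 75% of the prior tranche used
REQUIRED_APPROVALS = 2    # from the text: two notaries approve each next tranche


@dataclass(frozen=True)
class Tranche:
    """Hypothetical record of one allocated tranche."""
    allocated_tib: float
    used_tib: float
    signers: frozenset  # notaries who signed this tranche


def next_tranche_due(prior: Tranche) -> bool:
    """True once usage of the prior tranche exceeds the trigger fraction."""
    return prior.used_tib > TRIGGER_FRACTION * prior.allocated_tib


def approvals_valid(prior: Tranche, proposed_signers) -> bool:
    """True if there are enough distinct signers and none signed the prior tranche."""
    signers = set(proposed_signers)
    return len(signers) >= REQUIRED_APPROVALS and signers.isdisjoint(prior.signers)
```

Because two notaries sign each tranche and neither may sign the one immediately after, consecutive tranches need disjoint pairs of signers, which is where the minimum of 4 unique notaries comes from.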

Application flow labels

The following labels indicate the statuses of LDN applications. The most recent version of these labels was released on April 15, 2023. More comprehensive release notes can be found in this blog.

DataCap tranche size calculations

Granting DataCap to the client

The bot will post a comment with the following structure to kick off a request for DataCap allocation:

## DataCap Allocation requested

#### Multisig Notary address
> <addr1>

#### Client address
> <addr2>

#### DataCap allocation requested
> XTiB

#### Id
> Id

This initiates a proposal to the multisig Notary to grant the associated amount of DataCap to the client address. Other notaries will now see this in the Filecoin Plus Registry app where they can approve or decline the request.
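For anyone tracking these requests programmatically, the comment structure above can be read with a few lines of code. The following is a minimal sketch, assuming each `####` heading is followed by exactly one `>` quoted value as in the template; the function name is hypothetical and this is not the parser used by the bot or the Registry app.

```python
# The template from this section, used here as sample input.
TEMPLATE = """\
## DataCap Allocation requested

#### Multisig Notary address
> <addr1>

#### Client address
> <addr2>

#### DataCap allocation requested
> XTiB

#### Id
> Id
"""


def parse_allocation_request(comment: str) -> dict:
    """Map each '####' heading to the '>' quoted value that follows it."""
    fields = {}
    heading = None
    for raw in comment.splitlines():
        line = raw.strip()
        if line.startswith("####"):
            heading = line.lstrip("#").strip()
        elif line.startswith(">") and heading is not None:
            fields[heading] = line[1:].strip()
            heading = None  # take only the first quoted line per heading
    return fields


if __name__ == "__main__":
    for key, value in parse_allocation_request(TEMPLATE).items():
        print(f"{key}: {value}")
```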

In order to approve the request in the Fil+ Registry App, Notaries need to sign in with their Ledger. During this initial authorization, the app will check if the Ledger address is an approved signer on the large dataset multisig notary addresses (previously, the Organization). Notaries can then action and sign multiple large requests in a row, without needing to re-auth for each multisig.

All notaries signing onto the LDN multisig are encouraged to track the client's use of previous DataCap allocations using on-chain information, data available on chain browsers, or on Fil+ specific dashboards like https://filplus.d.interplanetary.one/ or https://filplus.info/.

FVM Smart Contracts

Smart contracts can acquire DataCap just like any regular client. To do so, simply enter the f410 address of the smart contract that requires DataCap as the client address when making a request.

The process outlined above is for larger amounts of DataCap (> 500 TiB). For a smart contract's first DataCap allocation, we recommend using the auto-verifier Verify.glif.io to get 32 GiB of DataCap, as specified here.

It's important to note that DataCap allocations are a one-time credit for a Filecoin address and cannot be transferred between smart contracts. If you need to redeploy the smart contract, you must request additional DataCap. To improve this experience, we are developing an FRC to allow DataCap to be held between redeployments.

Current status

New applications are being accepted at this time, though please expect that the process will likely have some issues as we continue to test and improve the functionality of the process.

Retrieval Guidelines for Data Clients

  1. Fil+ data clients are advised to meticulously choose SPs that align with their specific data retrieval requirements.
  2. Fil+ open dataset clients commit to ensuring the retrievability of open datasets by storing with SPs that serve HTTP retrievals with either booster-http or their custom tooling.
  3. Fil+ clients can enhance their reputation by holding their SPs accountable for retrievability. This may streamline acquisition of additional DataCap in the future.
  4. Fil+ Notaries will consider the client’s track record on retrievability as part of their due diligence.
  5. Data clients and SPs should be aware of the risk of network attacks, and mitigate these risks via rate limiting tools (e.g. set a max requests per second limit).
  6. Data clients and SPs should consider implementing a throttling limit that determines the maximum bandwidth a single retrieving client can consume at any given time.
  7. Private data clients (E Fil+) should store with SPs that provide a level of retrievability consistent with the data clients’ retrieval needs indicated on the DataCap application.
  8. In the event of a large retrieval size, SPs should leverage tooling for load balancing to protect themselves.
  9. Multiple SPs can share a single unsealed copy of data with the same CID. This practice is deemed acceptable as it optimizes time and resource utilization.
  10. If the data client has a very good (95-100%) retrievability track record via another retrieval method (GraphSync or BitSwap), then the data client may work with Notaries to get future DataCap approval on a case-by-case basis.
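Guidelines 5 and 6 mention rate limiting and per-client throttling without naming a specific tool. One common way to implement both is a token bucket, sketched below. This is a generic illustration, not a feature of booster-http or any other tool named in this document; the class and parameter names are hypothetical.

```python
import time


class TokenBucket:
    """Generic token-bucket limiter: refills at a fixed rate up to a burst cap."""

    def __init__(self, rate_per_sec: float, burst: float, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens added per second
        self.capacity = burst         # maximum tokens that can accumulate
        self.tokens = float(burst)    # start with a full bucket
        self.clock = clock            # injectable clock, useful for testing
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return False to reject the request."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `cost=1` per call, this caps requests per second (guideline 5). Keeping one bucket per retrieving client and passing the response size in bytes as `cost` gives a per-client bandwidth cap (guideline 6).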