filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Antarctic LDN-04-W3b Seal Storage Technology #315

Closed scharfstein closed 1 year ago

scharfstein commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

This is a 10 PiB project to prove out the value propositions of decentralized data storage. A set of 10 DataCap Applications has been submitted by Seal Storage on behalf of our Customer, who wishes to remain confidential for the duration of this project. Our Customer is a world-class scientific research organization and the data sets are outputs of scientific experiments.

The Customer has been working with large data sets (PiBs) for decades and is interested in pursuing the Filecoin Network as a solution to some of their exabyte-scale storage problems. This is why starting with a 10 PiB pilot makes sense to them as it represents a small portion of their complete archive. Due to the perceived risk associated with cryptocurrency projects, our Customer feels it is best to delay public announcement of our collaboration until we have successfully completed the project.

Seal is a carbon-neutral, decentralized cloud storage provider. Seal's technical leadership brings decades of experience from traditional enterprise storage companies including Seagate and Oracle, as well as world-class experience on the Filecoin Network. Today, Seal operates data centers across the US and Canada with enterprise-grade infrastructure and data policies.

What is the primary source of funding for this project?

Seal is funding the project.

What other projects/ecosystem stakeholders is this project associated with?

The main stakeholder (aka customer) for this project is a world-class scientific research organization that wishes to remain confidential during the pilot project. Our customer has been working with PiBs of data for decades and views decentralized data storage as an exciting platform that could yield many benefits. The purpose of this project is to prove out the concept of decentralized data storage for broader use within the customer's organization.

Use-case details

Describe the data being stored onto Filecoin

The data sets are the original outputs of scientific experiments.

Where was the data in this dataset sourced from?

The data sets have been created by large-scale scientific experiments.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

Yes - we have a link to sample data. We have disclosed this data for verification to several notaries and SPs.

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

For the purpose of the project and confidentiality, the initial 10 PiB data set should not be publicly available. The current data set is not usable by the general public as it requires considerable resources to post-process. Our customer says that it is “secure through obfuscation.” However, for the purposes of the project, the 10 PiB should not be publicly available.

A goal of the project is for Seal to work with our customer to produce a data set for public use.

What is the expected retrieval frequency for this data?

Archival is the primary purpose for storing the data. It is accessed roughly twice per year.

For how long do you plan to keep this dataset stored on Filecoin?

Three years, at least.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

We plan to store five copies of the 10 PiB data set [total of 50 PiB datacap] in five cities, in three countries and across two continents.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Seal Storage has dual 100 Gbps internet connections. SPs will download data from Seal. Offline data transfer may be difficult due to the size of the data set. 
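
For a rough sense of scale, the network transfer time can be estimated with a minimal sketch like the one below. It rests on assumptions that are not stated in the application: that the two 100 Gbps links can be aggregated to 200 Gbps, that the links run at full utilization unless an `efficiency` factor is given, and that 1 PiB means 2^50 bytes. The function name and parameters are hypothetical.

```python
# Rough transfer-time estimate; all defaults are illustrative assumptions.
PIB_BYTES = 2**50          # 1 PiB in bytes (binary prefix, assumed)
SECONDS_PER_DAY = 86_400


def transfer_days(size_pib: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Days needed to move size_pib over a link of link_gbps (decimal gigabits/s).

    efficiency is the assumed fraction of the nominal bandwidth actually achieved.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    bits = size_pib * PIB_BYTES * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # One 10 PiB copy over both links combined (assumed 200 Gbps aggregate).
    print(f"{transfer_days(10, 200):.1f} days at full line rate")
    # The same copy at an assumed 50% sustained utilization.
    print(f"{transfer_days(10, 200, efficiency=0.5):.1f} days at 50% utilization")
```

Under these assumptions a single 10 PiB copy takes on the order of days at line rate, and real-world throughput would likely lengthen that.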

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We are currently discussing capabilities and performing due diligence with several SPs and have chosen five SPs for this project. We chose these based on their current storage capacity, compute capabilities, enterprise-grade DCs and bandwidth.

Note that we have chosen a "primary" SP. This SP will receive two 5 PiB datacap allocations and will help provide compute for project and Customer objectives.

How will you be distributing deals across storage providers?

DLTX [primary], 10 PiB
Holon, 5 PiB
W3b Cloud, 5 PiB
ElioVP, 5 PiB
PikNik, 5 PiB
Seal, 20 PiB

As per guidance from Filecoin Foundation, Seal has submitted 10 datacap applications for this project for a total of 50 PiB in datacap.  The overall distribution is as described above. 

This 5 PiB datacap is 100% allocated to W3b Cloud.
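
As a quick consistency check on the distribution above, the figures can be tallied as in the sketch below. The per-provider numbers come from the list in this application; treating every application as exactly 5 PiB is an assumption inferred from "10 datacap applications ... for a total of 50 PiB", and the variable names are illustrative only.

```python
# Per-provider DataCap in PiB, copied from the distribution list above.
ALLOCATION_PIB = {
    "DLTX": 10,      # primary SP
    "Holon": 5,
    "W3b Cloud": 5,
    "ElioVP": 5,
    "PikNik": 5,
    "Seal": 20,
}

# Assumption: each of the 10 applications requests the same 5 PiB.
APPLICATION_SIZE_PIB = 5

total_pib = sum(ALLOCATION_PIB.values())

# Number of 5 PiB applications each provider's share corresponds to.
applications_per_sp = {
    sp: pib // APPLICATION_SIZE_PIB for sp, pib in ALLOCATION_PIB.items()
}
total_applications = sum(applications_per_sp.values())

if __name__ == "__main__":
    print(f"Total DataCap: {total_pib} PiB")
    print(f"Applications implied: {total_applications}")
    for sp, count in applications_per_sp.items():
        print(f"  {sp}: {count} x {APPLICATION_SIZE_PIB} PiB")
```

Under that assumption the list adds up to 50 PiB across 10 applications, with the primary SP accounting for two of them, which matches the totals stated in the text.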

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have the resources/funding to begin making deals once we receive DataCap. 

We currently have the support we need.

large-datacap-requests[bot] commented 2 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Multisig Notary address


Client address


DataCap allocation requested


dkkapur commented 2 years ago

As outlined here, paging the 5 notaries to sign the initial allocation to unblock this client: @dannyob @Fenbushi-Filecoin @MegTei @swatchliu @neogeweb3.

dannyob commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address


You can check the status of the message here:

dkkapur commented 2 years ago

Tx looks fine, removing error label.

MegTei commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address


You can check the status of the message here:

scharfstein commented 1 year ago

I wanted to let the Filecoin Community know that I am no longer involved in managing this LDN application.

@salstorage is the primary point of contact.

salstorage commented 1 year ago

As per direction from FF, we are submitting a new application for this dataset under the EFil+ program.

Please close this application as inactive @kevzak @pwrepo @simonkim0515 @dkkapur @galen-mcandrew