filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Bearded LDN-03 Seal Storage Technology #955

Closed scharfstein closed 1 year ago

scharfstein commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

NOTE: This is LDN app 3 of 3. The total DataCap requested for this project is 10 PiB.
LDN app 1 of 3 can be found here: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/326
LDN app 2 of 3 can be found here: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/327

This new LDN application is in response to this comment: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/327#issuecomment-1191761175

and @dkkapur's recommendation to open a new LDN application.

*** The rest of this information is from the original LDN applications ***

Our Customer is a world-class research institution that develops and inspires knowledge-based solutions and educates future leaders for just and prosperous societies on a healthy planet. The data set involved in this project is a 1.15 PiB public climate data set.

The project is a pilot to prove out the value propositions of decentralized data storage. This DataCap application is being submitted by Seal Storage on behalf of our Customer, who wishes to remain confidential for the duration of this pilot project. Project details can be made available under an NDA.

We kicked off the project on March 8, 2022, with ingestion estimated to begin in mid-March 2022 via download from the cloud. The data set is public climate data.

Due to the perceived risk associated with cryptocurrency projects, our Customer feels it is best to delay public announcement of our collaboration until we have successfully completed the project.

Seal is a carbon-neutral, decentralized cloud storage provider. Seal's technical leadership brings decades of experience from traditional enterprise storage companies including Seagate and Oracle, as well as world-class experience on the Filecoin Network. Today, Seal operates data centers across the US and Canada with enterprise-grade infrastructure and data policies.

What is the primary source of funding for this project?

Seal is funding the project.

What other projects/ecosystem stakeholders is this project associated with?

None at this time.

Use-case details

Describe the data being stored onto Filecoin

The data set is public climate data.

Where was the data in this dataset sourced from?

Our Customer has a copy in Google Cloud.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

Yes. The dataset is the CMIP6 Zarr store described here
https://pangeo-data.github.io/pangeo-cmip6-cloud/overview.html
The data are fully public in Google Cloud Storage at: gs://cmip6
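
As a rough illustration of how anyone could reach the public copy without Google Cloud tooling, the sketch below converts a `gs://` URI into the corresponding public HTTPS URL on `storage.googleapis.com`. The helper name `gs_to_https` and the object path `path/to/object` are hypothetical and used only for illustration; only the bucket name `cmip6` comes from the application.

```python
from urllib.parse import urlparse


def gs_to_https(gs_uri: str) -> str:
    """Map a gs://bucket/object URI to its public HTTPS form.

    Hypothetical helper for illustration; it only rewrites the string
    and does not check that the object exists or is publicly readable.
    """
    parsed = urlparse(gs_uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {gs_uri!r}")
    bucket = parsed.netloc
    obj = parsed.path.lstrip("/")  # empty when only the bucket is given
    return f"https://storage.googleapis.com/{bucket}/{obj}"


# Bucket named in the application.
print(gs_to_https("gs://cmip6"))
# "path/to/object" is a placeholder, not a real object in the bucket.
print(gs_to_https("gs://cmip6/path/to/object"))
```

The resulting URL can be passed to any HTTP client. Reading the Zarr store itself would normally go through a Zarr-aware library, as described on the Pangeo page linked above.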

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Confirmed. This is a public data set.

What is the expected retrieval frequency for this data?

Archival storage is the primary use. The data will also be accessed by external collaborators and researchers.

For how long do you plan to keep this dataset stored on Filecoin?

Three years, at least.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

Six (6) full replicas shall be stored across six (6) cities, three (3) countries, and two (2) continents.
Each copy is 1.15 PiB.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Seal Storage has dual 100 Gbps internet connections. SPs will download data from Seal. Offline data transfer may be possible.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We are currently discussing capabilities and performing due diligence with several SPs and have chosen three SPs for this project. We chose these based on their current storage capacity, compute capabilities, enterprise-grade DCs and bandwidth.

How will you be distributing deals across storage providers?

W3b Cloud [USA] / 1.15 PiB
PikNik [San Diego, USA] / 1.15 PiB
Holon [Sydney, AUS] / 1.15 PiB
ElioVP [Antwerp, Belgium and Utrecht, Netherlands] / 1.15 PiB
Seal [Montreal, Canada] / 1.15 PiB
Seal [Las Vegas, USA] / 1.15 PiB
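
The replica plan above implies a total on-chain footprint that can be checked with a little arithmetic. The following is a minimal sketch, assuming each listed provider stores exactly one 1.15 PiB copy and that 1 PiB = 1024 TiB; the names `REPLICAS`, `total_pib`, and `pib_to_tib` are illustrative and not part of any Filecoin tooling.

```python
from decimal import Decimal

# Planned replicas as listed in the application: (provider, location, size in PiB).
# Decimal avoids binary floating-point rounding when summing 1.15 six times.
REPLICAS = [
    ("W3b Cloud", "USA", Decimal("1.15")),
    ("PikNik", "San Diego, USA", Decimal("1.15")),
    ("Holon", "Sydney, AUS", Decimal("1.15")),
    ("ElioVP", "Antwerp, Belgium and Utrecht, Netherlands", Decimal("1.15")),
    ("Seal", "Montreal, Canada", Decimal("1.15")),
    ("Seal", "Las Vegas, USA", Decimal("1.15")),
]


def total_pib(replicas):
    """Sum the size of all replicas, in PiB."""
    return sum((size for _, _, size in replicas), Decimal("0"))


def pib_to_tib(pib):
    """Convert PiB to TiB (assumes 1 PiB = 1024 TiB)."""
    return pib * 1024


total = total_pib(REPLICAS)
print(f"{len(REPLICAS)} replicas, {total} PiB total ({pib_to_tib(total)} TiB)")
```

Under these assumptions the six replicas add up to 6.90 PiB, which is 7065.60 TiB.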

Seal must also keep a full hot copy for our Customer.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have the resources/funding to begin making deals once we receive DataCap. 

We currently have the support we need.

dannyob commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecm5bj2xgcsmsugnnehvpp6n2h66g7p3paig2ikdcjtfbhgtekhn4

Address

f1jeautnpgr6js6uqtm4aa6mf6dvhn3unbdcvpbpy

Datacap Allocated

952.00TiB

Signer Address

f1k6wwevxvp466ybil7y2scqlhtnrz5atjkkyvm4a

Id

dd472725-98f4-4dae-9cfa-0804ba225051

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecm5bj2xgcsmsugnnehvpp6n2h66g7p3paig2ikdcjtfbhgtekhn4

dkkapur commented 1 year ago

@dannyob thank you!

dkkapur commented 1 year ago

this notary is effectively out of DataCap. @salstorage can you confirm we can close this application out?

salstorage commented 1 year ago

> this notary is effectively out of DataCap. @salstorage can you confirm we can close this application out?

Thank you. Yes please go ahead and close this application @dkkapur

dkkapur commented 1 year ago

Thanks. Closing this out.

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! :exclamation: We have found some problems in the information provided.
We could not find Organization Name field in the information provided
We could not find Website / Social Media field in the information provided
We could not find Total amount of DataCap being requested (between 500 TiB and 5 PiB) field in the information provided
We could not find Weekly allocation of DataCap requested (usually between 1-100TiB) field in the information provided
We could not find On-chain address for first allocation field in the information provided
We could not find Data Type of Application field in the information provided

Please, take a look at the request and edit the body of the issue providing all the required information.