filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Caribbean LDN-02 Seal Storage Technology #1282

Closed salstorage closed 11 months ago

salstorage commented 1 year ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

NOTE: this is LDN app 2 of 2. The total DataCap requested for this project is 8.4 PiB.
LDN app 1 of 2 can be found here: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1281
Seal is collaborating with The Center for Extreme Data Management, Analysis and Visualization (CEDMAV) of the University of Utah on a pilot project to use Filecoin to store a public dataset: the OpenVisus datasets (very “sparse” data: combustion, simulations, earth, satellite, etc.). The datasets total about 1.4 PiB. Seal and CEDMAV are also exploring specific use cases and how the Filecoin Network can support ongoing research.

Seal is a carbon-neutral, decentralized cloud storage provider. Seal's technical leadership brings decades of experience from traditional enterprise storage companies including Seagate and Oracle, as well as world-class experience on the Filecoin Network. Today, Seal operates data centers across the US and Canada with enterprise-grade infrastructure and data policies.

What is the primary source of funding for this project?

Seal is funding the project.

What other projects/ecosystem stakeholders is this project associated with?

The main stakeholder for this project is the University of Utah CEDMAV group. Our customer views decentralized data storage as an exciting platform that could yield many benefits for future large datasets.

Use-case details

Describe the data being stored onto Filecoin

Project Caribbean consists of a 1.4 PiB verified PUBLIC dataset belonging to The Center for Extreme Data Management, Analysis and Visualization at the University of Utah (CEDMAV, http://cedmav.org/). The data consists of OpenVisus datasets (very “sparse” data: combustion, simulations, earth, satellite, etc.).

Where was the data in this dataset sourced from?

The Center for Extreme Data Management, Analysis and Visualization (CEDMAV) of the University of Utah

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

https://drive.google.com/drive/folders/1j6p8peLJJ9tbNhF5mD3sPBUKjFrw5UUy?usp=sharing

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Confirmed - this is a public dataset

What is the expected retrieval frequency for this data?

The data will be accessed by external collaborators and researchers.
Retrieval would be daily, as and when needed. Seal Storage will keep an unsealed copy for retrieval purposes.

For how long do you plan to keep this dataset stored on Filecoin?

Three year term

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

We plan to store six copies of the 1.4 PiB data set [total of 8.4 PiB] in four different cities, in three different countries and across two continents.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Seal Storage has dual 100 Gbps internet connections. SPs will download data from Seal.
Seal Storage will prepare the files and package them as CAR files in 32 GB chunks for distribution.
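
To give a sense of scale for the chunking step, here is a rough sketch of how many chunks one copy would produce. It assumes that "32GB" means 32 GiB, that every chunk is filled completely, and that padding and packing overhead are ignored; the constant and variable names are illustrative and not part of the application.

```python
import math
from fractions import Fraction

PIB = 2**50  # bytes in one pebibyte
GIB = 2**30  # bytes in one gibibyte

# One full copy of the dataset, as stated in the application.
dataset_bytes = Fraction("1.4") * PIB

# Assumption: each CAR chunk holds exactly 32 GiB of data.
chunk_bytes = 32 * GIB

# Round up, since a partly filled final chunk still counts as a chunk.
chunks_per_copy = math.ceil(dataset_bytes / chunk_bytes)
print(chunks_per_copy)  # 45876
```

Under these assumptions a single 1.4 PiB copy works out to roughly 46,000 chunks; the real number would depend on how the files are actually packed.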

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We are currently meeting and discussing capabilities with several other SPs. Due to the size of this pilot, our SPs have requested DataCap before committing to the project.
We plan to choose enterprise-grade SPs for this project and will complete our due diligence after DataCap approval.

How will you be distributing deals across storage providers?

1 copy = 1.4 PiB

DLTX 1 copy in Omaha, Nebraska, USA
Ghostbytes 1 copy in Philadelphia, USA
DSS 1 copy in Sydney, Australia
Telnyx 1 copy in USA, multiple locations
Seal Storage 1 copy in Las Vegas, NV USA 
Seal Storage 1 copy in Montreal, Quebec, Canada 

Seal Storage must also keep a full hot copy unsealed for our Customer. 
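
The copy plan above can be checked with a little arithmetic. The following is a minimal sketch, assuming every copy is exactly 1.4 PiB and using the provider list as originally submitted; the dictionary and variable names are illustrative only.

```python
from fractions import Fraction

# Size of one full copy in PiB, from the application.
COPY_SIZE_PIB = Fraction("1.4")

# Copies per storage provider site, as listed in the original plan.
copies_by_site = {
    "DLTX (Omaha)": 1,
    "Ghostbytes (Philadelphia)": 1,
    "DSS (Sydney)": 1,
    "Telnyx (USA, multiple locations)": 1,
    "Seal Storage (Las Vegas)": 1,
    "Seal Storage (Montreal)": 1,
}

total_copies = sum(copies_by_site.values())
total_datacap_pib = COPY_SIZE_PIB * total_copies

print(f"{total_copies} copies x {float(COPY_SIZE_PIB)} PiB = {float(total_datacap_pib)} PiB")
```

Six copies at 1.4 PiB each gives 8.4 PiB, which matches the total DataCap requested for the project. The hot unsealed copy is not counted here, on the assumption that it is held outside of Filecoin deals.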

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Once we receive DataCap, we will begin making deals as soon as customer data is transferred to Seal Storage in November 2022.
We currently have the support we need.
salstorage commented 1 year ago

Due to termination of sectors, remove the following SPs and miner IDs from the application:

DLTX 1 copy in Omaha, Nebraska, USA
Ghostbytes 1 copy in Philadelphia, USA
Telnyx 1 copy in USA, multiple locations

Adding the following SP to the application:

GreaterHeat, Dallas, US location: f02361686 f02345061

salstorage commented 1 year ago

f01886690 f01886710: SEAL STORAGE TECHNOLOGY INC, Las Vegas, USA
f01274011 f01746964 f01919423 f01938357 f02238775: DSS, Australia
f01923553 f01923554 f01923555 f01923556 f02181705: SEAL STORAGE TECHNOLOGY INC, Montreal, Canada
f01987994 f02202753: VoGo Digital Labs, S Korea
f02229460 f02832654: Aligned USA, Mid West / Ohio, USA
f02361686 f02345061: GreaterHeat, Dallas, TX, USA

github-actions[bot] commented 12 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

github-actions[bot] commented 11 months ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

-- Commented by Stale Bot.

aggregation-and-compliance-bot[bot] commented 11 months ago
Client f01981070 does not follow the DataCap usage rules. More info here. This application has been failing the requirements for 7 days. Please take appropriate action to fix the following DataCap usage problems.

Criteria: Percent of used DataCap stored with top provider
Threshold: < 75
Reason: The percent of data from the client that is stored with their top provider is 100%. This should be less than 75%.
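
The rule the bot reports on can be illustrated with a short sketch. This is not the bot's actual implementation; it assumes the check is simply the largest provider's share of all used DataCap, compared against the 75% threshold quoted above. The function names and provider IDs are hypothetical.

```python
def top_provider_percent(bytes_by_provider: dict[str, int]) -> float:
    """Return the share (in percent) of used DataCap held by the largest provider."""
    total = sum(bytes_by_provider.values())
    if total == 0:
        raise ValueError("no DataCap has been used yet")
    return 100 * max(bytes_by_provider.values()) / total


def passes_top_provider_rule(bytes_by_provider: dict[str, int],
                             threshold_percent: float = 75.0) -> bool:
    """True when the top provider holds strictly less than the threshold."""
    return top_provider_percent(bytes_by_provider) < threshold_percent


# Hypothetical usage: everything stored with a single provider fails the rule.
single = {"f0example1": 10 * 2**40}
# Hypothetical usage: an even split across six providers passes it.
spread = {f"f0example{i}": 10 * 2**40 for i in range(1, 7)}

print(passes_top_provider_rule(single))  # False
print(passes_top_provider_rule(spread))  # True
```

In the case flagged above, the reported share is 100%, so a check of this kind fails until deals are spread across more providers.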