filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Ghost Byte Inc - NOAA Global Forecast System (GFS) [2/3] #1369

Closed: Trevor-K-Smith closed this issue 1 year ago

Trevor-K-Smith commented 1 year ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information


DP Info

Client Info

Data Info

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Ghost Byte Inc is a storage provider seeking to onboard data to meet the high demand for Fil+ storage, both for itself and for its partners. Ghost Byte has a history of active participation in the NA weekly calls, helping community members on the Slack channel, testing beta software and providing feedback, and offering ongoing support to the Filecoin community. Ghost Byte works with industry partners to support the growth of Web3 adoption.
Ref: https://www.youtube.com/watch?v=6PejYUlN0AM

What is the primary source of funding for this project?

Ghost Byte Inc

What other projects/ecosystem stakeholders is this project associated with?

Ghost Byte Inc

Use-case details

Describe the data being stored onto Filecoin

NOTE: the NCEP Global Forecast System was upgraded to v16.3.0, effective November 29, 2022. See the notification [HERE](https://www.weather.gov/media/notification/pdf2/scn22-104_gfs.v16.3.0.pdf).

The Global Forecast System (GFS) is a weather forecast model produced by the National Centers for Environmental Prediction (NCEP). Dozens of atmospheric and land-soil variables are available through this dataset, from temperatures, winds, and precipitation to soil moisture and atmospheric ozone concentration. The GFS covers the entire globe at a base horizontal resolution of 18 miles (28 kilometers) between grid points, which is used by operational forecasters who predict weather out to 16 days in the future. Horizontal resolution drops to 44 miles (70 kilometers) between grid points for forecasts between one week and two weeks.

The NOAA Global Forecast System (GFS) Warm Start Initial Conditions are produced by NCEP to run operational deterministic medium-range numerical weather predictions. The GFS is built with the GFDL Finite-Volume Cubed-Sphere Dynamical Core (FV3) and the Grid-Point Statistical Interpolation (GSI) data assimilation system. Please see the Documentation section of the AWS Open Data registry page (linked below) for more details about the model and the data assimilation systems.

The current operational GFS runs with 64 vertical layers extending from the surface to the upper stratosphere, on six cubed-sphere tiles at the C768 (approximately 13 km) horizontal resolution. A new version of the GFS with 127 layers extending to the mesopause was implemented for operations on February 3, 2021. The initial conditions are made available four times per day for running forecasts at the 00Z, 06Z, 12Z, and 18Z cycles. For each cycle, the dataset contains the first guess of the atmospheric state in the directory ./gdas.yyyymmdd/hh-6/RESTART (the 6-hour GDAS forecast from the previous cycle), and the atmospheric analysis increments and surface analysis for the current cycle in the directory ./gfs.yyyymmdd/hh, which are produced by the data assimilation systems.
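To make the cycle/path convention above concrete, here is a minimal illustrative sketch (not part of the dataset tooling; the helper name is ours) showing which directories hold the first guess and the analysis for a given cycle:

```python
# Illustrative only: map a 00/06/12/18Z cycle to the directories described above.
# The first guess lives under the previous GDAS cycle (hh-6); the analysis under the current cycle.
from datetime import datetime, timedelta

def cycle_paths(cycle: datetime) -> tuple[str, str]:
    """Return (first_guess_dir, analysis_dir) for one forecast cycle."""
    prev = cycle - timedelta(hours=6)                      # preceding GDAS cycle
    first_guess = f"gdas.{prev:%Y%m%d}/{prev:%H}/RESTART"  # 6-hour GDAS forecast
    analysis = f"gfs.{cycle:%Y%m%d}/{cycle:%H}"            # analysis increments + surface analysis
    return first_guess, analysis

print(cycle_paths(datetime(2022, 11, 29, 0)))
# ('gdas.20221128/18/RESTART', 'gfs.20221129/00')
```

Note the day rollover: the 00Z cycle's first guess comes from the previous day's 18Z GDAS directory.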

Where was the data in this dataset sourced from?

This data is being replicated from AWS Open Data to Filecoin. The dataset being replicated is the NOAA Global Forecast System (GFS), which is 1.46 PiB in total size. The data will be replicated a total of 10 times, for a total DataCap request of roughly 15 PiB (10 × 1.46 PiB ≈ 14.6 PiB).
Ref: https://registry.opendata.aws/noaa-gfs-bdp-pds/
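For reference, the source bucket can be browsed anonymously without AWS credentials. A minimal sketch, assuming the bucket name taken from the registry URL above and a gfs.YYYYMMDD/HH/ key layout (the exact prefix structure should be confirmed against the registry page):

```python
# Illustrative only: list a few objects from the public GFS bucket without credentials.
import boto3
from botocore import UNSIGNED
from botocore.config import Config

s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))  # anonymous access

resp = s3.list_objects_v2(
    Bucket="noaa-gfs-bdp-pds",   # bucket name from the registry URL above
    Prefix="gfs.20221129/00/",   # assumed layout: gfs.YYYYMMDD/HH/... (verify before use)
    MaxKeys=10,
)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```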

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

https://www.dropbox.com/scl/fo/t1j1e3jvqulokzkxmqcnu/h?dl=0&rlkey=l6d519p36u2zg9m288h51pgu5

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Public - https://registry.opendata.aws/noaa-gfs-bdp-pds/

What is the expected retrieval frequency for this data?

Once every 1 to 3 years.

For how long do you plan to keep this dataset stored on Filecoin?

540 days, subject to renewal when the deals expire.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

Global partners. There are no restrictions; deals will be spread across regions as widely as possible.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Data will be sent over boostd to participating storage providers. Offline data transfers can also be arranged for providers with special requirements.
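As an illustration of the online path, a verified deal proposal to a single provider could be scripted around the boost client roughly as follows. The flag names reflect common boost usage but may differ between versions (check boost deal --help), and every value shown is a placeholder rather than a detail from this application:

```python
# Illustrative sketch: propose one verified (DataCap-spending) deal via the boost CLI.
# Flags are assumptions based on common boost usage; verify against `boost deal --help`.
import subprocess

def propose_deal(provider: str, payload_cid: str, commp: str,
                 piece_size: int, car_size: int, http_url: str) -> None:
    cmd = [
        "boost", "deal",
        f"--provider={provider}",
        f"--payload-cid={payload_cid}",
        f"--commp={commp}",
        f"--piece-size={piece_size}",
        f"--car-size={car_size}",
        f"--http-url={http_url}",   # where the SP fetches the CAR file
        "--verified=true",          # spend DataCap on this deal
        "--storage-price=0",
    ]
    subprocess.run(cmd, check=True)

# Placeholder values only:
# propose_deal("f01234", "bafy...", "baga...", 34359738368, 33000000000,
#              "https://example.com/pieces/baga....car")
```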

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

Storage providers will be found through the active community Slack, through partners met at events, and through online data sources. Replicas will be distributed 1 per actor and 2 per organization, spread as evenly across the globe as possible, for a total of 10 replications. SPs will confirm before receiving the CAR files that they intend to keep the CAR files accessible and retrievable. Keeping unsealed copies of these sectors will not be enforced for this replication, since this is disaster-recovery data and retrievals will be low.

How will you be distributing deals across storage providers?

Singularity will be used to serve the deals and track the progress of each CAR file being replicated: 1 copy per actor, 2 per organization, spread as evenly across the globe as possible, for a total of 10 replications.
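A minimal sketch of the stated distribution policy (1 copy per actor, 2 per organization, 10 replicas per CAR file). This is illustrative logic only, not the actual Singularity tooling:

```python
# Illustrative only: check whether another replica of a CAR file may go to a given SP.
from collections import Counter

MAX_PER_ACTOR, MAX_PER_ORG, TARGET_REPLICAS = 1, 2, 10

def can_assign(assignments: list[tuple[str, str]], actor: str, org: str) -> bool:
    """assignments: (actor_id, org_name) pairs already holding this CAR file."""
    if len(assignments) >= TARGET_REPLICAS:
        return False
    actor_counts = Counter(a for a, _ in assignments)
    org_counts = Counter(o for _, o in assignments)
    return actor_counts[actor] < MAX_PER_ACTOR and org_counts[org] < MAX_PER_ORG

print(can_assign([("f01234", "OrgA")], "f01234", "OrgA"))  # False: actor already holds a copy
print(can_assign([("f01234", "OrgA")], "f05678", "OrgA"))  # True: second copy allowed for OrgA
```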

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have the resources to get started right away. We do not need help at this time. Thank you!


large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

simonkim0515 commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

250TiB

Client address

f15pqkqz7vrr3cempgtafznmwdks65sjai6e53rrq

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1e4kpjyvcrlrqbyn7rv5np5klijstrhhjejiznvy

DataCap allocation requested

125TiB

Id

4d5e2e23-d703-437d-987d-0912cbb3049c

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

There is no previous allocation for this issue.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

Trevor-K-Smith commented 1 year ago

@simonkim0515 @raghavrmadya

Can I kindly get this application reviewed? Thanks!

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

data-programs commented 7 months ago

KYC

This user’s identity has been verified through filplus.storage