filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Open Atmospheric Research Datasets By NCAR #2119

Closed FroghubMan closed 9 months ago

FroghubMan commented 1 year ago

Data Owner Name

NCAR

What is your role related to the dataset

Data Preparer

Data Owner Country/Region

United States

Data Owner Industry

Environment

Website

https://www.ucar.edu/terms-of-use/data

Social Media

https://www.ucar.edu/terms-of-use/data

Total amount of DataCap being requested

9PiB

Expected size of single dataset (one copy)

1PiB

Number of replicas to store

9

Weekly allocation of DataCap requested

1PiB

On-chain address for first allocation

f17zbocai3pf7hks6eomldvnmkdvgn25zojia3agq

Data Type of Application

Public, Open Dataset (Research/Non-Profit)

Custom multisig

Identifier

No response

Share a brief history of your project and organization

FrogHub Files is a platform that provides users with free storage while helping the Filecoin network store more and more valuable public datasets.

About the FrogHub data set

FrogHub files is a user-oriented cloud storage platform. Any user with data storage needs can use the platform tools free of charge.

The Filecoin network provides reliable, secure, and affordable decentralized storage, and FrogHub Files wants to make it easy for users to store their data on it.

Attracting traditional cloud and object storage users to the Filecoin network, and letting them benefit from it, is a challenge. As developers in the Filecoin ecosystem, we at FrogHub need to help meet this challenge. We've been building useful tools to make it easier for users to store data on the Filecoin network.

FrogHub Files integrates a set of tools, including storage buckets, a custom gateway, a data encryption service (KMS), static web deployment (under development), and an open API. Together they provide storage services compatible with cloud and object storage, giving users a better experience and attracting more of them.
FrogHub has always defined itself as a tool developer and infrastructure builder in the Filecoin ecosystem. In 2019, we started to focus on technical solutions and development based on the IPFS protocol and the Filecoin network. We have been working hard to become a qualified builder in the Filecoin ecosystem.

Our team is almost entirely technical: more than 90% of us are developers, and more than half have over 5 years of development experience in communications, Internet, blockchain, and other industries. We hope to earn users' recognition by delivering useful tools and platforms.

To contribute to the Filecoin community, we have developed the open-source sector repair tool Filecoin-Sealer-Recover and the free NFT authoring platform NFT-Creator.
In addition, we plan to provide a sector browser for the community in 2023 and to build the liquidity pledge platform STFIL on FVM.

See the links below for details.
- github: https://github.com/orgs/froghub-io
- twitter: https://twitter.com/froghub_io

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

This application includes five AWS open datasets for atmospheric research published by NCAR (National Center for Atmospheric Research):

1. Community Earth System Model Large Ensemble (CESM LENS)
https://registry.opendata.aws/ncar-cesm-lens/
The Community Earth System Model (CESM) Large Ensemble Numerical Simulation (LENS) dataset includes a 40-member ensemble of climate simulations for the period 1920-2100 using historical data (1920-2005) or assuming the RCP8.5 greenhouse gas concentration scenario (2006-2100), as well as longer control runs based on pre-industrial conditions.

2. Community Earth System Model v2 Large Ensemble (CESM2 LENS)
https://registry.opendata.aws/ncar-cesm2-lens/
The US National Center for Atmospheric Research partnered with the IBS Center for Climate Physics in South Korea to generate the CESM2 Large Ensemble which consists of 100 ensemble members at 1 degree spatial resolution covering the period 1850-2100 under CMIP6 historical and SSP370 future radiative forcing scenarios.

3. Community Earth System Model v2 ARISE (CESM2 ARISE)
https://registry.opendata.aws/ncar-cesm2-arise/
Data from ARISE-SAI Experiments with CESM2

4. NA-CORDEX - North American component of the Coordinated Regional Downscaling Experiment
https://registry.opendata.aws/ncar-na-cordex/
The NA-CORDEX dataset contains regional climate change scenario data and guidance for North America, for use in impacts, decision-making, and climate science. The NA-CORDEX data archive contains output from regional climate models (RCMs) run over a domain covering most of North America using boundary conditions from global climate model (GCM) simulations in the CMIP5 archive.

5. CAM6 Data Assimilation Research Testbed (DART) Reanalysis: Cloud-Optimized Dataset
https://registry.opendata.aws/ncar-dart-cam6/
This is a cloud-hosted subset of the CAM6+DART (Community Atmosphere Model version 6 Data Assimilation Research Testbed) Reanalysis dataset. These data products are designed to facilitate a broad variety of research using the NCAR CESM 2.1 (National Center for Atmospheric Research's Community Earth System Model version 2.1), including model evaluation, ensemble hindcasting, data assimilation experiments, and sensitivity studies. They come from an 80 member ensemble reanalysis of the global troposphere and stratosphere using DART and CAM6. The data products represent states of the atmosphere consistent with observations from 2011 through 2019 at 1 degree horizontal resolution and weekly frequency.

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

If you are a data preparer. What is your location (City and Country)

No response

If you are a data preparer, how will the data be prepared? Please include tooling used and technical details?

No response

If you are not preparing the data, who will prepare the data? (Provide name and business)

No response

Has this dataset been stored on the Filecoin network before? If so, please explain and make the case why you would like to store this dataset again to the network. Provide details on preparation and/or SP distribution.

I found these valuable public datasets in closed applications that had been approved but never actually carried out. The SP I'm working with has strong storage needs, so I want to get started quickly on storing this data.

Please share a sample of the data

s3://ncar-cesm-lens/    82.7 TiB
s3://ncar-cesm2-lens/   309.3 TiB
s3://ncar-cesm2-arise/  597.2 TiB
s3://ncar-na-cordex/    13.2 TiB
s3://ncar-dart-cam6/    2.5 TiB
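
As a quick sanity check on the request size, the bucket sizes listed above can be summed and multiplied by the requested replica count. This is only a minimal sketch: the sizes and the replica count of 9 are taken from this application, and the variable names are illustrative.

```python
# Bucket sizes in TiB, as listed in the data sample above.
bucket_sizes_tib = {
    "ncar-cesm-lens": 82.7,
    "ncar-cesm2-lens": 309.3,
    "ncar-cesm2-arise": 597.2,
    "ncar-na-cordex": 13.2,
    "ncar-dart-cam6": 2.5,
}
REPLICAS = 9        # "Number of replicas to store" in this application
TIB_PER_PIB = 1024  # binary units: 1 PiB = 1024 TiB

one_copy_tib = sum(bucket_sizes_tib.values())
all_copies_pib = one_copy_tib * REPLICAS / TIB_PER_PIB

print(f"One copy: {one_copy_tib:.1f} TiB ({one_copy_tib / TIB_PER_PIB:.2f} PiB)")
print(f"{REPLICAS} copies: {all_copies_pib:.2f} PiB")
```

With these figures, one copy comes to about 1004.9 TiB (roughly 0.98 PiB) and nine copies to about 8.83 PiB, which is in line with the single-copy size of about 1 PiB and the 9PiB total requested.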

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

2 to 3 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America, Europe

How will you be distributing your data to storage providers

Cloud storage (i.e. S3), Shipping hard drives

How do you plan to choose storage providers

Slack, Big Data Exchange, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

| MinerID   | City       | Continent | 
| --------- | ---------- | --------- | 
| f01811024 | HongKong   | CN        | 
| f0827006  | Tokyo      | Japan     | 
| f02227726 | HongKong   | CN        | 
| f02195153 | Tokyo      | Japan     | 
| f02098006 | Krabi      | Thailand  | 
| f02223012 | Hebei      | CN        | 
| f02235154 | HongKong   | CN        | 
| f02085722 | Jiangmen   | CN        |

How do you plan to make deals to your storage providers

No response

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

FroghubMan commented 1 year ago

Hello @Sunnyiscoming, we would like to start the data processing and storage process quickly. Can you approve it for me?

kevzak commented 1 year ago

Hi @FroghubMan - can you confirm Entities involved here? Each ID is different SP?

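| MinerID   | City       | Continent |
| --------- | ---------- | --------- |
| f01811024 | HongKong   | CN        |
| f0827006  | Tokyo      | Japan     |
| f02227726 | HongKong   | CN        |
| f02195153 | Tokyo      | Japan     |
| f02098006 | Krabi      | Thailand  |
| f02223012 | Hebei      | CN        |
| f02235154 | HongKong   | CN        |
| f02085722 | Jiangmen   | CN        |
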
FroghubMan commented 1 year ago

Hi @kevzak, I can fully identify these entities, but I cannot disclose them publicly. Some miners belong to the same SP.

ghost commented 1 year ago

Hi @FroghubMan seeing 4 applications come in with same list of SPs.

Per https://github.com/filecoin-project/notary-governance/issues/922, for Open, Public Dataset applicants: please complete the Fil+ registration form to identify yourself as the applicant, and please also add the contact information of the SP entities you are working with to store copies of the data.

This information will be reviewed by Fil+ Governance team to confirm validity and then the application will be triggered for notary review. I know you said you could just fake it, but please provide real information about the entities that are essentially unknown to anyone in the community. Let us know if you have any questions.

FroghubMan commented 1 year ago

Hi @Sunnyiscoming, I have completed the Fil+ registration form. Can you review it for me and start the data storage process asap? Thank you so much.

ghost commented 1 year ago

@FroghubMan thank you for sharing SP Entity information:

  • f01811024 chimsen HongKong
  • f0827006 FrogStorage Tokyo
  • f02227726 Three Clouds Hong Kong
  • f02223012 Person Hebei
  • f02085722 Chimsen Jiangmen
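
To see how many distinct entities stand behind these miner IDs, the entries can be grouped by entity name. This is a rough sketch only: it assumes that names differing only in letter case ("chimsen" and "Chimsen") refer to the same entity, which the thread does not confirm.

```python
from collections import defaultdict

# (miner ID, entity, city) as shared in the comment above.
sp_entities = [
    ("f01811024", "chimsen", "HongKong"),
    ("f0827006", "FrogStorage", "Tokyo"),
    ("f02227726", "Three Clouds", "Hong Kong"),
    ("f02223012", "Person", "Hebei"),
    ("f02085722", "Chimsen", "Jiangmen"),
]

# Assumption: entity names that differ only in letter case are the same entity.
miners_by_entity = defaultdict(list)
for miner_id, entity, _city in sp_entities:
    miners_by_entity[entity.casefold()].append(miner_id)

for entity, miners in sorted(miners_by_entity.items()):
    print(f"{entity}: {len(miners)} miner(s) -> {', '.join(miners)}")
```

Under that assumption the five miner IDs map to four entities. Three IDs from the application's provider list (f02195153, f02098006, f02235154) do not appear in this comment, so they are not covered by the grouping.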

FroghubMan commented 1 year ago

Hi @Sunnyiscoming , can you support my applications?

ghost commented 1 year ago

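Just to re-confirm the process here @FroghubMan - after any allocations we will ask notaries to:

  • Confirm SP IDs and locations match your plan
  • Confirm data is retrievable and data sample matches said datasets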

dkkapur commented 1 year ago

If you are a data preparer. What is your location (City and Country) No response

If you are a data preparer, how will the data be prepared? Please include tooling used and technical details? No response

why were these questions not answered?

FroghubMan commented 1 year ago

Just to re-confirm the process here @FroghubMan - after any allocations we will ask notaries to:

  • Confirm SP IDs and locations match your plan
  • Confirm data is retrievable and data sample matches said datasets

sure

FroghubMan commented 1 year ago

If you are a data preparer. What is your location (City and Country) No response

If you are a data preparer, how will the data be prepared? Please include tooling used and technical details? No response

why were these questions not answered?

Sorry about that. We are in China. We download the datasets from AWS over a high-bandwidth connection and use Singularity to generate CAR files. Some of the files are transferred online, and the rest are shipped to SPs on hard drives.
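
The preparation step described above amounts to grouping downloaded files into bounded batches before each batch is packed into a CAR file. The sketch below illustrates only that grouping idea; it is not how Singularity works internally, and the 32 GiB target, the function name `plan_batches`, and the example file names and sizes are all hypothetical.

```python
from typing import Iterable

GIB = 1024 ** 3
TARGET_BATCH_BYTES = 32 * GIB  # assumed target size per CAR file (hypothetical)


def plan_batches(
    files: Iterable[tuple[str, int]], limit: int = TARGET_BATCH_BYTES
) -> list[list[str]]:
    """Greedily group (path, size) pairs into batches whose total size stays within limit.

    A file larger than the limit gets a batch of its own; splitting it is out of scope.
    """
    batches: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for path, size in files:
        # Start a new batch when adding this file would exceed the limit.
        if current and current_size + size > limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Hypothetical object keys and sizes, for illustration only.
example = [("a.nc", 20 * GIB), ("b.nc", 10 * GIB), ("c.nc", 5 * GIB), ("d.nc", 40 * GIB)]
print(plan_batches(example))
```

For the hypothetical input, the first two files share a batch, the third starts a new one, and the oversized fourth file ends up alone. A real pipeline would also need to split files that exceed the target size.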

Sunnyiscoming commented 1 year ago

Datacap Request Trigger

Total DataCap requested

9PiB

Expected weekly DataCap usage rate

1PiB

Client address

f17zbocai3pf7hks6eomldvnmkdvgn25zojia3agq

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f17zbocai3pf7hks6eomldvnmkdvgn25zojia3agq

DataCap allocation requested

460.79TiB

Id

503320d8-726a-4e3a-88c7-159f13f09641
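
The 460.79TiB figure is consistent with 5% of the 9PiB total and is below 50% of the 1PiB weekly request from the application form. The sketch below checks that arithmetic. Treating the first tranche as the lesser of those two quantities is an assumption here; this thread does not state how the bot derives the number.

```python
TIB_PER_PIB = 1024

total_requested_tib = 9 * TIB_PER_PIB   # 9PiB total DataCap requested
weekly_requested_tib = 1 * TIB_PER_PIB  # 1PiB weekly, from the application form

# Assumed rule (not stated in this thread): first tranche = min(5% of total, 50% of weekly).
five_percent_of_total = 0.05 * total_requested_tib
half_of_weekly = 0.5 * weekly_requested_tib
first_tranche_tib = min(five_percent_of_total, half_of_weekly)

print(f"5% of total:           {five_percent_of_total:.2f} TiB")
print(f"50% of weekly:         {half_of_weekly:.2f} TiB")
print(f"Assumed first tranche: {first_tranche_tib:.2f} TiB")
```

The computed value is 460.80 TiB; the 460.79TiB shown by the bot differs only in the last digit, presumably from truncation rather than rounding.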

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No application info found for this issue on https://filplus.d.interplanetary.one/clients.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

BDE-io commented 1 year ago

  1. i infer from this thread the applicant has completed the Fil+ registration form
  2. the applicant has provided information about the distribution of this dataset, and at this point, it looks like the replicas will be well distributed
  3. the applicant has provided information on its data preparation process, which sounds realistic and reasonable

therefore, happy to sign this first tranche. advise notaries to monitor distribution of data upon successful storage

BDE-io commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecjblygabfanriflatm2jfk3lplryemax6if5rulzadedn7hduhv2

Address

f17zbocai3pf7hks6eomldvnmkdvgn25zojia3agq

Datacap Allocated

460.79TiB

Signer Address

f1dvvrjur7tstos2paxdsvdpljqx53c74wsp5cl4q

Id

503320d8-726a-4e3a-88c7-159f13f09641

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecjblygabfanriflatm2jfk3lplryemax6if5rulzadedn7hduhv2

FroghubMan commented 1 year ago

We are preparing the data.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

-- Commented by Stale Bot.

FroghubMan commented 1 year ago

Hi @kevzak @Sunnyiscoming. We already have the data ready for this application. Can you reopen it for me?

FroghubMan commented 1 year ago

Hello @kevzak @Sunnyiscoming, could you help me reopen this application so that we can continue the work? Thank you!

FroghubMan commented 1 year ago

@galen-mcandrew @raghavrmadya @Kevin-FF-USA

FroghubMan commented 1 year ago

Hello. Any update?

1ane-1 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No application info found for this issue on https://filplus.d.interplanetary.one/clients.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

1ane-1 commented 1 year ago

Supporting based on the information above. Since this is the first round, we will check your retrieval and SP information later. Good luck.

1ane-1 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecw44x4hqzzam5rjbhy43uit6og3d23wj5i54fxzwsjxinhg5eu4i

Address

f17zbocai3pf7hks6eomldvnmkdvgn25zojia3agq

Datacap Allocated

460.79TiB

Signer Address

f1mdk7s2vntzm6hu35yuo6vjubtrpfnb2awhgvrri

Id

503320d8-726a-4e3a-88c7-159f13f09641

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecw44x4hqzzam5rjbhy43uit6og3d23wj5i54fxzwsjxinhg5eu4i

kevzak commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

FroghubMan commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

FroghubMan commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

FroghubMan commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

FroghubMan commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

FroghubMan commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

FroghubMan commented 1 year ago

@Sunnyiscoming Can you help me take a look at this issue? I've been unable to get the checker bot to generate a report.

FroghubMan commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

FroghubMan commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...


FroghubMan commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...


sxxfuture-official commented 1 year ago

Reports checked, LGTM.

sxxfuture-official commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacearp7cbh4b6xgovgih7sik76mil52pg2cwgnkmmhvzrama5nor2cw

Address

f17zbocai3pf7hks6eomldvnmkdvgn25zojia3agq

Datacap Allocated

460.79TiB

Signer Address

f1foiomqlmoshpuxm6aie4xysffqezkjnokgwcecq

Id

503320d8-726a-4e3a-88c7-159f13f09641

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacearp7cbh4b6xgovgih7sik76mil52pg2cwgnkmmhvzrama5nor2cw

1ane-1 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebuuuzcud7ozazpca4ddf4s6465qlxtqmvrubjm5r3yezu75wr4ie

Address

f17zbocai3pf7hks6eomldvnmkdvgn25zojia3agq

Datacap Allocated

460.79TiB

Signer Address

f1mdk7s2vntzm6hu35yuo6vjubtrpfnb2awhgvrri

Id

503320d8-726a-4e3a-88c7-159f13f09641

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebuuuzcud7ozazpca4ddf4s6465qlxtqmvrubjm5r3yezu75wr4ie