filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] GEFS Re-forecast #2011

Closed Killianxuan closed 11 months ago

Killianxuan commented 1 year ago

Data Owner Name

NOAA

What is your role related to the dataset

Data Preparer

Data Owner Country/Region

United States

Data Owner Industry

Environment

Website

http://www.noaa.gov/

Social Media

https://twitter.com/NOAA
https://www.facebook.com/NOAA
https://www.instagram.com/noaa

Total amount of DataCap being requested

5PiB

Expected size of single dataset (one copy)

392TiB

Number of replicas to store

10

Weekly allocation of DataCap requested

1PiB

On-chain address for first allocation

f1dsdtj54saufjswcfyp5t3feqe3i6aw5dppxe4va

Data Type of Application

Public, Open Dataset (Research/Non-Profit)

Custom multisig

Share a brief history of your project and organization

NOAA is an agency that enriches life through science. Our reach goes from the surface of the sun to the depths of the ocean floor as we work to keep the public informed of the changing environment around them. I'm a data preparer (DP) with some experience distributing data.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

NOAA has generated a multi-decadal reanalysis and reforecast data set to accompany the next-generation version of its ensemble prediction system, the Global Ensemble Forecast System, version 12 (GEFSv12).

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

IPFS, lotus, singularity

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://registry.opendata.aws/noaa-gefs-reforecast/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

1.5 to 2 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America, Europe

How will you be distributing your data to storage providers

HTTP or FTP server, IPFS, Shipping hard drives

How do you plan to choose storage providers

Slack, Big Data Exchange, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

No response

How do you plan to make deals to your storage providers

Lotus client, Singularity

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Sunnyiscoming commented 1 year ago

Please share more detailed information about the SPs you will cooperate with.

Killianxuan commented 1 year ago

@Sunnyiscoming We are in contact with the SPs Doco and Mond in Hong Kong, and I'll continue looking for SPs in other regions. I'm checking out their technology.

Sunnyiscoming commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

1PiB

Client address

f1dsdtj54saufjswcfyp5t3feqe3i6aw5dppxe4va

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1dsdtj54saufjswcfyp5t3feqe3i6aw5dppxe4va

DataCap allocation requested

256TiB

Id

55759f5d-12bd-4974-a68b-f7c5be6ab587

Wengeding commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacec4bcuvspxifyq4ld7nnng45ydv2w37vy2ec3vkgao4baf6ofsuh2

Address

f1dsdtj54saufjswcfyp5t3feqe3i6aw5dppxe4va

Datacap Allocated

256.00TiB

Signer Address

f1txfsjmix4vlzido4dkildrnbw26owtlbslexmpa

Id

55759f5d-12bd-4974-a68b-f7c5be6ab587

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacec4bcuvspxifyq4ld7nnng45ydv2w37vy2ec3vkgao4baf6ofsuh2

AthSmith commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedrpcnh2ngg5nwwkywr3ulrloxwgvniodf4rw2rgzhttpg3tj2rzo

Address

f1dsdtj54saufjswcfyp5t3feqe3i6aw5dppxe4va

Datacap Allocated

256.00TiB

Signer Address

f1vxbqrf7rfum3n6m5u6eb4re6xj7amvsaqnzu64y

Id

55759f5d-12bd-4974-a68b-f7c5be6ab587

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedrpcnh2ngg5nwwkywr3ulrloxwgvniodf4rw2rgzhttpg3tj2rzo

AthSmith commented 1 year ago

The client contacted me on Slack and shared the storage plan. I'd like to see more public data joining the network!

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

Sunnyiscoming commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f1dsdtj54saufjswcfyp5t3feqe3i6aw5dppxe4va

DataCap allocation requested

512TiB

Id

76666268-7556-4002-84e8-8f11e9bc3d92

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f1dsdtj54saufjswcfyp5t3feqe3i6aw5dppxe4va

Rule to calculate the allocation request amount

100% weekly > 0.5PiB, requesting 0.5PiB

DataCap allocation requested

512TiB

Total DataCap granted for client so far

256TiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

4.75PiB
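The figures in this bot comment can be checked with a few lines of arithmetic. The sketch below is not the bot's code; it only reproduces the numbers quoted above. The function names are hypothetical, and the 0.5PiB cap is taken directly from the rule line (the thread does not say how that cap is derived).

```python
TIB_PER_PIB = 1024  # binary units: 1 PiB = 1024 TiB


def next_request_tib(weekly_tib, weekly_fraction, cap_tib):
    """Return the smaller of (fraction of the weekly rate) and the cap."""
    return min(weekly_tib * weekly_fraction, cap_tib)


def remaining_tib(total_requested_tib, granted_tib):
    """DataCap still to be granted to reach the total requested."""
    return total_requested_tib - granted_tib


total = 5 * TIB_PER_PIB   # 5PiB requested in the application
weekly = 1 * TIB_PER_PIB  # 1PiB weekly usage rate
granted = 256             # TiB granted in the first allocation
cap = 0.5 * TIB_PER_PIB   # 0.5PiB cap quoted by the bot (origin assumed, not stated)

request = next_request_tib(weekly, 1.0, cap)  # "100% weekly > 0.5PiB"
remaining = remaining_tib(total, granted)

print(f"requesting {request:.0f}TiB, {remaining / TIB_PER_PIB}PiB left to grant")
```

With these inputs the request comes out at 512TiB and the amount left to grant at 4.75PiB, matching the values the bot reports.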

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 2628 | 2 | 256TiB | 77.93 | 38.25TiB |

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

Killianxuan commented 1 year ago

OK. I'll go on allocating DataCap to SPs.

spaceT9 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 2 storage providers have unknown IP location - f02218611, f02207907

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.
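For readers unfamiliar with the replication warning in the report above, here is a minimal sketch of how such a figure could be computed from a list of deals. The checker's real implementation is not shown in this thread, so the logic, the function name, and the sample piece and provider IDs are all assumptions made for illustration.

```python
from collections import defaultdict


def replication_report(deals, min_replicas=3):
    """Summarise deals given as (piece_cid, provider_id) pairs.

    Returns the percentage of deals whose piece is held by fewer than
    `min_replicas` distinct providers, and the top provider's share of deals.
    """
    deals = list(deals)
    if not deals:
        return {"under_replicated_pct": 0.0, "top_provider_pct": 0.0}

    providers_by_piece = defaultdict(set)
    deals_by_provider = defaultdict(int)
    for piece_cid, provider_id in deals:
        providers_by_piece[piece_cid].add(provider_id)
        deals_by_provider[provider_id] += 1

    under = sum(
        1 for piece_cid, _ in deals
        if len(providers_by_piece[piece_cid]) < min_replicas
    )
    top = max(deals_by_provider.values())
    return {
        "under_replicated_pct": 100.0 * under / len(deals),
        "top_provider_pct": 100.0 * top / len(deals),
    }


# Hypothetical sample: three pieces spread over only two providers.
sample = [
    ("piece-a", "sp-1"), ("piece-a", "sp-2"),
    ("piece-b", "sp-1"), ("piece-c", "sp-1"),
]
report = replication_report(sample)
print(report)
```

In this sample every piece sits with fewer than three providers, so all deals count as under-replicated, which is the situation the warning describes for this client.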

cryptowhizzard commented 1 year ago

Hi,

I want to retrieve this data like you stated in the LDN @Killianxuan

Your SPs don't have retrieval enabled. Why?

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

Killianxuan commented 1 year ago

@cryptowhizzard The SP is in the middle of a software upgrade and will continue supporting retrieval in the future.

spaceT9 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

BobbyChoii commented 1 year ago

LGTM, willing to support! Also, try to increase the number of SPs ASAP.

BobbyChoii commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecsl33kf2fwbqkq6szc2rxsjmsrmgwnqkl27ywg7d7vtrcc4ulxye

Address

f1dsdtj54saufjswcfyp5t3feqe3i6aw5dppxe4va

Datacap Allocated

512.00TiB

Signer Address

f1irqs2gmctiv3jcdfwuch7oxvf4ixh3k4b2wc24i

Id

76666268-7556-4002-84e8-8f11e9bc3d92

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecsl33kf2fwbqkq6szc2rxsjmsrmgwnqkl27ywg7d7vtrcc4ulxye

cryptowhizzard commented 1 year ago

@BobbyChoii

Error: retrieval query for miner f02207907 failed: failed to open stream to peer: protocols not supported: [/fil/retrieval/qry/1.0.0]

Why didn't you test retrieval for this client before signing 0.5 PB on the first allocation?

herrehesse commented 1 year ago

All self-dealing, @BobbyChoii is doing no DD at all.

Please act @raghavrmadya @Kevin-FF-USA @dkkapur

Killianxuan commented 1 year ago

@cryptowhizzard The miner f02207907 can be retrieved successfully.

image

@BobbyChoii Thanks for your support!

Killianxuan commented 1 year ago

@herrehesse I looked to notaries for help, and Bobby reviewed my application and responded to my request kindly and responsibly. Please do not try to blame others.

cryptowhizzard commented 1 year ago

Screenshot 2023-07-26 at 10 13 20

2 things: A:) You are indeed right. HTTP retrieval is supported. B:) The speed is so slow that this won't work. I can't do due diligence on your data. Can you make baga6ea4seaqmbtrigksyggoxcura75xhqscgdp2jfhgbjgpnm67lhu5zeln2goq available somewhere for fast download?

Killianxuan commented 1 year ago

Maybe you need to check your network, I guess.

cryptowhizzard commented 1 year ago

> Maybe you need to check your network, I guess.

I checked from different places around the globe, and the speed was the same everywhere.

> B:) The speed is so slow that this won't work. I can't do due diligence on your data. Can you make baga6ea4seaqmbtrigksyggoxcura75xhqscgdp2jfhgbjgpnm67lhu5zeln2goq available somewhere for fast download?

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

Killianxuan commented 1 year ago

Asking SPs to set up a high-speed download link especially for you is obviously unfair to other users and would add an unnecessary burden on the SPs. In addition, download speed is affected by many factors; we hope you can understand.

cryptowhizzard commented 1 year ago

Hello @Killianxuan

No, I don't understand. Your data must be readily retrievable for anyone on the network. These are the rules. Secondly, I want to do due diligence on the data that you are storing.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

clriesco commented 1 year ago

Removed stale label and reopened issue :)

cryptowhizzard commented 1 year ago

Screenshot 2023-08-31 at 15 27 38

Cheers!

After the long awaited download we finally managed to get samples of your fantastic data. It must have been intensive work to compile this dataset.

Anyway, I made the retrieved data available for download here:

http://www.datasetcreators.com/downloadedcarfiles/httpretrievals/2011-f02204322-f02207907-42262331-baga6ea4seaqep4jkrisrurawchd2wvei65vgfarvmaihn4yl47vvkrmlani24dy

As this garbage is not what you told the community you were going to store, I took the liberty of creating a dispute. I hope you don't mind.

Screenshot 2023-08-31 at 15 29 07

Killianxuan commented 1 year ago

This picture does not show the files we store! Please check it yourself.

Killianxuan commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

Killianxuan commented 1 year ago

Our application has nothing to do with what @cryptowhizzard said. With more DataCap, we will find more SPs to store the data and finish the project. We hope the notaries will help us by signing!

Killianxuan commented 1 year ago

image

This is what I've downloaded. We can prove that @cryptowhizzard's result is wrong.

cryptowhizzard commented 1 year ago

Dear Killianxuan,

As notary I am doing due diligence on your LDN. I could not get retrieval to work. Can you please upload the car file of CID baga6ea4seaqep4jkrisrurawchd2wvei65vgfarvmaihn4yl47vvkrmlani24dy ?

You can use our upload system at http://send.datasetcreators.com. Please select 7 days for the system to keep the file and post the link you received here so I (and other notaries) can download your content.

cryptowhizzard commented 1 year ago

@Killianxuan

As per above then, prove me wrong and upload the data.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

Killianxuan commented 1 year ago

image

This is what I've downloaded. We can prove that @cryptowhizzard's result is wrong.

Actually, we have proved that cryptowhizzard's result is wrong.

Casey-PG commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

Casey-PG commented 1 year ago

DD has been done. Deal Data Replication needs to be improved in the next round. Willing to support!