filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Coupled Model Intercomparison Project 6 (CMIP6) [3/3] #1892

Closed VincentShii closed 1 year ago

VincentShii commented 1 year ago

Data Owner Name

ESGF and Pangeo

Data Owner Country/Region

United States

Data Owner Industry

Life Science / Healthcare

Website

https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6

Social Media

https://pangeo-data.github.io/pangeo-cmip6-cloud/

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

1PiB

On-chain address for first allocation

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Data Type of Application

Public, Open Dataset (Research/Non-Profit)

Custom multisig

Identifier

No response

Share a brief history of your project and organization

Organization: Yuga was founded in 2019, with core members mainly from Microsoft, AWS, and other leading organizations. We currently have about 20 employees in our Shanghai office. For three years we have been dedicated to building and optimizing industry-leading order encapsulation components that enable rapid processing and encapsulation of real data, in continuous support of storage providers and data content providers.

Project: CMIP began in 1995 under the auspices of the Working Group on Coupled Modelling (WGCM). The first set of common experiments involved comparing the model response to an idealized forcing – a constant rate of increase which was accomplished using a CO2 increase of 1% per year compounded.

Is this project associated with other projects/ecosystem stakeholders?

Yes

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

Zarr formatted data
The sixth phase of global coupled ocean-atmosphere general circulation model ensemble.

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

lotus

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://registry.opendata.aws/cmip6/
s3://cmip6-pds/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

1.5 to 2 years

In which geographies do you plan on making storage deals

Asia other than Greater China, North America, South America, Europe

How will you be distributing your data to storage providers

Shipping hard drives, Lotus built-in data transfer

How do you plan to choose storage providers

Slack, Filmine, Big data exchange, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

No response

How do you plan to make deals to your storage providers

Lotus client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Sunnyiscoming commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

1PiB

Client address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

DataCap allocation requested

256TiB

Id

db8a9067-0f3e-431c-871f-cf1d66603f9e

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

DataCap allocation requested

256TiB

Id

618942e6-e6e7-491b-b3a9-0edbbf7a513c

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No application info found for this issue on https://filplus.d.interplanetary.one/clients.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

cryptowhizzard commented 1 year ago

Dear @Ehume-Yuga,

Thank you for applying for DataCap. As a Filecoin FIL+ notary, I am screening your application and conducting due diligence.

Looking at your application, I have some questions. You are brand new on GitHub and have no history of past applications. Can you introduce yourself and your company?

You stated that you are a data preparer. What hardware do you have to prepare this dataset? What is your internet connection speed? Do you already have the data downloaded on the premises?

Applying for 5PB of DataCap seems like a lot to me. It requires comprehensive knowledge of Filecoin, of packing and distributing data, and of all the requirements that come with it. Are you brand new to the Filecoin space, or have you applied for DataCap in the past under different GitHub account names?

Can you show us some visible proof of the size of your data and of the storage and packing hardware you have?

As a last question, I would like you to fill out this form to give us the information we need to make an educated decision on whether to support your LDN request.

Thanks!

TakiChain commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebkrdh2a63tha5vav73nytirduavy7oue2lehdna7mqedfxk53qee

Address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Datacap Allocated

256.00TiB

Signer Address

f15impf3j2zcaex4lhyxndxswuuhv24vzstuqtxsi

Id

618942e6-e6e7-491b-b3a9-0edbbf7a513c

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebkrdh2a63tha5vav73nytirduavy7oue2lehdna7mqedfxk53qee

BobbyChoii commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceafkgpogsiyru25hepi4tb3ta5wuywjctug2i54q2avzmr3bdftmu

Address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Datacap Allocated

256.00TiB

Signer Address

f1irqs2gmctiv3jcdfwuch7oxvf4ixh3k4b2wc24i

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceafkgpogsiyru25hepi4tb3ta5wuywjctug2i54q2avzmr3bdftmu

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

DataCap allocation requested

512TiB

Id

b5b949ad-a3f9-44b6-b72e-024df6bec72b

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Rule to calculate the allocation request amount

10% of total dc amount requested

DataCap allocation requested

512TiB

Total DataCap granted for client so far

256TiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

4.75PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| null | null | 256TiB | null | 64.09TiB |

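The figures in this report can be checked by hand. The following is a minimal sketch, assuming binary units (1 PiB = 1024 TiB) and assuming the rule is a plain percentage of the total request. The function names are illustrative and are not taken from the bot's code.

```python
# Assumption: binary units, 1 PiB = 1024 TiB.
TIB_PER_PIB = 1024


def tranche_tib(total_pib: float, percent: float) -> float:
    """Size in TiB of a tranche that is `percent` of the total request."""
    return total_pib * TIB_PER_PIB * percent / 100


def remaining_pib(total_pib: float, granted_tib: float) -> float:
    """DataCap in PiB still to be granted after `granted_tib` was allocated."""
    return total_pib - granted_tib / TIB_PER_PIB


total_pib = 5      # total requested in this application
granted_tib = 256  # first allocation, already granted

print(tranche_tib(total_pib, 10))              # 512.0 (TiB)
print(remaining_pib(total_pib, granted_tib))   # 4.75 (PiB)
```

Both results match the report above: the second tranche is 512TiB, and 4.75PiB remains to be granted after the first 256TiB.
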
Casey-PG commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaced3puinmubltdxhf5cn6fgleco24hneb6lyramrsghx5mwb3qpdie

Address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Datacap Allocated

512.00TiB

Signer Address

f1d4yb3wags3mtddzesxoo63jv7dmlec3bq4yteni

Id

b5b949ad-a3f9-44b6-b72e-024df6bec72b

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaced3puinmubltdxhf5cn6fgleco24hneb6lyramrsghx5mwb3qpdie

MEIYAN666 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceb3ilemrrd7srs3xfsdhyrdzxz6anmmpecmcrqlof4lbshjatu52a

Address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Datacap Allocated

512.00TiB

Signer Address

f1bwugfihrmn3iyunzyxst5nttql3dge4khwmurtq

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceb3ilemrrd7srs3xfsdhyrdzxz6anmmpecmcrqlof4lbshjatu52a

MEIYAN666 commented 1 year ago

The dataset is valuable, and the client has shared the allocation plan. We're confident in kicking off this application.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 3

Multisig Notary address

f02049625

Client address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

DataCap allocation requested

1PiB

Id

90b55123-cb5d-4848-8795-c75d2cc2e902

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Rule to calculate the allocation request amount

20% of total dc amount requested

DataCap allocation requested

1PiB

Total DataCap granted for client so far

465661.3YiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

-5.62B

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| null | null | 512TiB | null | 129.84TiB |

TakiChain commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecoataj4lmnvfei7obxsfc52swqbp64hjhnvxvhgtjqupl5wbzg6k

Address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Datacap Allocated

1.00PiB

Signer Address

f15impf3j2zcaex4lhyxndxswuuhv24vzstuqtxsi

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecoataj4lmnvfei7obxsfc52swqbp64hjhnvxvhgtjqupl5wbzg6k

cryptowhizzard commented 1 year ago

Hi there,

I have done due diligence on your application.

None of the data is retrievable. This is against Fil+ guidelines. The data is only distributed in Hong Kong, not outside Asia. This is also against FIL+ guidelines.

Is there any reason why you are not following the FIL+ rules?

[Image: screenshot, 2023-05-05 at 20:02:04]

TakiChain commented 1 year ago

In our opinion, retrievals do not need to be, and cannot be, performed by every notary, because they are limited by the quality of network links and by regional policies in each region. This is the main reason why the community needs to select so many notaries across regions. For the DPs, however, the selected storage nodes need to be "subjectively" public to all regions for retrieval. We need to be aware of the subtle difference between these two things, especially as notaries.

We suggest that notaries who currently cannot complete retrievals, or can complete them only partially, should not be anxious and can refer to the CID reports before deciding whether to sign. Don't forget that you have the right to choose not to sign. The key indicators, which are constantly updated, are reflected in the CID reports by P.L., and these reports are the main reference material we use for our signing.

BobbyChoii commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedxjjv4jn54q5jkmcs3rbtlr6j6p26zonswkbbt4ks5kut52ydrg2

Address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Datacap Allocated

1.00PiB

Signer Address

f1irqs2gmctiv3jcdfwuch7oxvf4ixh3k4b2wc24i

Id

90b55123-cb5d-4848-8795-c75d2cc2e902

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedxjjv4jn54q5jkmcs3rbtlr6j6p26zonswkbbt4ks5kut52ydrg2

BobbyChoii commented 1 year ago

Retrieval works fine, willing to support!

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 5

Multisig Notary address

f02049625

Client address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

DataCap allocation requested

2.25PiB

Id

c8f94f9a-3bac-4e57-8988-f32c9e2636ba

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Rule to calculate the allocation request amount

80% of total dc amount requested

DataCap allocation requested

2.25PiB

Total DataCap granted for client so far

9.313225746154786e+36YiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

-1.12B

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 65505 | 13 | 1PiB | 17.64 | 239.84TiB |

Bennyyangpu commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 35.52% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.
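To make the "Deal Data Replication" figure concrete, here is a minimal sketch of one way such a percentage could be computed. It assumes each deal is a `(piece_cid, provider_id)` pair and that the figure counts deals rather than bytes. The real checker may weight by deal size or use different identifiers, and the function name and sample data below are hypothetical.

```python
from collections import defaultdict


def pct_under_replicated(deals, min_providers=4):
    """Percentage of deals whose piece is stored by fewer than
    `min_providers` distinct storage providers.

    `deals` is an iterable of (piece_cid, provider_id) pairs, one per deal.
    """
    deals = list(deals)
    if not deals:
        return 0.0

    # Collect the distinct providers that hold each piece.
    providers_by_piece = defaultdict(set)
    for piece_cid, provider_id in deals:
        providers_by_piece[piece_cid].add(provider_id)

    # Count deals whose piece has too few distinct providers.
    low = sum(
        1
        for piece_cid, _ in deals
        if len(providers_by_piece[piece_cid]) < min_providers
    )
    return 100.0 * low / len(deals)


# Hypothetical sample: pieceA is on 4 providers, pieceB on only 2.
sample = [
    ("pieceA", "f01"), ("pieceA", "f02"), ("pieceA", "f03"), ("pieceA", "f04"),
    ("pieceB", "f01"), ("pieceB", "f02"),
]
print(pct_under_replicated(sample))
```

In the sample, two of the six deals belong to a piece held by fewer than four providers, so one third of the deals would be flagged.
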

Bennyyangpu commented 1 year ago

The report looks healthy, willing to support!

Bennyyangpu commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecshuijxuzakbj5ljqb3odstvv3do2pwzfiirsjpnqp6byz6vzzj2

Address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Datacap Allocated

2.25PiB

Signer Address

f174fg3bqbln3zjnkxtyf6s54txqkr7yqkj6cig7y

Id

c8f94f9a-3bac-4e57-8988-f32c9e2636ba

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecshuijxuzakbj5ljqb3odstvv3do2pwzfiirsjpnqp6byz6vzzj2

AthSmith commented 1 year ago

No duplicate data, healthy report. Please keep the allocation more decentralized in the future.

AthSmith commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaced5aox7vexlqguj7u7hpthvoik4cahna3kjp5tim4lhcicd4yr3pu

Address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Datacap Allocated

2.25PiB

Signer Address

f1vxbqrf7rfum3n6m5u6eb4re6xj7amvsaqnzu64y

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaced5aox7vexlqguj7u7hpthvoik4cahna3kjp5tim4lhcicd4yr3pu

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

VincentShii commented 1 year ago

Got it.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

VincentShii commented 1 year ago

please reopen it

Sunnyiscoming commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 57.16% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

Sunnyiscoming commented 1 year ago

It seems that the retrieval rate of some SPs shows 0. Can you explain?

VincentShii commented 1 year ago

@Sunnyiscoming I have instructed the SPs to support retrieval. Some SPs are updating their code.

cryptowhizzard commented 1 year ago

This client is actively stalling HTTP retrievals and has blocked HTTP range requests with a reverse proxy to prevent its data from being investigated.

It works as follows:

A bandwidth limit is set with NGINX on the HTTP retrieval. After a certain random amount, the limit is set to zero, which makes the transfer time out. Because range retrieval is disabled in NGINX, one cannot pick up where the transfer left off and has to start all over again.

Log can be found at http://datasetcreators.com/downloadedcarfiles/logs/1892.log
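Whether a server honors range requests can be probed independently. The following is a minimal sketch using only the Python standard library. It relies on standard HTTP behavior: a server that honors a `Range` header answers `206 Partial Content` with a `Content-Range` header, while a server that ignores it typically answers `200` with the full body. The function names and the byte range are illustrative, and no real retrieval endpoint is assumed.

```python
import urllib.error
import urllib.request


def honors_range(status, header_names):
    """True if a reply to a ranged GET looks like a valid partial response."""
    names = {name.lower() for name in header_names}
    return status == 206 and "content-range" in names


def probe(url, first=0, last=1023, timeout=30.0):
    """Request bytes first..last of `url` and report whether the server
    answered with a partial response."""
    request = urllib.request.Request(
        url, headers={"Range": f"bytes={first}-{last}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return honors_range(response.status, response.headers.keys())
    except urllib.error.HTTPError as err:
        # 4xx/5xx replies (for example 416 Range Not Satisfiable) land here.
        return honors_range(err.code, err.headers.keys())
```

Calling `probe()` with a provider's HTTP retrieval URL returns `False` when ranged downloads are not available, which is the condition under which an interrupted transfer cannot be resumed.
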

ghost commented 1 year ago

Hello @VincentShii per the new guidelines https://github.com/filecoin-project/notary-governance/issues/922 for Open Dataset applicants, please complete the following Fil+ registration form to identify yourself as the applicant and also please add the contact information of the SP entities you are working with to store copies of the data.

This information will be reviewed by Fil+ Governance team to confirm validity toward the Fil+ guideline of a distributed storage plan and SPs posted in the comments here. Let us know if you have any questions.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

large-datacap-requests[bot] commented 1 year ago

The issue reached the total datacap requested. This should be closed

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f1nefzwhevpd3u5x6i7bpagkkvcnyrq6ukcjo35ri

Rule to calculate the allocation request amount

total dc reached

DataCap allocation requested

0

Total DataCap granted for client so far

2.0954757928848267e+53YiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

2.0954757928848267e+53YiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 98494 | 21 | 2.25PiB | 11.14 | 1.08PiB |

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

-- Commented by Stale Bot.