filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Public Datasets --- Lidar+4DN+Nanopore+Sentinel+Sunflower Genome #1908

Closed enochleon closed 1 year ago

enochleon commented 1 year ago

Data Owner Name

NOAA; 4DN-DCIC; NWGSC; NRC; UBC

Data Owner Country/Region

United States

Data Owner Industry

Other

Website

http://www.noaa.gov/

Social Media

https://twitter.com/NOAA
https://twitter.com/4dn_dcic
https://www.facebook.com/EnvironmentandNaturalResourcesinCanada/
https://twitter.com/NRCan
https://www.youtube.com/user/NaturalResourcesCa

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

800TiB

On-chain address for first allocation

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Data Type of Application

None

Custom multisig

Identifier

No response

Share a brief history of your project and organization

I'm a data preparer (DP). I prepare public datasets and provide CAR files to storage providers (SPs).
NOAA Coastal Lidar Data: NOAA is an agency that enriches life through science. Our reach goes from the surface of the sun to the depths of the ocean floor as we work to keep the public informed of the changing environment around them.
4D Nucleome (4DN): The 4D Nucleome Network aims to understand the principles behind the three-dimensional organization of the nucleus in space and time (the 4th dimension) and the role nuclear organization plays in gene expression and cellular function. The Network will utilize existing omics and imaging technologies as well as develop new ones to generate data and create resources to enable the study of the 4D Nucleome.
Nanopore Reference Human Genome: This dataset includes the sequencing and assembly of a reference standard human genome (GM12878) using the MinION nanopore sequencing instrument with the R9.4 1D chemistry.
Sentinel Near Real-time Canada Mirror: Near Real-time (NRT) Sentinel Mirror connected to the EU Copernicus programme, focused on Canadian coverage. In 2015, Canada joined the Sentinel collaborative ground segment which introduced an NRT Sentinel mirror site for users and programs inside the Government of Canada (GC).
University of British Columbia Sunflower Genome Dataset: This dataset is a catalogue of functionally important variation that can be used to create better-adapted sunflower varieties (e.g. developing breeding lines with improved drought tolerance) and to study the genetic mechanisms driving evolution and adaptation.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

NOAA Coastal Lidar Dataset
Released and archived 4DNucleome data
Nanopore Reference Human Genome
NRT Sentinel data in an S3 bucket broken down by sensor, product type and date.
UBC Sunflower Genome Data

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

lotus, singularity

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://registry.opendata.aws/noaa-coastal-lidar/
https://registry.opendata.aws/4dnucleome/
https://registry.opendata.aws/nanopore/
https://registry.opendata.aws/sentinel-products-ca-mirror/
https://registry.opendata.aws/ubc-sunflower-genome/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

1 to 1.5 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America, South America, Europe, Australia (continent)

How will you be distributing your data to storage providers

HTTP or FTP server, Shipping hard drives, Lotus built-in data transfer

How do you plan to choose storage providers

Slack, Filmine, Big data exchange

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

No response

How do you plan to make deals to your storage providers

Lotus client, Singularity

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Sunnyiscoming commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

800TiB

Client address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

DataCap allocation requested

256TiB

Id

59506ebd-1f73-47b5-8e7f-d0dba4f7291a

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

DataCap allocation requested

256TiB

Id

a9ae7922-42b4-4e3b-b152-674fd9509a7b

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No application info found for this issue on https://filplus.d.interplanetary.one/clients.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

Bennyyangpu commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebki2vpag7artvzsqikpbdsaranxo63xr2optbblrqjvwr3kkxdjq

Address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Datacap Allocated

256.00TiB

Signer Address

f174fg3bqbln3zjnkxtyf6s54txqkr7yqkj6cig7y

Id

a9ae7922-42b4-4e3b-b152-674fd9509a7b

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebki2vpag7artvzsqikpbdsaranxo63xr2optbblrqjvwr3kkxdjq

MEIYAN666 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecowezqae3tqixzi4elg6imf4rstwlpfultkdiafor4kq54hupizk

Address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Datacap Allocated

256.00TiB

Signer Address

f1bwugfihrmn3iyunzyxst5nttql3dge4khwmurtq

Id

a9ae7922-42b4-4e3b-b152-674fd9509a7b

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecowezqae3tqixzi4elg6imf4rstwlpfultkdiafor4kq54hupizk

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

DataCap allocation requested

512TiB

Id

c6822b80-8974-49b2-8b2a-9f10878d93fa

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Rule to calculate the allocation request amount

10% of total dc amount requested

DataCap allocation requested

512TiB

Total DataCap granted for client so far

256TiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

4.75PiB
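
The figures in this bot comment can be checked by hand. The following is a minimal sketch, assuming binary units (1PiB = 1024TiB) and assuming the rule means a flat percentage of the total request; the function names are illustrative and are not taken from the bot's code.

```python
TIB_PER_PIB = 1024  # binary units: 1 PiB = 1024 TiB


def tranche_tib(total_pib: float, percent: float) -> float:
    """Size of one allocation tranche in TiB, as a percentage of the total request."""
    return total_pib * TIB_PER_PIB * percent / 100


def remaining_pib(total_pib: float, granted_tib: float) -> float:
    """DataCap still to be granted, in PiB."""
    return (total_pib * TIB_PER_PIB - granted_tib) / TIB_PER_PIB


total_pib = 5      # total DataCap requested in this application
granted_tib = 256  # first allocation, already granted

print(tranche_tib(total_pib, 10))             # 512.0 TiB, the second request
print(remaining_pib(total_pib, granted_tib))  # 4.75 PiB still to be granted
```

Under the same assumption, 20% and 40% of 5PiB give 1PiB and 2PiB.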

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| null | null | 256TiB | null | 104.56TiB |

BobbyChoii commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceb54el6zwcrmsgbpa4tczbrpavcjao6d2rjniqz5fy6ikc5bxoe4a

Address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Datacap Allocated

512.00TiB

Signer Address

f1irqs2gmctiv3jcdfwuch7oxvf4ixh3k4b2wc24i

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceb54el6zwcrmsgbpa4tczbrpavcjao6d2rjniqz5fy6ikc5bxoe4a

TakiChain commented 1 year ago

Thank you for bringing more trustworthy data onto the network. Please continue to follow the Fil+ guidelines.

TakiChain commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedmaom6qu3jthtq2vy463i2gdmggga45h7tqqf4kurjqmrstwltaq

Address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Datacap Allocated

512.00TiB

Signer Address

f15impf3j2zcaex4lhyxndxswuuhv24vzstuqtxsi

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedmaom6qu3jthtq2vy463i2gdmggga45h7tqqf4kurjqmrstwltaq

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 3

Multisig Notary address

f02049625

Client address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

DataCap allocation requested

1PiB

Id

68c46316-28c0-4108-a4e6-ed9b3daaab4f

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Rule to calculate the allocation request amount

20% of total dc amount requested

DataCap allocation requested

1PiB

Total DataCap granted for client so far

465661.3YiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

-5.62B
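
The two totals reported above do not match the allocations recorded earlier in this thread. As a hedged cross-check, assuming the 256TiB and 512TiB allocations were both granted in full, the sketch below sums them in bytes and formats the result in binary units. The helper names are illustrative and do not come from the bot.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def to_bytes(value: int, unit: str) -> int:
    """Convert a value in a binary unit (e.g. 256, "TiB") to bytes."""
    return value * 1024 ** UNITS.index(unit)


def human(n_bytes: int) -> str:
    """Format a byte count with the largest binary unit that keeps it >= 1."""
    sign = "-" if n_bytes < 0 else ""
    n = float(abs(n_bytes))
    idx = 0
    while n >= 1024 and idx < len(UNITS) - 1:
        n /= 1024
        idx += 1
    return f"{sign}{n:.2f}{UNITS[idx]}"


# Assumption: both earlier allocations in this thread were fully granted.
granted = to_bytes(256, "TiB") + to_bytes(512, "TiB")
remaining = to_bytes(5, "PiB") - granted

print(human(granted))    # 768.00TiB
print(human(remaining))  # 4.25PiB
```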

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| null | null | 512TiB | null | 138.75TiB |

Suyanj commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecv3dje6lrah3g2ui4maumy47sdbfv2dknkard6b4w2sx3z73iukg

Address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Datacap Allocated

1.00PiB

Signer Address

f1ihv7gz3vn3xqvikpt4rwryecgisl7745lodx3yi

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecv3dje6lrah3g2ui4maumy47sdbfv2dknkard6b4w2sx3z73iukg

AthSmith commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedvwujpwpl733zjonoxtz7osg3rnlumvpuye6xgsnjn3dhvhndc72

Address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Datacap Allocated

1.00PiB

Signer Address

f1vxbqrf7rfum3n6m5u6eb4re6xj7amvsaqnzu64y

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedvwujpwpl733zjonoxtz7osg3rnlumvpuye6xgsnjn3dhvhndc72

cryptowhizzard commented 1 year ago

Hi,

I have been checking your application and its SPs. None of them allow retrieval. Can you tell me why this is? This is against the rules of FIL+.

I am interested in CID baga6ea4seaqftvuvldjh4wttbvrhp76266fclh6bm77dvbzgpq7zdkntwljzccy, deal ID 35149980, and want to see the data you have been storing.

    lotus state get-deal 35149980
    {
      "Proposal": {
        "PieceCID": {
          "/": "baga6ea4seaqftvuvldjh4wttbvrhp76266fclh6bm77dvbzgpq7zdkntwljzccy"
        },
        "PieceSize": 34359738368,
        "VerifiedDeal": true,
        "Client": "f02130992",
        "Provider": "f02173949",
        "Label": "mAXCg5AIgKdIGewByY1YOxZfmL5bdhgDRkuefVWrjU5FAMm0LuhI",
        "StartEpoch": 2844943,
        "EndEpoch": 4385743,
        "StoragePricePerEpoch": "0",
        "ProviderCollateral": "9464109606315740",
        "ClientCollateral": "0"
      },
      "State": {
        "SectorStartEpoch": 2831194,
        "LastUpdatedEpoch": -1,
        "SlashEpoch": -1,
        "VerifiedClaim": 17464903
      }
    }
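
To make the deal record above easier to read, here is a small sketch that converts the piece size and the epoch range into familiar units. It assumes Filecoin's 30-second epoch (2880 epochs per day) and uses only a subset of the fields shown in the output; the `summarize` helper is illustrative and is not part of lotus.

```python
import json

EPOCH_SECONDS = 30                              # Filecoin mainnet epoch length
EPOCHS_PER_DAY = 24 * 60 * 60 // EPOCH_SECONDS  # 2880

# Subset of the fields from the `lotus state get-deal 35149980` output above.
deal_json = """
{
  "Proposal": {
    "PieceSize": 34359738368,
    "VerifiedDeal": true,
    "Provider": "f02173949",
    "StartEpoch": 2844943,
    "EndEpoch": 4385743
  }
}
"""


def summarize(deal: dict) -> dict:
    """Return the piece size in GiB and the deal duration in days."""
    p = deal["Proposal"]
    return {
        "provider": p["Provider"],
        "verified": p["VerifiedDeal"],
        "piece_size_gib": p["PieceSize"] / 2**30,
        "duration_days": (p["EndEpoch"] - p["StartEpoch"]) / EPOCHS_PER_DAY,
    }


summary = summarize(json.loads(deal_json))
print(summary["piece_size_gib"])  # 32.0
print(summary["duration_days"])   # 535.0
```

A 535-day term falls within the "1 to 1.5 years" storage period stated in the application.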

[Screenshot 2023-05-05 at 18:11:51] [Screenshot 2023-05-05 at 18:11:48]

Casey-PG commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 33.83% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.
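
As an illustration of the replication warning in this report, the sketch below computes the share of deals whose piece is held by fewer than four distinct storage providers. This is one plausible reading of the metric, not the checker's actual implementation: the real tool may weight deals by size. The sample deals are hypothetical.

```python
from collections import defaultdict


def under_replicated_share(deals, min_providers=4):
    """Percentage of deals whose piece CID is stored by fewer than min_providers SPs.

    deals is a list of (piece_cid, provider_id) pairs, one pair per deal.
    """
    providers_by_piece = defaultdict(set)
    for piece_cid, provider in deals:
        providers_by_piece[piece_cid].add(provider)
    flagged = sum(
        1 for piece_cid, _ in deals
        if len(providers_by_piece[piece_cid]) < min_providers
    )
    return 100 * flagged / len(deals)


# Hypothetical data: pieceA is stored by 4 SPs, pieceB by only 2.
sample = [
    ("pieceA", "sp1"), ("pieceA", "sp2"), ("pieceA", "sp3"), ("pieceA", "sp4"),
    ("pieceB", "sp1"), ("pieceB", "sp2"),
]
print(f"{under_replicated_share(sample):.2f}%")  # 33.33%
```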

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 4

Multisig Notary address

f02049625

Client address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

DataCap allocation requested

2PiB

Id

0ec9d2b2-4922-41ec-8b04-0d61f4b0d747

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Rule to calculate the allocation request amount

40% of total dc amount requested

DataCap allocation requested

2PiB

Total DataCap granted for client so far

931322574615478927360.0YiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

-1.12B

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| null | null | 1PiB | null | 265.75TiB |

Bennyyangpu commented 1 year ago

The report looks healthy. Please keep your request in accordance with the principles of the program and in line with their allocation strategy.

Bennyyangpu commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacectvb3sbq4pjyt63jiwc3l3xgiovzmbg625xk4zci6b3ecmpzdd5m

Address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Datacap Allocated

2.00PiB

Signer Address

f174fg3bqbln3zjnkxtyf6s54txqkr7yqkj6cig7y

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacectvb3sbq4pjyt63jiwc3l3xgiovzmbg625xk4zci6b3ecmpzdd5m

enochleon commented 1 year ago

@cryptowhizzard Hi notary, it seems that the SPs' network conditions have a big influence on retrieval. I will contact them.

TakiChain commented 1 year ago

In our opinion, retrieval does not need to be, and cannot be, completed successfully by every notary, because retrievals are limited by the quality of network links and by regional policies. This is the main reason why the community needs to select so many notaries across regions. For DPs, however, the selected storage nodes need to be "subjectively" open to retrieval from any region. All of us, and notaries especially, need to be aware of the subtle difference between these two things.

We suggest that notaries who currently cannot complete retrievals, or can complete them only partially, should not be anxious; they can refer to the CID reports before deciding whether to sign. Don't forget that you have the right to choose not to sign. The key indicators are continuously updated in the CID reports by P.L., which are the main reference material we use when signing.

TakiChain commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecfdhjmlauo2wwlifgq3bgsxqixy5bveaputbi3cclv53m5anny6y

Address

f14ckrd6rtf5suqk7vbrr567pvyarma6pwivscoea

Datacap Allocated

2.00PiB

Signer Address

f15impf3j2zcaex4lhyxndxswuuhv24vzstuqtxsi

Id

0ec9d2b2-4922-41ec-8b04-0d61f4b0d747

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecfdhjmlauo2wwlifgq3bgsxqixy5bveaputbi3cclv53m5anny6y

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

enochleon commented 1 year ago

@github-actions Thanks for the reminder.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

cryptowhizzard commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

cryptowhizzard commented 1 year ago

@Aifabot-Cloud @enochleon @TakiChain

Care to explain what is going on?

herrehesse commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

cryptowhizzard commented 1 year ago

Hi,

This application works solely with SPs that have retrieval disabled and/or are involved in CID sharing abuse.

[Screenshot 2023-07-31 at 13:25:07]

Wengeding commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

cryptowhizzard commented 1 year ago

All of these SPs are involved in CID sharing and do not support retrieval, which is against FIL+ rules and guidelines.

[Screenshot 2023-07-31 at 18:07:53]