filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Speedium - NIH NCBI Sequence Read Archive [3 / 27] #1553

Closed by cryptowhizzard 1 year ago

cryptowhizzard commented 1 year ago

Data Owner Name

NIH - National Institutes of Health

Data Owner Country/Region

United States

Data Owner Industry

Life Science / Healthcare

Website

https://www.nih.gov/

Social Media

https://www.facebook.com/nih.gov/

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

500TiB

On-chain address for first allocation

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Custom multisig

Identifier

No response

Share a brief history of your project and organization

Since its launch, the Filecoin network has become an important player in the decentralised storage space, offering a secure and transparent alternative to traditional data storage solutions.

We at Speedium / DCENT have been storing real and valuable datasets on the Filecoin network since Slingshot 2.6 and have been actively developing tools to improve the process. We are always on the lookout for new and useful client data to onboard.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

NIH NCBI Sequence Read Archive (SRA) on AWS
The Sequence Read Archive (SRA), produced by the [National Center for Biotechnology Information (NCBI)](https://www.ncbi.nlm.nih.gov/) at the [National Library of Medicine (NLM)](http://nlm.nih.gov/) at the [National Institutes of Health (NIH)](http://www.nih.gov/), stores raw DNA sequencing data and alignment information from high-throughput sequencing platforms.

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

IPFS, lotus, singularity, graphsplit

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://registry.opendata.aws/ncbi-sra/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

1.5 to 2 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America, South America, Europe, Australia (continent)

How will you be distributing your data to storage providers

HTTP or FTP server, IPFS, Shipping hard drives, Lotus built-in data transfer

How do you plan to choose storage providers

Slack, Big data exchange, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

| MinerID | City | Continent | Business/Entity |
| --- | --- | --- | --- |
| f01944347 | Oregon | USA | Jenny, Dabai |
| f01952350 | Oregon | USA | Jenny, Dabai |
| f01972364 | Oregon | USA | Jenny, Dabai |
| f01972376 | Oregon | USA | Jenny, Dabai |
| f02000937 | Chengdu | CN | MTY |
| f01915033 | Chengdu | CN | MTY |
| f0120**** | Melbourne | AU | HOLON |
| f0115**** | Melbourne | AU | HOLON |
| f01199430 | Heerhugowaard | EU | DCENT |
| f01786387 | Heerhugowaard | EU | DCENT |
| f01201327 | Heerhugowaard | EU | DCENT |
| f01937642 | Heerhugowaard | EU | DCENT |
| f0198**** | Dallas | USA | GREATERHEAT |
| f0188**** | Singapore | AS | GREATERHEAT |
| f01091851 | Omaha | USA | DLTx |
| f01736668 | Omaha | USA | DLTx |
| f01820744 | Omaha | USA | DLTx |
| f0855584 | Omaha | USA | DLTx |
| f01794610 | Omaha | USA | DLTx |
| f01838599 | Kansas City | USA | DLTx |
| f01845552 | Kansas City | USA | DLTx |

How do you plan to make deals to your storage providers

Boost client, Lotus client, Singularity

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

NiwanDao commented 1 year ago

How many copies do you intend to store with a single SP?

psh0691 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecovm3uhezuy6u77i3w3ekbhooynkarxq5o22cvy6xltydjcob7ci

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

1.33PiB

Signer Address

f1qdko4jg25vo35qmyvcrw4ak4fmuu3f5rif2kc7i

Id

b58f5325-9429-451e-ab0b-f174300b7826

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecovm3uhezuy6u77i3w3ekbhooynkarxq5o22cvy6xltydjcob7ci

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 6

Multisig Notary address

f02049625

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

DataCap allocation requested

8.07TiB

Id

e00d65af-6a83-4d52-a1f6-b2ce69f4a5df

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Last two approvers

psh0691 & not found

Rule to calculate the allocation request amount

800% of weekly dc amount requested

DataCap allocation requested

8.07TiB

Total DataCap granted for client so far

11.22PiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

-7004669722188840B
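The negative figure above is just the requested total minus the bytes already granted, so it can be checked by hand. The sketch below reproduces that arithmetic; the function name `remaining_datacap` is hypothetical and not part of the bot, and the granted byte count is back-derived from the two numbers reported in this comment rather than read from chain.

```python
PIB = 2**50  # bytes in one pebibyte


def remaining_datacap(total_requested_pib: int, granted_bytes: int) -> int:
    """Bytes still to be granted; negative once grants exceed the request."""
    return total_requested_pib * PIB - granted_bytes


# Assumption: the granted total is inferred from the reported remainder
# (-7004669722188840 B) and the 5 PiB requested in this application.
granted = 5 * PIB + 7004669722188840

print(round(granted / PIB, 2))        # granted total expressed in PiB
print(remaining_datacap(5, granted))  # remainder in bytes
```

A negative result means the total granted to the client address already exceeds the 5 PiB requested here, which matches the 11.22PiB reported as granted so far.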

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 357565 | 59 | 1.33PiB | 8.73 | 311.83TiB |

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 2 storage providers sealed too much duplicate data - f01208189: 21.63%, f01208803: 20.81%

Deal Data Replication

⚠️ 39.07% of deals are for data replicated across less than 4 storage providers.
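A replication figure of this kind can be approximated from a list of deals. The sketch below is an assumption about the metric, not the checker's actual implementation: it counts, by number of deals, how many belong to a piece stored with fewer than 4 distinct providers. The function name, the `(piece_cid, provider_id)` tuple layout, and the sample data are all hypothetical.

```python
from collections import defaultdict


def low_replication_share(deals, min_providers=4):
    """Percentage of deals whose piece is held by fewer than min_providers providers.

    deals: list of (piece_cid, provider_id) tuples (hypothetical layout).
    """
    if not deals:
        return 0.0
    # Collect the distinct providers that hold each piece.
    providers_by_piece = defaultdict(set)
    for piece_cid, provider_id in deals:
        providers_by_piece[piece_cid].add(provider_id)
    # Count deals that reference an under-replicated piece.
    low = sum(
        1
        for piece_cid, _ in deals
        if len(providers_by_piece[piece_cid]) < min_providers
    )
    return 100.0 * low / len(deals)


# Made-up sample: pieceA is on 4 providers, pieceB on only 2.
sample_deals = [
    ("pieceA", "f01"), ("pieceA", "f02"), ("pieceA", "f03"), ("pieceA", "f04"),
    ("pieceB", "f01"), ("pieceB", "f02"),
]
share = low_replication_share(sample_deals)
print(f"{share:.2f}% of deals are replicated across fewer than 4 providers")
```

Weighting by deal size instead of deal count would give a different number; the report text says "% of deals", so the sketch counts deals.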

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

xinaxu commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacechhczqjtjjafpgxwdjhsk74n5s4dt5cwptdtj2druvq7gtslicwo

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

1.33PiB

Signer Address

f1k3ysofkrrmqcot6fkx4wnezpczlltpirmrpsgui

Id

e00d65af-6a83-4d52-a1f6-b2ce69f4a5df

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacechhczqjtjjafpgxwdjhsk74n5s4dt5cwptdtj2druvq7gtslicwo

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 7

Multisig Notary address

f02049625

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

DataCap allocation requested

10.24GiB

Id

1eb12780-d805-43ac-aa60-06b3f4670d20

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Last two approvers

xinaxu & psh0691

Rule to calculate the allocation request amount

800% of weekly dc amount requested

DataCap allocation requested

10.24GiB

Total DataCap granted for client so far

12.55PiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

-8502116598289530B

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 393066 | 61 | 8.07TiB | 7.94 | 4.21GiB |

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 2 storage providers sealed too much duplicate data - f01208189: 20.70%, f01208803: 20.66%

Deal Data Replication

⚠️ 35.15% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

GhostByteInc commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacede722pwj6anlk4vfkxmouthcttgzha2mqfrruunzay2jen7orud6

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

10.24GiB

Signer Address

f1437jngablusaizqizc6sxp4r5llioq7mp2eqlii

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacede722pwj6anlk4vfkxmouthcttgzha2mqfrruunzay2jen7orud6

data-programs commented 1 year ago

KYC

This user’s identity has been verified through filplus.storage