filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] NDLABS - Life Sciences Dataset <3/4> #1523

Closed NDLABS-Leo closed 1 year ago

NDLABS-Leo commented 1 year ago

Data Owner Name

NDLABS

Data Owner Country/Region

Singapore

Data Owner Industry

Life Science / Healthcare

Website

https://www.ndlabs.io/#/

Social Media

Twitter: @imNDLABS
Slack: @NDLABS-OFFICE

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

500TiB

On-chain address for first allocation

f17nerghj2kmg7b4e6asft3xexga5qbzbqe3hi4gy

Custom multisig

Identifier

No response

Share a brief history of your project and organization

ND LABS has technical operation centers and nodes in Singapore, Hong Kong, the United States, and Dubai. Since the Filecoin mainnet launched in 2020, ND has provided technical services to partners to help them build out their storage services. At present, ND's accumulated storage power exceeds 300P globally. The largest node has 100P of storage power, and the node owns more than 1.4 million FIL.
ND LABS is positioned as a decentralized storage service provider for Web3. ND has long focused not only on building nodes for partners, but also on exploring how to provide better storage services for potential Web3 clients. Since October 2021, ND has been deeply involved in the Fil+ project, actively promoting it to partners who have real data storage needs. We also provide them with a complete set of solutions and technical services for storing data on the Filecoin network. The Singapore and US nodes are the main storage nodes and have provided real data storage for early customers.

Is this project associated with other projects/ecosystem stakeholders?

Yes

If answered yes, what are the other projects/ecosystem stakeholders

We will cooperate with more SPs from other regions.

Describe the data being stored onto Filecoin

For the first phase, we will store open Life Sciences datasets from AWS with a total size of 2 PiB.
The Life Sciences Open Datasets collection has 102 items in total, including but not limited to the following:
- Therapeutically Applicable Research to Generate Effective Treatments (TARGET)
- Gabriella Miller Kids First Pediatric Research Program (Kids First)
- Genome Aggregation Database (gnomAD)
- Allen Cell Imaging Collections
- International Neuroimaging Data-Sharing Initiative (INDI)
- Cell Organelle Segmentation in Electron Microscopy (COSEM) on AWS
- Distributed Archives for Neurophysiology Data Integration (DANDI)

Where was the data currently stored in this dataset sourced from

My Own Storage Infra

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

IPFS, lotus

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://ocg.cancer.gov/programs/target/
https://kidsfirstdrc.org/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

1 to 1.5 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America

How will you be distributing your data to storage providers

Cloud storage (i.e. S3), HTTP or FTP server, IPFS, Shipping hard drives, Lotus built-in data transfer

How do you plan to choose storage providers

Slack, Filmine, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

We are a member of the Singapore SPWG and have also joined other SPWGs. SPs from the SPWG are trustworthy and experienced.
We are also encouraging more small SPs to join the Filecoin network, and we will provide technical support to small SPs that newly join the network.

How do you plan to make deals to your storage providers

Lotus client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f17nerghj2kmg7b4e6asft3xexga5qbzbqe3hi4gy

Rule to calculate the allocation request amount

400% of weekly dc amount requested

DataCap allocation requested

1.95PiB
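
The requested amount follows from the rule above: 400% of the 500TiB weekly allocation is 2000TiB, which is about 1.95PiB. A minimal sketch of that arithmetic, assuming binary units (1 PiB = 1024 TiB); the function name is hypothetical and is not taken from the bot's code:

```python
# Binary units: 1 PiB = 1024 TiB (assumption for this sketch).
TIB_PER_PIB = 1024


def allocation_request_pib(weekly_tib: float, rule_percent: float) -> float:
    """Return the tranche size in PiB for a rule given as a percentage of the weekly amount."""
    tranche_tib = weekly_tib * rule_percent / 100
    return tranche_tib / TIB_PER_PIB


# Values from this application: 500TiB weekly, 400% rule.
tranche = allocation_request_pib(weekly_tib=500, rule_percent=400)
print(f"{tranche:.2f}PiB")
```

With the values in this application, the result rounds to the 1.95PiB shown in the request.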

Total DataCap granted for client so far

909494701772928712704.0YiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

-1.09B

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 40808 | 14 | 1000.0TiB | 15.68 | 224.75TiB |

psh0691 commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacedjwse2bchx65a2ybmoobp222inwtfz7gfne7jnaad2hxgitv7opg

Address

f17nerghj2kmg7b4e6asft3xexga5qbzbqe3hi4gy

Datacap Allocated

1.95PiB

Signer Address

f1qdko4jg25vo35qmyvcrw4ak4fmuu3f5rif2kc7i

Id

962733a6-2bf3-4c76-b929-09d8332f414a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedjwse2bchx65a2ybmoobp222inwtfz7gfne7jnaad2hxgitv7opg

NDLABS-Leo commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 1 storage providers have unknown IP location - f02148382

Deal Data Replication

⚠️ 82.25% of deals are for data replicated across less than 4 storage providers.
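
For readers unfamiliar with this metric, here is a rough sketch of how a replication share like the one above could be computed. It is an illustration only: the checker's actual method is not shown in this thread, and the function name, the `(piece_cid, provider_id)` input shape, and the counting of deals by number rather than by size are all assumptions.

```python
from collections import defaultdict


def under_replicated_share(deals, min_providers=4):
    """Percent of deals whose piece is stored with fewer than min_providers distinct providers.

    deals: iterable of (piece_cid, provider_id) pairs (hypothetical input shape).
    """
    deals = list(deals)
    if not deals:
        return 0.0

    # Collect the distinct providers holding each piece.
    providers_by_piece = defaultdict(set)
    for piece, provider in deals:
        providers_by_piece[piece].add(provider)

    # Count deals whose piece falls short of the replication threshold.
    low = sum(1 for piece, _ in deals if len(providers_by_piece[piece]) < min_providers)
    return 100 * low / len(deals)


# Made-up sample: piece "a" is on 4 providers, piece "b" on only 2.
sample = [("a", "p1"), ("a", "p2"), ("a", "p3"), ("a", "p4"),
          ("b", "p1"), ("b", "p2"), ("b", "p1"), ("b", "p2")]
print(under_replicated_share(sample))
```

In the made-up sample, half of the deals belong to a piece held by fewer than 4 providers.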

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

NDLABS-Leo commented 1 year ago

We have contacted the f02148382 node about opening its IP. In addition, the data replication figure is a result of there being many nodes whose onboarding times are inconsistent, which is normal. The data sharing has been explained above and has not recurred since. I hope to get the notaries' support.

AlanGreaterheat commented 1 year ago

The notary's check came back normal, and the notary is willing to support.

AlanGreaterheat commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacec3badnhgypoas5qm2yor7qm3w67rp475yzj3ptvoxafnzayjuw74

Address

f17nerghj2kmg7b4e6asft3xexga5qbzbqe3hi4gy

Datacap Allocated

1.95PiB

Signer Address

f1pnmzlxj7cfeo2v6oj5nco46hkg2l46wj7o4xxui

Id

962733a6-2bf3-4c76-b929-09d8332f414a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacec3badnhgypoas5qm2yor7qm3w67rp475yzj3ptvoxafnzayjuw74

kevzak commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 91.56% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.

data-programs commented 1 year ago

KYC

This user’s identity has been verified through filplus.storage