filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] NDLABS - Life Sciences Dataset <1/4> #1521

Closed NDLABS-Leo closed 1 year ago

NDLABS-Leo commented 1 year ago

Data Owner Name

NDLABS

Data Owner Country/Region

Singapore

Data Owner Industry

Life Science / Healthcare

Website

https://www.ndlabs.io/#/

Social Media

Twitter: @imNDLABS
Slack: @NDLABS-OFFICE

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

500TiB

On-chain address for first allocation

f1q6rat3ob7c56ea4hghcj7fkgmjvmqo2lammjxvy

Custom multisig

Identifier

No response

Share a brief history of your project and organization

ND LABS has technical operation centers and nodes in Singapore, Hong Kong, the United States, and Dubai. Since the Filecoin mainnet launched in 2020, ND has provided technical services to partners to help them build out storage services. ND's accumulated storage power currently exceeds 300P globally. The largest node has 100P of storage power, and node holdings exceed 1.4 million FIL.
ND LABS is positioned as a decentralized storage service provider for Web3. ND not only builds nodes for partners but also explores how to provide better storage services for potential Web3 clients. Since October 2021, ND has been deeply involved in the Fil+ project, promoting it to partners who have real data storage needs. We also provide them with a complete set of solutions and technical services for storing data on the Filecoin network. The Singapore and US nodes are the main storage nodes, which provide real data storage for early customers.

Is this project associated with other projects/ecosystem stakeholders?

Yes

If answered yes, what are the other projects/ecosystem stakeholders

We will cooperate with more SPs from other regions.

Describe the data being stored onto Filecoin

For the first phase, we will store open Life Sciences datasets from AWS with a total size of 2 PiB.
The Life Sciences Open Datasets collection has 102 items in total, including but not limited to the following:
- Therapeutically Applicable Research to Generate Effective Treatments (TARGET)
- Gabriella Miller Kids First Pediatric Research Program (Kids First)
- Genome Aggregation Database (gnomAD)
- Allen Cell Imaging Collections
- International Neuroimaging Data-Sharing Initiative (INDI)
- Cell Organelle Segmentation in Electron Microscopy (COSEM) on AWS
- Distributed Archives for Neurophysiology Data Integration (DANDI)

Where was the data currently stored in this dataset sourced from

My Own Storage Infra

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

IPFS, lotus

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://ocg.cancer.gov/programs/target/
https://kidsfirstdrc.org/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

years

For how long do you plan to keep this dataset stored on Filecoin

1 to 1.5 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America

How will you be distributing your data to storage providers

Cloud storage (i.e. S3), HTTP or FTP server, IPFS, Shipping hard drives, Lotus built-in data transfer

How do you plan to choose storage providers

Slack, Filmine, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

We are a member of the Singapore SPWG and have also joined other SPWGs. SPs from the SPWG are trustworthy and experienced.
We are also encouraging more small SPs to join the Filecoin network, and we will provide technical support to small SPs that newly join.

How do you plan to make deals to your storage providers

Lotus client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

NDLABS-Leo commented 1 year ago

FYI, regarding the data replication shown in the checker report: please look at the final result rather than each individual allocation. The final result will be reasonable, because even though ND has sent backups to several SPs, each SP has its own onboarding plan, which ND cannot control.

herrehesse commented 1 year ago

@NDLABS-OFFICE - 5TB on 500TB is not an issue for me. What is an issue is the VPN use by most of your SPs. Can you give me a clear list with the business names and locations of the SPs you work with?

Until this is cleared up, I suggest not signing this application or its follow-ups.

NDLABS-Leo commented 1 year ago

@herrehesse Please check again. Our partner SPs are sure that no VPN has been used; the location shown by the checker bot is the actual location of the data center. May I ask on what basis this software identifies a VPN?

herrehesse commented 1 year ago

@NDLABS-OFFICE You are talking about the Los Angeles miner or the HK ones?

NDLABS-Leo commented 1 year ago

@herrehesse (image) Like these: the report shows a 75% probability, but these are 100% not VPNs.

NDLABS-Leo commented 1 year ago

Hey @herrehesse, I just got in touch with Hidde on Slack, and with his help I understand what you are saying. I got 100% confirmation from those SPs that they are not using a VPN. As everyone knows, due to policy issues within Mainland China, most Chinese SPs migrated their CDN to another country and used the same ISP. Regarding the disclosure of node ownership, I am afraid it cannot be fully disclosed. As the lead SP, I have a responsibility to protect others' information. If these SPs are reported for participating in mining, they face legal risk. Also, more SPs will collaborate with ND in the next allocation.

Tom-OriginStorage commented 1 year ago

The package is relatively healthy and supports retrieval. I am willing to support him

Tom-OriginStorage commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebyvaba3vs5hrrebbhznsyjxq32lob5ncu2e3wejsdyqihrex6owm

Address

f1q6rat3ob7c56ea4hghcj7fkgmjvmqo2lammjxvy

Datacap Allocated

1000.00TiB

Signer Address

f1q6bpjlqia6iemqbrdaxr2uehrhpvoju3qh4lpga

Id

fb23b6f6-0776-4834-b2ea-a1659bbb4c57

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebyvaba3vs5hrrebbhznsyjxq32lob5ncu2e3wejsdyqihrex6owm

YuanHeHK commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedi7p57tvcsea4nlkm7xaw34oywnivmd7co3cbiax27b4nra2pyys

Address

f1q6rat3ob7c56ea4hghcj7fkgmjvmqo2lammjxvy

Datacap Allocated

1000.00TiB

Signer Address

f1fg6jkxsr3twfnyhdlatmq36xca6sshptscds7xa

Id

fb23b6f6-0776-4834-b2ea-a1659bbb4c57

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedi7p57tvcsea4nlkm7xaw34oywnivmd7co3cbiax27b4nra2pyys

herrehesse commented 1 year ago

@Tom-OriginStorage As stated during multiple governance calls, you were asked not to sign any applications and to first do your due diligence on 50 new applications. I am surprised to see you signing this application and would kindly suggest that you revert your signature.

Due diligence is not what you performed and your statement "The package is relatively healthy and supports retrieval. I am willing to support him" is not sufficient.

For transparency tagging: @cryptowhizzard @raghavrmadya @dkkapur @galen-mcandrew @Sunnyiscoming

Tom-OriginStorage commented 1 year ago

@herrehesse For #811, Origin Storage has given a sufficient explanation, which has been accepted by everyone. We have confirmed that we can carry out new signatures. Origin Storage has the obligation and responsibility to take part in notary duties. As we said before regarding the compliance and inspection of LDNs, Origin Storage will continue to conduct inspections according to the rules. Origin Storage is also compiling some inspection methods and tools, which we will share for everyone to use later.

herrehesse commented 1 year ago

So you can confirm for yourself that "We can carry out new signatures."? That seems highly inappropriate. You indeed had the "obligation and responsibility" to act as a notary, which you abused to an extreme degree.

"Origin Storage will continue to conduct inspections according to the rules": do so for 50 applications without signing, to prove to the community that you are not bad actors.

Prove this, then sign. Trust has to be earned, not given. Unfortunately @YuanHeHK has signed, so you cannot remove your signature anymore.

NDLABS-Leo commented 1 year ago

@herrehesse @Tom-OriginStorage Let's conclude at the notary meeting.

NDLABS-Leo commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 78.78% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.
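The thread does not show how the checker arrives at its replication figure. As a rough illustration only, the sketch below assumes the percentage is counted per deal (not weighted by deal size) and that each deal is a hypothetical `(piece_cid, provider_id)` pair; the function name, the sample data, and the threshold parameter are all assumptions, not the bot's actual implementation.

```python
from collections import defaultdict


def low_replication_share(deals, min_providers=4):
    """Percentage of deals whose piece is stored with fewer than
    `min_providers` distinct storage providers.

    `deals` is a list of (piece_cid, provider_id) tuples (assumed shape).
    """
    if not deals:
        return 0.0

    # Collect the distinct providers that hold each piece.
    providers_by_piece = defaultdict(set)
    for piece_cid, provider_id in deals:
        providers_by_piece[piece_cid].add(provider_id)

    # Count the deals that belong to under-replicated pieces.
    low = sum(
        1
        for piece_cid, _ in deals
        if len(providers_by_piece[piece_cid]) < min_providers
    )
    return 100.0 * low / len(deals)


# Hypothetical sample: pieceA is on 4 providers, pieceB on 1, pieceC on 2.
sample_deals = [
    ("pieceA", "f0100"), ("pieceA", "f0101"),
    ("pieceA", "f0102"), ("pieceA", "f0103"),
    ("pieceB", "f0100"),
    ("pieceC", "f0100"), ("pieceC", "f0101"),
]
print(f"{low_replication_share(sample_deals):.2f}%")
```

Under this counting, the figure stays high while some providers are still sealing and drops as further replicas of the same pieces land with more providers.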

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 4

Multisig Notary address

f01858410

Client address

f1q6rat3ob7c56ea4hghcj7fkgmjvmqo2lammjxvy

DataCap allocation requested

1.95PiB

Id

da1248d9-0e33-4510-a8b7-67efb110c50a

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 79.39% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the full report.

xiaoyuaiheshui commented 1 year ago

Data from this SP cannot be retrieved. The rest looks good.

NDLABS-Leo commented 1 year ago

@xiaoyuaiheshui It may be a network problem; this node supports retrieval.

NDLABS-Leo commented 1 year ago

@xiaoyuaiheshui you can try again.

lotus client retrieval-ask f01985611 bafykbzaceb4h6cqr7dv4tx4iwy6n7uocul52jp7gz7okwz6iggrrkh5tdidk6
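Since a transient network problem is suspected, the check above could be retried a few times before concluding that the node is unreachable. The following is a minimal sketch that wraps the same `lotus client retrieval-ask` command; the function name, retry count, delay, and the injectable `run` parameter are illustrative assumptions, and nothing is assumed about the command's output format.

```python
import subprocess
import time


def retrieval_ask(provider, data_cid, attempts=3, delay_s=5.0, run=subprocess.run):
    """Run `lotus client retrieval-ask <provider> <data_cid>`, retrying on failure.

    Returns the command's stdout on the first successful attempt, or None
    if every attempt exits with a non-zero status.
    """
    cmd = ["lotus", "client", "retrieval-ask", provider, data_cid]
    for attempt in range(1, attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout
        if attempt < attempts:
            # Wait before retrying; the failure may be a network hiccup.
            time.sleep(delay_s)
    return None


# Example call (requires a running lotus node, so it is left commented out):
# print(retrieval_ask(
#     "f01985611",
#     "bafykbzaceb4h6cqr7dv4tx4iwy6n7uocul52jp7gz7okwz6iggrrkh5tdidk6",
# ))
```

The `run` parameter exists only so the retry logic can be exercised without a lotus installation.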

xiaoyuaiheshui commented 1 year ago

Ok, I'll try again. Come back later.

psh0691 commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebvfa7a7drmyh3aiz5be33ihzhkgem7fknndiea5tmgv3q7ruzasa

Address

f1q6rat3ob7c56ea4hghcj7fkgmjvmqo2lammjxvy

Datacap Allocated

1.95PiB

Signer Address

f1qdko4jg25vo35qmyvcrw4ak4fmuu3f5rif2kc7i

Id

da1248d9-0e33-4510-a8b7-67efb110c50a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebvfa7a7drmyh3aiz5be33ihzhkgem7fknndiea5tmgv3q7ruzasa

xiaoyuaiheshui commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedhcslfxsqrjwsn65vcy63calf7erdl3qipicc3gulx7cyzbs3nn6

Address

f1q6rat3ob7c56ea4hghcj7fkgmjvmqo2lammjxvy

Datacap Allocated

1.95PiB

Signer Address

f122qmy25wdtt5mxd77kndiq7z5x2n3iwiuz2wdsa

Id

da1248d9-0e33-4510-a8b7-67efb110c50a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedhcslfxsqrjwsn65vcy63calf7erdl3qipicc3gulx7cyzbs3nn6

NDLABS-Leo commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 82.59% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the full report.

NDLABS-Leo commented 1 year ago

Hi community members and notaries. FYI, a clarification on data replication: the SPs we cooperate with are located in different regions, and their onboarding speeds differ. The data replication report will return to normal once they finish sealing all the data.

During our regular check, we found CID sharing between some LDNs that we had not been aware of. We have identified the SPs at fault (f02000936 & f02006374) and required them to stop onboarding data and to provide a solution for their mistake. Until then, we will not send deals to them. ND has always followed the community's norms of openness and transparency. It is difficult for us to fully guard against technical issues caused by other SPs. We kindly ask the notaries to support our storage project.
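The kind of regular check described above can be sketched as follows. This is an illustration under stated assumptions, not ND's or the checker bot's actual tooling: each deal is assumed to be a `(client, provider_id, piece_cid)` tuple, and the client and provider names in the sample are placeholders.

```python
from collections import defaultdict


def shared_pieces(deals):
    """Find piece CIDs that appear in deals from more than one client.

    `deals` is a list of (client, provider_id, piece_cid) tuples (assumed
    shape). Returns {piece_cid: {"clients": [...], "providers": [...]}}.
    """
    clients = defaultdict(set)
    providers = defaultdict(set)
    for client, provider_id, piece_cid in deals:
        clients[piece_cid].add(client)
        providers[piece_cid].add(provider_id)

    # Keep only the pieces that more than one client has stored.
    return {
        cid: {
            "clients": sorted(clients[cid]),
            "providers": sorted(providers[cid]),
        }
        for cid in clients
        if len(clients[cid]) > 1
    }


# Hypothetical sample: piece2 was stored by two different clients.
sample_deals = [
    ("client-A", "sp-1", "piece1"),
    ("client-A", "sp-2", "piece2"),
    ("client-B", "sp-2", "piece2"),
    ("client-B", "sp-3", "piece3"),
]
for cid, info in shared_pieces(sample_deals).items():
    print(cid, info["clients"], info["providers"])
```

Listing the providers alongside each shared CID is what lets a client narrow a sharing warning down to specific SPs.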

cryptowhizzard commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaced6i2uhyyauuctsh5isjsn6rw5i7btr6jof7coatzocj3o6pdodqa

Address

f1q6rat3ob7c56ea4hghcj7fkgmjvmqo2lammjxvy

Datacap Allocated

1.95PiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

da1248d9-0e33-4510-a8b7-67efb110c50a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaced6i2uhyyauuctsh5isjsn6rw5i7btr6jof7coatzocj3o6pdodqa

cryptowhizzard commented 1 year ago

Hi community members and notaries. FYI, a clarification on data replication: the SPs we cooperate with are located in different regions, and their onboarding speeds differ. The data replication report will return to normal once they finish sealing all the data.

During our regular check, we found CID sharing between some LDNs that we had not been aware of. We have identified the SPs at fault (f02000936 & f02006374) and required them to stop onboarding data and to provide a solution for their mistake. Until then, we will not send deals to them. ND has always followed the community's norms of openness and transparency. It is difficult for us to fully guard against technical issues caused by other SPs. We kindly ask the notaries to support our storage project.

Looking forward to the results.

liyunzhi-666 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 79.27% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the full report.

liyunzhi-666 commented 1 year ago

Based on the CID checker and the responses in the comments, I support this round and look forward to changes in subsequent allocation plans.

liyunzhi-666 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebprbmntkk7zkdd3x5i6mbg5s2lepbyivz3j2qqy4z355pp72l2to

Address

f1q6rat3ob7c56ea4hghcj7fkgmjvmqo2lammjxvy

Datacap Allocated

1.95PiB

Signer Address

f1pszcrsciyixyuxxukkvtazcokexbn54amf7gvoq

Id

da1248d9-0e33-4510-a8b7-67efb110c50a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebprbmntkk7zkdd3x5i6mbg5s2lepbyivz3j2qqy4z355pp72l2to

data-programs commented 1 year ago
KYC

This user’s identity has been verified through filplus.storage