filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] IINDA - medicine #916

Closed JackRipper1888 closed 1 year ago

JackRipper1888 commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

IINDA (Chengdu) Technology Co., Ltd., established in May 2015, is an innovative technology company dedicated to Internet medical imaging and to providing industry-leading technology solutions. Since its founding, with the entrepreneurial mission of "bringing high-quality imaging diagnosis to patients in remote third- and fourth-tier areas", it has served more than 600 medical institutions and customers through its technology research and development and its service capabilities.

The company independently develops and provides a series of high-quality medical imaging software products and services, such as an imaging cloud integrated machine, a hospital-level PACS system, electronic film, the ipacs imaging cloud platform, and the IMDT multidisciplinary consultation platform. It has passed the Sichuan Software Industry Association's "double soft" certification (software product and software enterprise).

With the rapid development of the medical industry, Yingda Technology has expanded from providing a medical imaging cloud platform, electronic film, cloud archiving, in-hospital PACS, and other products into Web 3.0 research in the field of blockchain medical infrastructure.

The company's main products:
(i) Electronic film solution: Electronic film, also called cloud film, is the original medical image stored in cloud space or on server disk space. It is not a JPG or JPEG file but a high-precision rendering of the complete, full-pixel DICOM original that retains all sequences.
(ii) Medical Consortium Internet Image Cloud Comprehensive Solution: A comprehensive medical imaging solution for various forms of medical consortia, based on HTML5 full-pixel rendering technology to ensure diagnostic-level accuracy on any platform. Built on a B/S architecture, it seamlessly integrates image resources with cloud HIS and Internet systems to maximize sharing.
(iv) Hospital-level PACS comprehensive solution: a new hospital-level comprehensive medical imaging solution with Internet attributes, which allows flexible access to images in hospitals and clinical departments, and integrates hospital resources through the medical imaging Internet mobile office solution.

What is the primary source of funding for this project?

Company funds

What other projects/ecosystem stakeholders is this project associated with?

None

Use-case details

Describe the data being stored onto Filecoin

Medical imaging data and archived data without private information

Where was the data in this dataset sourced from?

Company data

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

https://github.com/filecoin-project/filecoin-plus-large-datasets/files/9149500/iinda.zip

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, the data we store is public data without private information.

What is the expected retrieval frequency for this data?

1 - 2 times a year

For how long do you plan to keep this dataset stored on Filecoin?

540 days

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

Hongkong

How will you be distributing your data to storage providers? Is there an offline data transfer process?

offline transfer

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We would like to choose storage providers in Hong Kong. Data retrievability is a requirement when we choose a storage provider.

How will you be distributing deals across storage providers?

For data security, we will store multiple copies of data to different storage service providers.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes
large-datacap-requests[bot] commented 2 years ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!
large-datacap-requests[bot] commented 2 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

raghavrmadya commented 2 years ago

@JackRipper1888 , what is your relationship with the client?

JackRipper1888 commented 2 years ago

Thanks for your question. I am applying on behalf of the organization, and we would like to join Filecoin to reduce our costs. If needed, I can send the certification information from the company's official domain mailbox.

JackRipper1888 commented 1 year ago

We have sent a verification email to filplus@fil.org. We hope to get feedback; we want to join Filecoin. Thank you @raghavrmadya

Sunnyiscoming commented 1 year ago

Can you explain your data composition? Please also provide more samples of the data you will store, to prove that you already have 5 PB. How many copies will you store? Can you provide more detailed information about the storage providers participating in this program, such as a list of the SPs you have contacted so far? Could you send an email to filplus-app-review@fil.org from your official domain to confirm your identity? The email subject should include the issue ID #916.

Sunnyiscoming commented 1 year ago

Any update here?

JackRipper1888 commented 1 year ago

Hello, the data consists of desensitized medical image data and desensitized archival data. We store data with multiple storage providers and keep 4-5 backups to ensure reliable storage. We are also preparing new sample data, which will be published later. We have sent a verification email.

JackRipper1888 commented 1 year ago

@Sunnyiscoming @raghavrmadya We have added sample data: https://drive.google.com/drive/folders/1P6sPZOPMh3EiKzpEDExqraR04Cv-8Q0W. Hope to make progress, thanks

JackRipper1888 commented 1 year ago

@raghavrmadya @Sunnyiscoming Hi, is there an update? looking forward to being approved, thanks

Sunnyiscoming commented 1 year ago

@JackRipper1888 Sorry for the delay. Email received. Can you provide more detailed information about the storage providers participating in this program, such as a list of the SPs you have contacted so far?

JackRipper1888 commented 1 year ago

@Sunnyiscoming Hi, thanks for your reply. So far we have only communicated with f01851482, f01852325, f01915287, and f01907545. We are still looking for SPs, and our allocation strategy will fully meet the requirements of the community. Thanks again, looking forward to progress!

simonkim0515 commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

400TiB

Client address

f1s72ymkzujzb5hnxfaqkgekn37nbisu4pc26eh3i

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f01858410

Client address

f1s72ymkzujzb5hnxfaqkgekn37nbisu4pc26eh3i

DataCap allocation requested

200TiB

Id

e612bb9f-86df-49a6-98d2-04a4ecef1e87

kernelogic commented 1 year ago

@JackRipper1888 Which regions are the SPs you listed in? And can you specify how you meet the requirements of the community?

JackRipper1888 commented 1 year ago

@kernelogic Hello, we plan to store data in Hong Kong, Singapore, Korea, etc. We have contacted the following SPs: f01852023, f01964002, f01852664, f01967473, f01851482, and we will cooperate with them further. For data security and in line with community rules, we will distribute our data evenly across multiple SPs. Thanks.

kernelogic commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceaprtymgdgmn4wnhzsax2kmlgbq2ewtxbfg7fpsuazw5egbr4lpmw

Address

f1s72ymkzujzb5hnxfaqkgekn37nbisu4pc26eh3i

Datacap Allocated

200.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

e612bb9f-86df-49a6-98d2-04a4ecef1e87

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceaprtymgdgmn4wnhzsax2kmlgbq2ewtxbfg7fpsuazw5egbr4lpmw

Tom-OriginStorage commented 1 year ago

It looks good; I will sign it.

Tom-OriginStorage commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacededesv6uyrgsstptbtomoj6jffcoz35f6a52a4yrga2nt5v7uzli

Address

f1s72ymkzujzb5hnxfaqkgekn37nbisu4pc26eh3i

Datacap Allocated

200.00TiB

Signer Address

f1q6bpjlqia6iemqbrdaxr2uehrhpvoju3qh4lpga

Id

e612bb9f-86df-49a6-98d2-04a4ecef1e87

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacededesv6uyrgsstptbtomoj6jffcoz35f6a52a4yrga2nt5v7uzli

Sunnyiscoming commented 1 year ago

Is there any problem with using datacap?

JackRipper1888 commented 1 year ago

We are preparing data, and communicating with SPs, I think it will start soon, thank you for your attention @Sunnyiscoming

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f01858410

Client address

f1s72ymkzujzb5hnxfaqkgekn37nbisu4pc26eh3i

DataCap allocation requested

400TiB

Id

e797fcac-eebd-4507-b203-4650e7641790

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1s72ymkzujzb5hnxfaqkgekn37nbisu4pc26eh3i

Last two approvers

llifezou & kernelogic

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

400TiB

Total DataCap granted for client so far

200TiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

4.80PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
|---|---|---|---|---|
| 400 | 5 | 200TiB | 22.5 | 5.71TiB |
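
The amounts quoted in this bot comment can be cross-checked with a little arithmetic. The following is only a sketch: it assumes binary units (1 PiB = 1024 TiB), and the function names `remaining_pib` and `next_allocation_tib` are illustrative, not part of any Fil+ tooling. Only the "100% of weekly dc amount requested" rule stated above is modelled.

```python
TIB_PER_PIB = 1024  # assumption: binary units, 1 PiB = 1024 TiB


def remaining_pib(total_requested_pib: float, granted_tib: float) -> float:
    """DataCap still to be granted to reach the requested total, in PiB."""
    return (total_requested_pib * TIB_PER_PIB - granted_tib) / TIB_PER_PIB


def next_allocation_tib(weekly_rate_tib: float, weekly_fraction: float = 1.0) -> float:
    """Allocation size as a fraction of the expected weekly usage rate."""
    return weekly_rate_tib * weekly_fraction


if __name__ == "__main__":
    # 5PiB requested in total, 200TiB granted so far, 400TiB expected per week.
    print(f"{remaining_pib(5, 200):.2f}PiB still to be granted")
    print(f"{next_allocation_tib(400):.0f}TiB requested in this round")
```

Under these assumptions the result agrees with the 4.80PiB and 400TiB figures reported above.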
filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
|---|---|---|---|---|---|
| f02006841 | Petaling Jaya, Selangor, MY (Celcom Axiata Berhad) | 24.00 TiB | 22.17% | 24.00 TiB | 0.00% |
| f01914268 | Frankfurt am Main, Hesse, DE (QUANTIL NETWORKS INC) | 22.78 TiB | 21.05% | 22.78 TiB | 0.00% |
| f01896036 | New York City, New York, US (QUANTIL NETWORKS INC) | 17.56 TiB | 16.22% | 17.56 TiB | 0.00% |
| f02024315 | Bangkok, Bangkok, TH (SBN-IIG/AWN-IIG transit provider) | 24.41 TiB | 22.55% | 24.41 TiB | 0.00% |
| f02013434 | Hanoi, Hanoi, VN (VIETNAM POSTS AND TELECOMMUNICATIONS GROUP) | 19.50 TiB | 18.01% | 19.50 TiB | 0.00% |
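
The percentage column above appears to be each provider's share of the total sealed data. A minimal sketch of that calculation follows; it is not the checker bot's actual code, and `provider_shares` is a hypothetical helper. Because the inputs are the rounded TiB values from the report, a recomputed share can differ from the reported one in the last digit.

```python
# Sealed deal sizes in TiB, copied from the report above.
sealed_tib = {
    "f02006841": 24.00,
    "f01914268": 22.78,
    "f01896036": 17.56,
    "f02024315": 24.41,
    "f02013434": 19.50,
}


def provider_shares(sealed: dict[str, float]) -> dict[str, float]:
    """Return each provider's share of the total sealed data, in percent."""
    total = sum(sealed.values())
    return {provider: 100 * size / total for provider, size in sealed.items()}


shares = provider_shares(sealed_tib)
for provider, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{provider}: {pct:.2f}%")
```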

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Data replication looks healthy.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
|---|---|---|---|
| 1.41 TiB | 1.41 TiB | 1 | 1.30% |
| 1.09 TiB | 2.19 TiB | 2 | 2.02% |
| 2.19 TiB | 6.56 TiB | 3 | 6.06% |
| 6.44 TiB | 25.75 TiB | 4 | 23.79% |
| 14.47 TiB | 72.34 TiB | 5 | 66.83% |
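
The replication table can be read as: total deals made ≈ unique data size × number of providers, and deal percentage = that total divided by the sum over all rows. The sketch below recomputes both from the rounded values in the report; `replication_summary` is an illustrative name, not the checker's API, and small last-digit differences from the reported figures are expected because of rounding.

```python
# (unique data size in TiB, number of providers holding a copy), from the report above.
rows = [
    (1.41, 1),
    (1.09, 2),
    (2.19, 3),
    (6.44, 4),
    (14.47, 5),
]


def replication_summary(rows: list[tuple[float, int]]) -> list[tuple[float, int, float, float]]:
    """Return (unique TiB, providers, total deal TiB, percent of all deals) per row."""
    deal_totals = [size * providers for size, providers in rows]
    grand_total = sum(deal_totals)
    return [
        (size, providers, total, 100 * total / grand_total)
        for (size, providers), total in zip(rows, deal_totals)
    ]


summary = replication_summary(rows)
for size, providers, total, pct in summary:
    print(f"{size:.2f} TiB x {providers} providers = {total:.2f} TiB ({pct:.2f}%)")
```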

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data and should not resolve to the same CID.

However, this could be possible if all below clients use same software to prepare for the exact same dataset or they belong to a series of LDN applications for the same dataset.

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

1ane-1 commented 1 year ago

CID report looks good, I will support it.

cryptowhizzard commented 1 year ago

> CID report looks good, I will support it.

Retrieval?

1ane-1 commented 1 year ago

Canceled Request

The following request has been canceled by the notary, thus should not be considered as valid anymore.

Message sent to Filecoin Network

bafy2bzaceckgys7qumkjiwl4mevze5ysqu6zyeenbok6ogujkn7tj3cikez3o

Address

f1s72ymkzujzb5hnxfaqkgekn37nbisu4pc26eh3i

Datacap Allocated

400.00TiB

Signer Address

f1mdk7s2vntzm6hu35yuo6vjubtrpfnb2awhgvrri

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceckgys7qumkjiwl4mevze5ysqu6zyeenbok6ogujkn7tj3cikez3o

cryptowhizzard commented 1 year ago

@1ane-1

Did you test retrieval?

cryptowhizzard commented 1 year ago

@1ane-1

I tested retrieval for you. These SPs don't support it.

Feb 7 07:35:04 proposals dealscanner-f01980914-f01970716: Recv 0 B, Paid 0 FIL, Open (New), 0s [1675554614710751788|0]
Feb 7 07:35:04 proposals dealscanner-f01980914-f01970716: Recv 0 B, Paid 0 FIL, DealProposed (WaitForAcceptance), 3ms [1675554614710751788|0]
Feb 7 07:35:04 proposals dealscanner-f01980914-f01128320: Recv 0 B, Paid 0 FIL, Open (New), 0s [1675554614710751789|0]
Feb 7 07:35:04 proposals dealscanner-f01980914-f01128320: Recv 0 B, Paid 0 FIL, DealProposed (WaitForAcceptance), 3ms [1675554614710751789|0]
Feb 7 07:35:05 proposals dealscanner-f01980914-f01970716: Recv 0 B, Paid 0 FIL, DealAccepted (Accepted), 632ms [1675554614710751788|0]
Feb 7 07:35:05 proposals dealscanner-f01980914-f01970716: Recv 0 B, Paid 0 FIL, PaymentChannelSkip (Ongoing), 633ms [1675554614710751788|0]
Feb 7 07:35:07 proposals dealscanner-f01980914-f01970716: Recv 0 B, Paid 0 FIL, ProviderCancelled (Cancelling), 3.142s [1675554614710751788|0]
Feb 7 07:35:07 proposals dealscanner-f01980914-f01970716: Recv 0 B, Paid 0 FIL, CancelComplete (Cancelled), 3.143s [1675554614710751788|0]
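
A log like the one pasted above is easier to read once reduced to the last recorded state per scanner job. The following is a rough sketch that assumes the line layout seen in this paste (`dealscanner-<ids>: Recv ..., Paid ..., Event (Status), elapsed [id|n]`); the regular expression and the `last_status` helper are illustrative and not part of lotus or of the script used here.

```python
import re

# Assumed layout, inferred only from the pasted log lines.
ENTRY_RE = re.compile(
    r"(?P<job>dealscanner-\S+): "
    r"Recv (?P<recv>[^,]+), Paid (?P<paid>[^,]+), "
    r"(?P<event>\w+) \((?P<status>\w+)\), (?P<elapsed>\S+) "
    r"\[(?P<deal_id>\d+)\|\d+\]"
)

# Three entries copied from the log above.
SAMPLE = (
    "Feb 7 07:35:04 proposals dealscanner-f01980914-f01970716: "
    "Recv 0 B, Paid 0 FIL, Open (New), 0s [1675554614710751788|0] "
    "Feb 7 07:35:04 proposals dealscanner-f01980914-f01128320: "
    "Recv 0 B, Paid 0 FIL, DealProposed (WaitForAcceptance), 3ms [1675554614710751789|0] "
    "Feb 7 07:35:07 proposals dealscanner-f01980914-f01970716: "
    "Recv 0 B, Paid 0 FIL, CancelComplete (Cancelled), 3.143s [1675554614710751788|0]"
)


def last_status(log_text: str) -> dict[str, tuple[str, str]]:
    """Map each scanner job to the last (event, status) pair seen in the log."""
    latest: dict[str, tuple[str, str]] = {}
    for match in ENTRY_RE.finditer(log_text):
        latest[match["job"]] = (match["event"], match["status"])
    return latest


for job, (event, status) in last_status(SAMPLE).items():
    print(f"{job}: {event} ({status})")
```

Because `finditer` scans the whole text, the helper works whether the entries arrive one per line or run together on a single line.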

1ane-1 commented 1 year ago

@cryptowhizzard I have cancelled my signature pending further confirmation. I did test retrieval at first, but now it needs to be confirmed again.

JackRipper1888 commented 1 year ago

I have no problem using lotus to retrieve data. What tool or CID are you using?

JackRipper1888 commented 1 year ago

Your CID data is wrong. It doesn't belong to our project.

cryptowhizzard commented 1 year ago

Cleaned up my comments. Indeed there was a bug in the retrieval script. Sorry for that. I fixed it and am re-testing as we speak.

JackRipper1888 commented 1 year ago

Thanks for checking, are you a notary? Can you sign it for me?

cryptowhizzard commented 1 year ago

> Thanks for checking, are you a notary? Can you sign it for me?

Yes, I am a notary. When I have finished checking, I will make my decision.

cryptowhizzard commented 1 year ago

Hi,

f01896036 -> ERROR: failed to find offer satisfying maxPrice: 0 FIL. Try increasing maxPrice

JackRipper1888 commented 1 year ago

I have contacted this SP to make adjustments and can try again

cryptowhizzard commented 1 year ago

I did so; however, the retrieval is painfully slow. It will take 72 hours to download the carfile. When I have it, I will evaluate it.

JackRipper1888 commented 1 year ago

Hi @cryptowhizzard, I have confirmed with the SPs that their retrievals are working normally. If you have time, please help me approve.

NDLABS-Leo commented 1 year ago

Received dm from this client. Has done review, SPs are normal.

NDLABS-Leo commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceahxm6yh7k3plvox7yquio7urz7tkfehp7eui2esrirrrr6pccody

Address

f1s72ymkzujzb5hnxfaqkgekn37nbisu4pc26eh3i

Datacap Allocated

400.00TiB

Signer Address

f1yayfsv6whu3rheviucvventj3y6t542xfpb47ei

Id

e797fcac-eebd-4507-b203-4650e7641790

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceahxm6yh7k3plvox7yquio7urz7tkfehp7eui2esrirrrr6pccody

herrehesse commented 1 year ago

@NDLABS-OFFICE As stated during multiple governance calls you were asked not to sign any applications, and first do your due diligence on 50 new applications. I am surprised to see you signing this application and would kindly suggest to revert your signature.

Due diligence is not what you performed and your statement "Has done review, SPs are normal." is not sufficient.

For transparency tagging: @cryptowhizzard @raghavrmadya @dkkapur @galen-mcandrew @Sunnyiscoming

NDLABS-Leo commented 1 year ago

@herrehesse Hi, as I said to Hid, we are also in the meeting, we can review and sign as normal. In addition, about collecting sp information during due diligence, today we will ask questions at the notary meeting, and we can communicate with the governance team at the notary meeting.

herrehesse commented 1 year ago

This is not correct. You were asked not to sign and do due diligence on 50 applications.

This is a huge violation and should not be taken lightly.

NDLABS-Leo commented 1 year ago

@herrehesse There may be some misunderstanding here; I hope the results will be discussed at the notary meeting today. Also, is your Slack account "Wijnand Schouten"?

herrehesse commented 1 year ago

@NDLABS-OFFICE No misunderstanding, we had this talk privately and you know that @raghavrmadya asked to hold off with signing any applications and do your due diligence on 50 applications.

I suggest to remove your signature from this application.

NDLABS-Leo commented 1 year ago

@herrehesse I have repeatedly confirmed that we can review and sign normally. RG proposed in slack that we should review the LDN more, but it did not explicitly say that we cannot sign. In addition, what doubts do you have about the current LDN?

herrehesse commented 1 year ago

This application is so clearly abusive I can't even......

They state "Hongkong" as their only storage region, and yet we can see global distribution.

Screenshot 2023-02-14 at 10 03 56

All of which is achieved by VPN.

@cryptowhizzard stated that he was evaluating the data, and you quickly signed before he could get back with results.

Did you ask any question about who the SPs are and what businesses they run?

Do I need to continue?

NDLABS-Leo commented 1 year ago

@herrehesse As I mentioned just now, each notary will have its own review method. The governance team also said that notaries need to conduct review according to their own original rules. You can have your own review method, and so can we. In addition, this issue can be discussed in depth and concluded tonight. You can also ask this question at the meeting. Thanks.

cryptowhizzard commented 1 year ago

Retrieval + data verified.