filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Allocation] - The genetic testing data of pet-repulicate ( 1 / 2 ) #1121

Closed yvonne-23MF closed 10 months ago

yvonne-23MF commented 1 year ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Chengdu 23mofang Biotechnology Co., Ltd. was established in March 2015 to provide biogenetic testing and life data analysis services.

According to the "2018 China Genetic Testing Industry Analysis" released by the third-party data agency Analysys International, 23mofang ranked first in the consumer-grade genetic testing field. News link: https://www.jiemian.com/article/2244613.html

In the Forbes list of "Chinese Fifty Most Innovative Companies", 23mofang was rated "The innovator of Gene testing" (see https://m.huanqiu.com/article/9CaKrnK9GMf).
So far, 23mofang has completed 6 rounds of financing from investors including SBCVC, Hanvon Technology, Matrix Partners China, and CD Capital. 23mofang is the largest domestic consumer-grade genetic testing company.

Based on leading genetic research results, 23mofang owns a professional bio-testing center and uses world-leading testing equipment for sequencing, providing users with highly readable and usable genetic test reports. 23mofang's self-built genetic testing center has been certified by the National Health and Family Planning Commission and meets all the standards of medical and clinical trial institutions.

To support scientific research, the company has also built cooperation with the Ministry of Education Key Laboratory of Contemporary Anthropology, the Medical Heredity Research Center of Southwest Hospital of AMU, and the Kunming Institute of Zoology, CAS.

23mofang's pet genetic testing uses saliva samples and provides hundreds of test items, such as pedigree analysis, genetic risk, genetic traits, genetic variant carriage, nutritional needs, medication response, exercise health, skin management, and more. We collect and analyze pet genetic data to diagnose or rule out suspected genetic diseases and to predict the risk of specific diseases, helping users understand the ancestral origin, physical characteristics, and health conditions of their pets.

Our pet genetic testing helps humans understand animals better and establish a harmonious ecological environment between humans and animals, while the huge pet test analysis data also greatly promotes the development of animal medical research and animal genetic science research.

Official Website: https://www.23mofang.com/
Official WeChat: 23魔方
Official TikTok: @23魔方溯源故事, @23魔方寻宗

What is the primary source of funding for this project?

Data storage for this project is funded by our company.

What other projects/ecosystem stakeholders is this project associated with?

We have previously applied to notaries for general-verification DataCap (see #427, #428, #602), so we have built trust with the Filecoin community, and we hope to keep storing our data on Filecoin steadily.

Use-case details

Describe the data being stored onto Filecoin

Animal gene test samples, analysis results and gene source data. They are all private data.

Where was the data in this dataset sourced from?

All of the data comes from our company.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

Parameter indicators of animal data: https://1drv.ms/u/s!Ah-lIf4OzDPabwLnSRrpcnWpidw?e=S0ShEU
List of test samples: https://1drv.ms/u/s!Ah-lIf4OzDPaaoJLZKOAl_D2JTg?e=LXOkdn
Genetic test data: https://1drv.ms/u/s!Ah-lIf4OzDPaa44krWKwYX5v0bA?e=dPctKh
Test result: https://1drv.ms/u/s!Ah-lIf4OzDPabgKczD-K1wnHnh4?e=Alk3MS
Profile of test object: https://1drv.ms/u/s!Ah-lIf4OzDPacDa76kVDEICqZvk?e=bqbJ4e
Test report: https://1drv.ms/u/s!Ah-lIf4OzDPacWPsY8xcDQ8SMLg?e=JHZfPh
https://1drv.ms/u/s!Ah-lIf4OzDPacvU_l7MwJRrZ154?e=gEaxl6

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

NO

What is the expected retrieval frequency for this data?

None

For how long do you plan to keep this dataset stored on Filecoin?

Permanently.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

Global

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Both online and offline.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We currently have an established relationship with MDM. Depending on the subsequent progress of storage and the amount of data sealed, we will continue to investigate and choose a wider range of SPs for cooperation.
We evaluate SPs on the following two aspects:
-- Experience in storing real data
-- Long-term and stable storage operation and maintenance capabilities

How will you be distributing deals across storage providers?

We have more than 2P of pet genetic data, which will be stored as 5 replicas. We will make a reasonable distribution plan according to the final number of SPs.
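
The distribution plan itself is left open in the answer above. As a minimal sketch only, assuming a simple round-robin scheme (not something the applicant has stated), each piece could be assigned to 5 distinct SPs as follows. The function name `assign_replicas`, the piece labels, and the placeholder provider IDs are all hypothetical.

```python
def assign_replicas(piece_ids, providers, replicas=5):
    """Assign each piece to `replicas` distinct providers, round-robin."""
    if len(set(providers)) != len(providers):
        raise ValueError("provider list contains duplicates")
    if len(providers) < replicas:
        raise ValueError("need at least as many providers as replicas")
    n = len(providers)
    plan = {}
    for i, piece in enumerate(piece_ids):
        # Start at a rotating offset so the load spreads evenly over providers.
        plan[piece] = [providers[(i + j) % n] for j in range(replicas)]
    return plan


# Hypothetical inputs: 7 placeholder SPs and 14 placeholder pieces.
providers = [f"sp-{k}" for k in range(7)]
pieces = [f"piece-{k}" for k in range(14)]
plan = assign_replicas(pieces, providers)

# Count how many pieces each provider ends up holding.
load = {p: 0 for p in providers}
for holders in plan.values():
    for p in holders:
        load[p] += 1
print(load)
```

With these inputs every piece lands on 5 different providers and every provider holds the same number of pieces. A real plan would also have to account for each SP's sealing capacity and region, which this sketch ignores.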

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, the funds for storage are ready.

yvonne-23MF commented 1 year ago

Hello @kevzak Thanks for the reminder. We've already started storing. Everything is going well so far.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 3

Multisig Notary address

f01940930

Client address

f1tfxuwt2akoyizqpu64oy3k36lz77nfkfelt3tmy

DataCap allocation requested

400TiB

Id

f21a592f-2b7a-47f7-9b76-0a452956fbaf

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01940930

Client address

f1tfxuwt2akoyizqpu64oy3k36lz77nfkfelt3tmy

Last two approvers

xingjitansuo & PluskitOfficial

Rule to calculate the allocation request amount

200% of weekly dc amount requested

DataCap allocation requested

400TiB

Total DataCap granted for client so far

100TiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

4.90PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 683 | 2 | 100TiB | 99.85 | 23.31TiB |

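
The figures in this bot comment can be checked by hand. The sketch below is an illustration under stated assumptions, not the bot's actual code: it takes 1 PiB = 1024 TiB, and it infers a weekly amount of 200TiB from the "200% of weekly dc amount" rule and the 400TiB request (the weekly figure itself does not appear in this thread). All function names are hypothetical.

```python
TIB_PER_PIB = 1024  # binary units: 1 PiB = 1024 TiB


def remaining_to_grant_pib(total_requested_pib, granted_tib):
    """DataCap still to be granted, in PiB."""
    return (total_requested_pib * TIB_PER_PIB - granted_tib) / TIB_PER_PIB


def tranche_tib(weekly_tib, rule_percent):
    """Size of the next tranche as a percentage of the weekly amount."""
    return weekly_tib * rule_percent / 100


def used_percent(previous_tib, remaining_tib):
    """Share of the previous allocation that has been spent."""
    return 100 * (previous_tib - remaining_tib) / previous_tib


# 5 PiB requested in total, 100 TiB granted so far.
print(f"{remaining_to_grant_pib(5, 100):.2f} PiB")  # 4.90 PiB

# Assumption: weekly amount of 200 TiB, inferred from 400 TiB / 200%.
print(tranche_tib(200, 200))  # 400.0

# 100 TiB previously allocated, 23.31 TiB remaining.
print(f"{used_percent(100, 23.31):.2f}%")  # 76.69%
```
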
filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 50% of total datacap - f0442671: 99.88%

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.
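
The two warnings in this report describe a per-provider share of sealed data and a replication count per piece of data. The sketch below shows one way such metrics could be computed. It is not the checker's implementation: the deal tuple layout, the weighting by deal size, the function names, and the sample deals are all assumptions made for illustration.

```python
from collections import defaultdict


def provider_shares(deals):
    """Percent of total deal size held by each provider.

    `deals` is a list of (provider_id, piece_cid, size) tuples.
    """
    totals = defaultdict(int)
    for provider, _cid, size in deals:
        totals[provider] += size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {p: 100 * s / grand_total for p, s in totals.items()}


def low_replication_percent(deals, min_providers=4):
    """Percent of deal size whose piece sits on fewer than `min_providers` SPs."""
    providers_per_piece = defaultdict(set)
    for provider, cid, _size in deals:
        providers_per_piece[cid].add(provider)
    total = sum(size for _p, _c, size in deals)
    if total == 0:
        return 0.0
    low = sum(size for _p, cid, size in deals
              if len(providers_per_piece[cid]) < min_providers)
    return 100 * low / total


# Hypothetical deals: placeholder provider IDs and piece CIDs, size in GiB.
deals = [
    ("sp-a", "cid1", 32),
    ("sp-a", "cid2", 32),
    ("sp-a", "cid3", 32),
    ("sp-b", "cid1", 32),
]
shares = provider_shares(deals)
flagged = sorted(p for p, pct in shares.items() if pct > 50)
print(shares)                          # {'sp-a': 75.0, 'sp-b': 25.0}
print(flagged)                         # ['sp-a']
print(low_replication_percent(deals))  # 100.0
```

In this toy input one provider holds 75% of the data and no piece reaches 4 providers, which is the same shape of result as the two warnings above.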

yvonne-23MF commented 1 year ago

Hi @kevzak Our allocation has been exhausted. What can we do to facilitate the next step?

kevzak commented 1 year ago

We'll move forward following the updated Notary SLA https://github.com/filecoin-project/notary-governance/discussions/807 and randomly assign:

@Fenbushi-Filecoin @bmcnabb25

Please review the application details for the Client, Data, and SPs and leave a comment. Please leave a comment by March 8th.

Also, FYI, here is SP storage information to review LINK

kevzak commented 1 year ago

> Hi @kevzak Our allocation has been exhausted. What can we do to facilitate the next step?

@yvonne-23MF - as guidance, I recommend that you provide an update for Notaries about your SP usage so far so they can understand. If the SPs used do not match your list, why not? Why has only one SP been used so far?

yvonne-23MF commented 1 year ago

@kevzak Thanks for your work.

  1. During the application process, some SPs withdrew from the project, and we contacted new ones. We will post timely updates in the comments. f01945296 was temporarily adjusted and has withdrawn from the project. The latest SP info is disclosed as follows:

| SP ID | SP org | SP region |
| --- | --- | --- |
| f0442671 | zhongliang | HK |
| f02031264 | MDM | SGP |

  2. During the early allocation phase, the SPs cannot stay in sync because of their different network conditions and sealing capabilities. As the allocation increases, most SPs will catch up, bringing the overall distribution into compliance.

kevzak commented 1 year ago

Let's move forward with the updated Notary SLA (https://github.com/filecoin-project/notary-governance/discussions/807) and assign randomly:

@TakiChain @Meibuy

Leave a comment as needed. Please comment by March 10th. Thanks! If you are not interested in signing please also leave a comment stating why.

yvonne-23MF commented 1 year ago

@TakiChain @Meibuy Please review our application. Many thanks.

yvonne-23MF commented 1 year ago

@kevzak Will there be a new random allocation triggered today?

kevzak commented 1 year ago

Hello @TakiChain @Meibuy We did not hear from you.

Let's move forward with the updated Notary SLA (https://github.com/filecoin-project/notary-governance/discussions/807) and assign randomly:

@swatchliu @BlockMakeronline

Please leave a comment as needed. Please comment by March 15th. Thanks! If you are not interested in signing please also leave a comment stating why.

BlockMakeronline commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 50% of total datacap - f0442671: 99.94%

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

BlockMakeronline commented 1 year ago

@yvonne-23MF Are there any more SPs you will cooperate with? Can you list them here?

yvonne-23MF commented 1 year ago

@BlockMakeronline Thanks for the timely review.
Currently, the SPs we have been in contact with are gradually onboarding.

| SP ID | SP org | SP region |
| --- | --- | --- |
| f0442671 | zhongliang | HK |
| f02031264 | MDM | SGP |
| f02051757 | CCR AutoBAn | BRA |
| f02048808 | CCR AutoBAn | BRA |
| f02032453 | TAHALUF AL EMARAT TECHNICAL SOLUTIONS L.L.C | DU |

BlockMakeronline commented 1 year ago

DataCap distribution is currently limited to 2 SPs. That is not OK. But if you can solve this problem in the next round, I will approve.

Destore2023 commented 1 year ago

image

Could you help explain it and show a better result in the next round?

yvonne-23MF commented 1 year ago

@BlockMakeronline @swatchliu Yes. Currently, we have only started sealing (encapsulation) on one node. Because the first batch was a smaller allocation and the other storage providers had not yet finished their preparation work, the current dataset is concentrated on one node. Based on our communication with all the storage providers, they are progressing with device procurement, data center contracting, and sealing machine allocation, and everything is on track. We guarantee that in later sealing we will strictly follow the community's allocation rules, although nodes may start up one after another. The second batch will be sealed on no fewer than two nodes.

Destore2023 commented 1 year ago

OK, willing to support!

Destore2023 commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebyhskazqaubsf2arzroosa6jxozcz4c22aszsypmjacoe6tagsdq

Address

f1tfxuwt2akoyizqpu64oy3k36lz77nfkfelt3tmy

Datacap Allocated

400.00TiB

Signer Address

f1yh6q3nmsg7i2sys7f7dexcuajgoweudcqj2chfi

Id

f21a592f-2b7a-47f7-9b76-0a452956fbaf

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebyhskazqaubsf2arzroosa6jxozcz4c22aszsypmjacoe6tagsdq

kevzak commented 1 year ago

Let's move forward with the updated Notary SLA (https://github.com/filecoin-project/notary-governance/discussions/807) and assign randomly:

@stcouldlisa @MRJAVAZHAO

Please leave a comment as needed. Please comment by March 17th. Thanks! If you are not interested in signing please also leave a comment stating why.

BlockMakeronline commented 1 year ago

I failed to approve this request because an error occurred during approval.

stcloudlisa commented 1 year ago

WX20230316-155851@2x

stcloudlisa commented 1 year ago

I communicated with the client via email, WeChat, and Slack. I'm willing to support them temporarily this time.

stcloudlisa commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecsdlbb4caolapfcglb43s5sjwjou6ggahunw6qwnf2uqra4gc7xi

Address

f1tfxuwt2akoyizqpu64oy3k36lz77nfkfelt3tmy

Datacap Allocated

400.00TiB

Signer Address

f1jvvltduw35u6inn5tr4nfualyd42bh3vjtylgci

Id

f21a592f-2b7a-47f7-9b76-0a452956fbaf

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecsdlbb4caolapfcglb43s5sjwjou6ggahunw6qwnf2uqra4gc7xi

stcloudlisa commented 1 year ago

I will pay attention to the next CID report.

yvonne-23MF commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

yvonne-23MF commented 1 year ago

Hi @kevzak @simonkim0515 @fabriziogianni7 More than 80% of the prior DataCap allocation has been used up. Please help trigger the next round of allocation.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 4

Multisig Notary address

f01940930

Client address

f1tfxuwt2akoyizqpu64oy3k36lz77nfkfelt3tmy

DataCap allocation requested

800TiB

Id

24980cfe-2c8d-461e-bb67-365e6cad8aeb

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 2 storage providers sealed more than 30% of total datacap - f02031264: 46.28%, f02058976: 30.55%

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

kevzak commented 1 year ago

Let's move forward with the updated Notary SLA (https://github.com/filecoin-project/notary-governance/discussions/807) and assign randomly:

@derricktan23 @liyunzhi-666

Please leave a comment as needed. Please comment by April 1st. Thanks! If you are not interested in signing please also leave a comment stating why.

liyunzhi-666 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 2 storage providers sealed more than 30% of total datacap - f02031264: 46.28%, f02058976: 30.55%

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

liyunzhi-666 commented 1 year ago

Hi @yvonne-23MF, can you explain the two warnings? And can we see better results in the next round?

yvonne-23MF commented 1 year ago

@liyunzhi-666 Thanks for asking. The reported results are reasonable because different service providers work at different speeds. Comparing the two reports, the distribution has improved significantly. The service providers we contacted are also gradually joining and will take part in the upcoming work step by step:

| SP ID | SP region |
| --- | --- |
| f02032453 | DU |
| f01986314 | HK |
| f0150816 | CN |
| f0119828 | CN |
| f02051757 | Br |
| f02048808 | Br |

derricktan23 commented 1 year ago

Hi @liyunzhi-666 Considering the variability of speed among different service providers, the results are reasonable. It's encouraging to hear that the distribution will improve significantly.

The client's data files can be retrieved by CID. Willing to support the application; I hope to see more improvements next round as promised.

image

Tom-OriginStorage commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceakx3aw4bfxvtdgfjysuq6fuyony42bz76ebrtt3mpbqbnuq6hl3a

Address

f1tfxuwt2akoyizqpu64oy3k36lz77nfkfelt3tmy

Datacap Allocated

800.00TiB

Signer Address

f1q6bpjlqia6iemqbrdaxr2uehrhpvoju3qh4lpga

Id

24980cfe-2c8d-461e-bb67-365e6cad8aeb

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceakx3aw4bfxvtdgfjysuq6fuyony42bz76ebrtt3mpbqbnuq6hl3a

liyunzhi-666 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecpjm3wkch2fj2zlev4vl4bfyxmq56rd77esfgfcggo2x3swr4jdy

Address

f1tfxuwt2akoyizqpu64oy3k36lz77nfkfelt3tmy

Datacap Allocated

800.00TiB

Signer Address

f1pszcrsciyixyuxxukkvtazcokexbn54amf7gvoq

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecpjm3wkch2fj2zlev4vl4bfyxmq56rd77esfgfcggo2x3swr4jdy

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 4

Multisig Notary address

f01940930

Client address

f1tfxuwt2akoyizqpu64oy3k36lz77nfkfelt3tmy

DataCap allocation requested

800TiB

Id

b6a5eeaa-9b48-4f96-b417-2220c82e1750

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01940930

Client address

f1tfxuwt2akoyizqpu64oy3k36lz77nfkfelt3tmy

Rule to calculate the allocation request amount

400% of weekly dc amount requested

DataCap allocation requested

800TiB

Total DataCap granted for client so far

727595761418342891520.0YiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

-8.79B

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 16909 | 7 | 800TiB | 20.53 | 202.06TiB |

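
The "Total DataCap granted" and "to be granted" values in this bot comment are not plausible sizes. As a rough cross-check, assuming the only grants are the three visible in this thread (the initial 100TiB, then 400TiB, then 800TiB) and that 1 PiB = 1024 TiB, the totals can be recomputed as below. The function names are hypothetical and this is not the bot's own calculation.

```python
TIB_PER_PIB = 1024  # binary units: 1 PiB = 1024 TiB


def total_granted_tib(grants_tib):
    """Sum of all tranches granted so far, in TiB."""
    return sum(grants_tib)


def still_to_grant_pib(total_requested_pib, grants_tib):
    """DataCap left to grant before reaching the requested total, in PiB."""
    requested_tib = total_requested_pib * TIB_PER_PIB
    return (requested_tib - total_granted_tib(grants_tib)) / TIB_PER_PIB


# Assumption: these are the only grants, as read from this thread.
grants = [100, 400, 800]

print(total_granted_tib(grants))                    # 1300
print(f"{still_to_grant_pib(5, grants):.2f} PiB")   # 3.73 PiB

# Share of the previous 800 TiB tranche already used (202.06 TiB remaining).
used = 100 * (800 - 202.06) / 800
print(f"{used:.2f}%")                               # 74.74%
```
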
kevzak commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

kevzak commented 1 year ago

Hello @yvonne-23MF this was your original SP list: https://github.com/filecoin-project/notary-governance/issues/634#issuecomment-1330169875

- SPs in China: f01943941, F0150816 (SP explained that the node is migrating due to public policies, and we will update new in time)
- SP in Singapore: f01877862
- SP in US: f01943910
- SP in Japan: f01945296

This is the current SP list: https://github.com/data-preservation-programs/filplus-checker-assets/blob/main/filecoin-project/filecoin-plus-large-datasets/issues/1121/1684155266343.md

I only see one node that matches: f01945296

Can you explain what is the current storage plan? Who are the other SP nodes?
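
A planned-versus-actual comparison like this is a set intersection over SP IDs. The sketch below is illustrative only. The planned list is the one quoted above; the observed list is not the checker's linked report (which is not reproduced in this thread) but the set of SP IDs the client disclosed in earlier comments here, so its result is not expected to match the single node named above. The function names are hypothetical, and IDs are lowercased first because the same ID appears in this thread as both `F0150816` and `f0150816`.

```python
def normalize(sp_id):
    """Lowercase and trim an SP ID so 'F0150816' equals 'f0150816'."""
    return sp_id.strip().lower()


def compare_sp_lists(planned, observed):
    """Return (matching, missing) SP IDs: planned IDs seen or not seen in `observed`."""
    planned_set = {normalize(s) for s in planned}
    observed_set = {normalize(s) for s in observed}
    return planned_set & observed_set, planned_set - observed_set


# Original plan, as quoted in the comment above.
planned = ["f01943941", "F0150816", "f01877862", "f01943910", "f01945296"]

# Assumption: SP IDs disclosed by the client in earlier comments in this
# thread, used here in place of the checker's linked list.
disclosed = [
    "f0442671", "f02031264", "f02051757", "f02048808",
    "f02032453", "f01986314", "f0150816", "f0119828",
]

matching, missing = compare_sp_lists(planned, disclosed)
print(sorted(matching))  # ['f0150816']
print(sorted(missing))   # ['f01877862', 'f01943910', 'f01943941', 'f01945296']
```
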

yvonne-23MF commented 1 year ago

Thanks, Kevin. Appreciate your prompt attention and question. Since applying for the project in October last year, we have approached many storage providers. The SPs that ultimately took part in sealing and stayed on until the completion of the project had to meet the following conditions:

NiwanDao commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecl5ktsynummorxfxn4d57sijofmjic4a653egx2ty7wifpdoice6

Address

f1tfxuwt2akoyizqpu64oy3k36lz77nfkfelt3tmy

Datacap Allocated

800.00TiB

Signer Address

f1a2lia2cwwekeubwo4nppt4v4vebxs2frozarz3q

Id

b6a5eeaa-9b48-4f96-b417-2220c82e1750

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecl5ktsynummorxfxn4d57sijofmjic4a653egx2ty7wifpdoice6

stcloudlisa commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

stcloudlisa commented 1 year ago

Looks like a pretty good improvement. Willing to support.