filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

Speedium - NIH NCBI Sequence Read Archive [ 1 / 27 ] #488

Closed cryptowhizzard closed 1 year ago

cryptowhizzard commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Speedium / Dcent has participated in Slingshot since 2.6. We have successfully stored more than 15 different datasets with 20+ different miners.

What is the primary source of funding for this project?

Company account

What other projects/ecosystem stakeholders is this project associated with?

No related association

Use-case details

Describe the data being stored onto Filecoin

NIH NCBI Sequence Read Archive (SRA) on AWS
The Sequence Read Archive (SRA), produced by the [National Center for Biotechnology Information (NCBI)](https://www.ncbi.nlm.nih.gov/) at the [National Library of Medicine (NLM)](http://nlm.nih.gov/) at the [National Institutes of Health (NIH)](http://www.nih.gov/), stores raw DNA sequencing data and alignment information from high-throughput sequencing platforms. 

Where was the data in this dataset sourced from?

AWS

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

https://registry.opendata.aws/ncbi-sra/

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes.

What is the expected retrieval frequency for this data?

Multiple times per year

For how long do you plan to keep this dataset stored on Filecoin?

18 months or longer

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

EU / US / Australia / Asia
We intend to store 10 replicas of this data. The dataset has a size of 13.4 PiB.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

The data will be transferred both offline and online.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We have a few providers who worked with us during the Slingshot Restore program, and we'd like to continue working with them for the ongoing Slingshot competition.

How will you be distributing deals across storage providers?

A maximum of 2 copies per storage provider, and only if they are stored on different miners / locations.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have the resources.
large-datacap-requests[bot] commented 2 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Sunnyiscoming commented 2 years ago

1. Total amount of DataCap being requested (between 500 TiB and 5 PiB): 135 PiB

The total amount must be less than 5 PB. Perhaps you can reapply in separate applications, one by one.

2. Weekly allocation of DataCap requested (usually between 1-100TiB): 2 PiB

Can you list the storage providers and their nodes to show that you can store 2 PiB per week?

cryptowhizzard commented 2 years ago

@Sunnyiscoming That is OK. I will alter this application's title and structure it in 5 PiB batches. That means 27 batches in total, so I will alter the title to [ 1 of 27 ], etc., and reference back to this original application as soon as we progress to the next one and are over 50% done with the previous one.

My plan is to distribute these to Holon, DLTX and PikNik, and I will also distribute them to whoever wants them.
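
As a quick check on the batch count: splitting the 135 PiB total quoted in the review at the 5 PiB per-application ceiling gives 27 applications. A minimal sketch of that arithmetic follows; the constant and function names are chosen here for illustration and are not part of any Fil+ tooling.

```python
import math

TOTAL_REQUESTED_PIB = 135     # total quoted in the review comment above
MAX_PER_APPLICATION_PIB = 5   # per-application ceiling quoted in the review


def batch_count(total_pib, cap_pib):
    """Number of applications needed if each one requests at most cap_pib."""
    return math.ceil(total_pib / cap_pib)


batches = batch_count(TOTAL_REQUESTED_PIB, MAX_PER_APPLICATION_PIB)

# Title suffixes in the style used by this application, e.g. "[ 1 / 27 ]"
titles = [f"[ {i} / {batches} ]" for i in range(1, batches + 1)]
print(batches, titles[0], titles[-1])
```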

large-datacap-requests[bot] commented 2 years ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

Sunnyiscoming commented 2 years ago

You have submitted a large number of dataset applications. I would like to confirm the dataset size with you, but I can't see the size of Amazon's public dataset. Can you provide any evidence of it?

cryptowhizzard commented 2 years ago

Hello @Sunnyiscoming

The total size of this dataset is 13.14 PiB

You can retrieve that data with the following command according to AWS official source:

aws s3 ls --summarize --human-readable --recursive s3://bucket-name/

The bucket name and all other information are on their public page:

https://registry.opendata.aws/ncbi-sra/

kevzak commented 2 years ago

@dkkapur @raghavrmadya FYI

large-datacap-requests[bot] commented 2 years ago

Deleting comment

@raghavrmadya hasn't the permissions to post this comment.

Please, contact the assignee of this issue.

raghavrmadya commented 2 years ago

Datacap Request Trigger

Total DataCap requested

5 PiB

Expected weekly DataCap usage rate

500 TiB

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

DataCap allocation requested

250TiB

kernelogic commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceczjbjh62cicnux7i7vqvk3kgs6ka7ntrjzoh3sgkgjucttiwbpp4

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

250.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceczjbjh62cicnux7i7vqvk3kgs6ka7ntrjzoh3sgkgjucttiwbpp4

MegTei commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebdajq4eqg3y5y6rq43dby5u2hpymdckvpcpsucukemciuhcvcrbi

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

250.00TiB

Signer Address

f1ystxl2ootvpirpa7ebgwl7vlhwkbx2r4zjxwe5i

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebdajq4eqg3y5y6rq43dby5u2hpymdckvpcpsucukemciuhcvcrbi

IreneYoung commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebadh5hnq6tackw7owez4crnd55dlsmko434sohb2ii3qi3kth5qq

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

250.00TiB

Signer Address

f1d4gmpqz3execjj2wvrxuuhvbms5mzh7t7yqrviq

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebadh5hnq6tackw7owez4crnd55dlsmko434sohb2ii3qi3kth5qq

raghavrmadya commented 2 years ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

DataCap allocation requested

250TiB

raghavrmadya commented 2 years ago

@IreneYoung @MegTei , could you please propose and approve this again?

kernelogic commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecrgqdv2gevf2zgvnhtge6tgmhdfrkmkb6hyyq4ukq27wrtbd4ki2

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

250.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecrgqdv2gevf2zgvnhtge6tgmhdfrkmkb6hyyq4ukq27wrtbd4ki2

s0nik42 commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacect5ucnv5xchr66ejtkulrnd2drzpyqrixsj27r4jl3c66inefrzi

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

250.00TiB

Signer Address

f1wxhnytjmklj2czezaqcfl7eb4nkgmaxysnegwii

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacect5ucnv5xchr66ejtkulrnd2drzpyqrixsj27r4jl3c66inefrzi

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Request number 3

Multisig Notary address

f02049625

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

DataCap allocation requested

1000.0TiB

Id

1404eb4f-0694-4cce-b5b1-073f3e63da4a

large-datacap-requests[bot] commented 2 years ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Last two approvers

s0nik42 & kernelogic

Rule to calculate the allocation request amount

200% of weekly dc amount requested

DataCap allocation requested

1000.0TiB

Total DataCap granted for client so far

250TiB

Datacap to be granted to reach the total amount requested by the client (5 PiB)

4.75PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 6228 | 3 | 250TiB | 36.13 | 60.65TiB |

psh0691 commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaced7pdfi2aguynv4agvn4om6aqaocjatgy56vzerfnveow3r5yw2ow

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

250.00TiB

Signer Address

f1qdko4jg25vo35qmyvcrw4ak4fmuu3f5rif2kc7i

Id

1404eb4f-0694-4cce-b5b1-073f3e63da4a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaced7pdfi2aguynv4agvn4om6aqaocjatgy56vzerfnveow3r5yw2ow

psh0691 commented 2 years ago

Why is "Request Approved" first instead of "Request Proposed"?

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Request number 4

Multisig Notary address

f02049625

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

DataCap allocation requested

1.95PiB

Id

e0d7d4e4-6aaf-4018-9f94-88e491c9026d

large-datacap-requests[bot] commented 2 years ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Last two approvers

psh0691 & s0nik42

Rule to calculate the allocation request amount

400% of weekly dc amount requested

DataCap allocation requested

1.95PiB

Total DataCap granted for client so far

250TiB

Datacap to be granted to reach the total amount requested by the client (5 PiB)

4.75PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 6454 | 3 | 1000.0TiB | 36.71 | 49.65TiB |

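
The amounts in requests 3 and 4 are consistent with applying the quoted percentage to the 500 TiB weekly rate from the request trigger. The sketch below only reproduces that arithmetic; it assumes binary units (1 PiB = 1024 TiB), uses hypothetical function names, and does not model whatever capping the bot applies in later requests.

```python
WEEKLY_RATE_TIB = 500  # "Expected weekly DataCap usage rate" from the request trigger


def tranche_tib(percent_of_weekly, weekly_tib=WEEKLY_RATE_TIB):
    """Tranche size in TiB for a given percentage of the weekly rate."""
    return weekly_tib * percent_of_weekly / 100


def format_datacap(tib):
    """Render a TiB amount in the style used by the bot comments (assumed)."""
    if tib >= 1024:
        return f"{tib / 1024:.2f}PiB"
    return f"{tib:.1f}TiB"


for percent in (200, 400):
    print(f"{percent}% of weekly rate -> {format_datacap(tranche_tib(percent))}")
```

The two printed values correspond to the 1000.0TiB and 1.95PiB figures reported for requests 3 and 4.
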
kernelogic commented 2 years ago

Faucet is empty now - signing this will cause error anyways.

IreneYoung commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacec7y6sgzgiagvkar7fgaryzadhimd5bzbo6xrmbgz4dzpbjmsebzq

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

1.95PiB

Signer Address

f1d4gmpqz3execjj2wvrxuuhvbms5mzh7t7yqrviq

Id

e0d7d4e4-6aaf-4018-9f94-88e491c9026d

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacec7y6sgzgiagvkar7fgaryzadhimd5bzbo6xrmbgz4dzpbjmsebzq

liyunzhi-666 commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedttitvruluqz6k4wsrbfdya5o66zdn2z7vjkbepi2umsdi6y2g3i

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

1.95PiB

Signer Address

f1pszcrsciyixyuxxukkvtazcokexbn54amf7gvoq

Id

e0d7d4e4-6aaf-4018-9f94-88e491c9026d

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedttitvruluqz6k4wsrbfdya5o66zdn2z7vjkbepi2umsdi6y2g3i

filplus-checker commented 1 year ago

DataCap and CID Checker Report[^1]

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

⚠️ f01199430 has sealed 30.56% of total datacap.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01199430 | Heerhugowaard, North Holland, NL | 338.50 TiB | 30.56% | 331.75 TiB | 1.99% |
| f01952350 | Maywood Park, Oregon, US | 238.13 TiB | 21.50% | 236.38 TiB | 0.73% |
| f01944347 | Maywood Park, Oregon, US | 226.63 TiB | 20.46% | 226.63 TiB | 0.00% |
| f01786387 | Heerhugowaard, North Holland, NL | 197.85 TiB | 17.86% | 193.23 TiB | 2.34% |
| f01937642 | Heerhugowaard, North Holland, NL | 64.63 TiB | 5.83% | 61.50 TiB | 4.84% |
| f01201327 | Heerhugowaard, North Holland, NL | 41.89 TiB | 3.78% | 41.89 TiB | 0.00% |
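
The two percentage columns can be recomputed from the sizes in this table. The sketch below assumes that "Percentage" is a provider's share of all sealed data and that "Duplicate Deals" is the non-unique fraction of that provider's sealed data. Those definitions match the reported figures but are inferred here, since the checker's own implementation is not shown in this thread.

```python
# (provider, total deals sealed in TiB, unique data in TiB), copied from the table above
rows = [
    ("f01199430", 338.50, 331.75),
    ("f01952350", 238.13, 236.38),
    ("f01944347", 226.63, 226.63),
    ("f01786387", 197.85, 193.23),
    ("f01937642", 64.63, 61.50),
    ("f01201327", 41.89, 41.89),
]

total_sealed = sum(sealed for _, sealed, _ in rows)


def sealed_share(sealed):
    """Provider's share of all sealed data, in percent."""
    return 100 * sealed / total_sealed


def duplicate_share(sealed, unique):
    """Fraction of a provider's sealed data that is not unique, in percent."""
    return 100 * (sealed - unique) / sealed


for provider, sealed, unique in rows:
    print(f"{provider}: {sealed_share(sealed):.2f}% of sealed data, "
          f"{duplicate_share(sealed, unique):.2f}% duplicate")
```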

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

⚠️ 100.00% of deals are for data replicated across less than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 551.44 TiB | 557.60 TiB | 1 | 50.34% |
| 225.90 TiB | 458.05 TiB | 2 | 41.35% |
| 29.38 TiB | 91.97 TiB | 3 | 8.30% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data, which should not resolve to the same CID.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Verifier |
| --- | --- | --- | --- | --- |
| f1z7jogzx4x42wtilzb4lu6iotlad5rptt2acbzpi | Speedium network | 29.95 TiB | 909 | LDN v3 multisig |

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 5

Multisig Notary address

f02049625

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

DataCap allocation requested

1.58PiB

Id

bcd85d5b-2b7e-44d8-89fd-1fe74c462697

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Last two approvers

liyunzhi-666 & IreneYoung

Rule to calculate the allocation request amount

800% of weekly dc amount requested

DataCap allocation requested

1.58PiB

Total DataCap granted for client so far

2.19PiB

Datacap to be granted to reach the total amount requested by the client (5 PiB)

2.80PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 55683 | 18 | 1.95PiB | 27.50 | 495.87TiB |

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01156835 | Sydney, New South Wales, AU (Anycast Global Backbone) | 7.91 TiB | 0.49% | 7.44 TiB | 5.93% |
| f01208632 | Sydney, New South Wales, AU (Anycast Global Backbone) | 7.69 TiB | 0.47% | 7.34 TiB | 4.47% |
| f01157271 | Sydney, New South Wales, AU (Anycast Global Backbone) | 6.50 TiB | 0.40% | 5.94 TiB | 8.65% |
| f01208803 | Sydney, New South Wales, AU (Anycast Global Backbone) | 5.39 TiB | 0.33% | 4.61 TiB | 14.49% |
| f01156975 | Sydney, New South Wales, AU (Anycast Global Backbone) | 4.44 TiB | 0.27% | 4.41 TiB | 0.70% |
| f01157249 | Sydney, New South Wales, AU (Anycast Global Backbone) | 4.34 TiB | 0.27% | 4.31 TiB | 0.72% |
| f01157027 | Sydney, New South Wales, AU (Anycast Global Backbone) | 4.31 TiB | 0.26% | 4.03 TiB | 6.52% |
| f01157018 | Sydney, New South Wales, AU (Anycast Global Backbone) | 2.97 TiB | 0.18% | 2.97 TiB | 0.00% |
| f01156901 | Sydney, New South Wales, AU (Anycast Global Backbone) | 1.52 TiB | 0.09% | 1.52 TiB | 0.00% |
| f01156538 | Sydney, New South Wales, AU (Anycast Global Backbone) | 896.00 GiB | 0.05% | 896.00 GiB | 0.00% |
| f01972364 (new) | Aloha, Oregon, US (Flexential Colorado Corp.) | 250.29 TiB | 15.38% | 250.29 TiB | 0.00% |
| f01952350 | Aloha, Oregon, US (Flexential Colorado Corp.) | 238.13 TiB | 14.63% | 236.38 TiB | 0.73% |
| f01944347 | Aloha, Oregon, US (Flexential Colorado Corp.) | 226.63 TiB | 13.93% | 226.63 TiB | 0.00% |
| f01972376 (new) | Aloha, Oregon, US (Flexential Colorado Corp.) | 79.28 TiB | 4.87% | 79.25 TiB | 0.04% |
| f01199430 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 413.16 TiB | 25.39% | 406.41 TiB | 1.63% |
| f01786387 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 197.85 TiB | 12.16% | 193.23 TiB | 2.34% |
| f01201327 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 111.46 TiB | 6.85% | 111.46 TiB | 0.00% |
| f01937642 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 64.63 TiB | 3.97% | 61.50 TiB | 4.84% |

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

⚠️ 91.48% of deals are for data replicated across less than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 545.67 TiB | 547.29 TiB | 1 | 33.63% |
| 378.37 TiB | 762.05 TiB | 2 | 46.83% |
| 57.64 TiB | 179.30 TiB | 3 | 11.02% |
| 27.53 TiB | 114.72 TiB | 4 | 7.05% |
| 4.63 TiB | 24.00 TiB | 5 | 1.47% |
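
The 91.48% warning above can be reproduced from the "Total Deals Made" column by summing the rows with fewer than 4 providers and dividing by the overall total. This is a sketch under that assumption; the threshold constant and variable names are illustrative and not taken from the checker's source.

```python
MIN_PROVIDERS = 4  # replication threshold named in the warning above

# (number of providers, total deals made in TiB), copied from the table above
replication = [
    (1, 547.29),
    (2, 762.05),
    (3, 179.30),
    (4, 114.72),
    (5, 24.00),
]

total_deals = sum(tib for _, tib in replication)
under_replicated = sum(tib for n, tib in replication if n < MIN_PROVIDERS)

# Share of deal volume whose data sits on fewer than MIN_PROVIDERS providers
share_below_threshold = 100 * under_replicated / total_deals
print(f"{share_below_threshold:.2f}% of deals are below {MIN_PROVIDERS} providers")
```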

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data, which should not resolve to the same CID.

However, this is possible if all of the clients below use the same software to prepare the exact same dataset, or if they belong to a series of LDN applications for the same dataset.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Approvers |
| --- | --- | --- | --- | --- |
| f1z7jogzx4x42wtilzb4lu6iotlad5rptt2acbzpi | Speedium network | 44.17 TiB | 1,341 | 1 flyworker, 1 kernelogic, 4 MegTei, 2 psh0691, 3 Reiers, 3 s0nik42 |

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

kernelogic commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebnwmdr2zkmz54keuk3qpiwwwma7ok4fgtccq7csjhsr7ry4kls5s

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

1.58PiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

bcd85d5b-2b7e-44d8-89fd-1fe74c462697

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebnwmdr2zkmz54keuk3qpiwwwma7ok4fgtccq7csjhsr7ry4kls5s

s0nik42 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecfz2vpztcwgv225dhf3vouxbiymg4zg76lt3zd7cmmxj6pwigwxw

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

1.58PiB

Signer Address

f1wxhnytjmklj2czezaqcfl7eb4nkgmaxysnegwii

Id

bcd85d5b-2b7e-44d8-89fd-1fe74c462697

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecfz2vpztcwgv225dhf3vouxbiymg4zg76lt3zd7cmmxj6pwigwxw

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 6

Multisig Notary address

f02049625

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

DataCap allocation requested

5.27TiB

Id

a8317333-5425-48a8-b670-5ebda377b084

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Last two approvers

s0nik42 & kernelogic

Rule to calculate the allocation request amount

800% of weekly dc amount requested

DataCap allocation requested

5.27TiB

Total DataCap granted for client so far

2.19PiB

Datacap to be granted to reach the total amount requested by the client (5 PiB)

2.80PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 67072 | 18 | 1.58PiB | 24.35 | 170.98TiB |

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01156835 | Sydney, New South Wales, AU (Anycast Global Backbone) | 9.16 TiB | 0.47% | 8.69 TiB | 5.12% |
| f01208632 | Sydney, New South Wales, AU (Anycast Global Backbone) | 7.69 TiB | 0.39% | 7.34 TiB | 4.47% |
| f01157271 | Sydney, New South Wales, AU (Anycast Global Backbone) | 7.59 TiB | 0.39% | 6.84 TiB | 9.88% |
| f01157249 | Sydney, New South Wales, AU (Anycast Global Backbone) | 6.38 TiB | 0.33% | 6.31 TiB | 0.98% |
| f01156901 | Sydney, New South Wales, AU (Anycast Global Backbone) | 6.08 TiB | 0.31% | 5.92 TiB | 2.57% |
| f01208803 | Sydney, New South Wales, AU (Anycast Global Backbone) | 5.39 TiB | 0.28% | 4.61 TiB | 14.49% |
| f01156975 | Sydney, New South Wales, AU (Anycast Global Backbone) | 4.44 TiB | 0.23% | 4.41 TiB | 0.70% |
| f01157027 | Sydney, New South Wales, AU (Anycast Global Backbone) | 4.31 TiB | 0.22% | 4.03 TiB | 6.52% |
| f01157018 | Sydney, New South Wales, AU (Anycast Global Backbone) | 3.95 TiB | 0.20% | 3.95 TiB | 0.00% |
| f01156538 | Sydney, New South Wales, AU (Anycast Global Backbone) | 896.00 GiB | 0.04% | 896.00 GiB | 0.00% |
| f01972364 (new) | Aloha, Oregon, US (Flexential Colorado Corp.) | 368.63 TiB | 18.86% | 368.63 TiB | 0.00% |
| f01972376 (new) | Aloha, Oregon, US (Flexential Colorado Corp.) | 251.89 TiB | 12.89% | 251.79 TiB | 0.04% |
| f01952350 | Aloha, Oregon, US (Flexential Colorado Corp.) | 238.13 TiB | 12.18% | 236.38 TiB | 0.73% |
| f01944347 | Aloha, Oregon, US (Flexential Colorado Corp.) | 226.63 TiB | 11.60% | 226.63 TiB | 0.00% |
| f01199430 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 439.29 TiB | 22.48% | 432.54 TiB | 1.54% |
| f01786387 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 197.85 TiB | 10.12% | 193.23 TiB | 2.34% |
| f01201327 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 111.46 TiB | 5.70% | 111.46 TiB | 0.00% |
| f01937642 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 64.63 TiB | 3.31% | 61.50 TiB | 4.84% |

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

⚠️ 92.51% of deals are for data replicated across less than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 612.60 TiB | 614.25 TiB | 1 | 31.43% |
| 498.76 TiB | 1002.71 TiB | 2 | 51.31% |
| 61.48 TiB | 190.98 TiB | 3 | 9.77% |
| 28.19 TiB | 117.50 TiB | 4 | 6.01% |
| 5.56 TiB | 28.91 TiB | 5 | 1.48% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data, which should not resolve to the same CID.

However, this is possible if all of the clients below use the same software to prepare the exact same dataset, or if they belong to a series of LDN applications for the same dataset.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Approvers |
| --- | --- | --- | --- | --- |
| f1z7jogzx4x42wtilzb4lu6iotlad5rptt2acbzpi | Speedium network | 44.17 TiB | 1,341 | 1 flyworker, 1 kernelogic, 4 MegTei, 2 psh0691, 3 Reiers, 3 s0nik42 |

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

kernelogic commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecltrmbij32fck2vqvoj52to22xupqhebeqynzm7zjnuzkcvhj64e

Address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

Datacap Allocated

256.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

a8317333-5425-48a8-b670-5ebda377b084

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecltrmbij32fck2vqvoj52to22xupqhebeqynzm7zjnuzkcvhj64e

BDEio commented 1 year ago

@cryptowhizzard Hi! Great to see that you have gotten approval for DataCap! BDE is a verified deals auction house helping you to get paid storing your data with reliable storage providers. If you need any help, please get in touch.

NDLABS-Leo commented 1 year ago

f01972364 f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy baga6ea4seaqpq3e5n27iselfe6mlbyebro2miz2vibe6hjaegupgcc3bpesl2di

image
NDLABS-Leo commented 1 year ago

f01972376 f01156835 f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

image image
lvschouwen commented 1 year ago

@NDLABS-OFFICE Thanks for bringing this to our attention but you are doing the retrievals wrong :-)

MinerID f01972364 random retrieval from this set

# lotus state get-deal 20273117
{
  "Proposal": {
    "PieceCID": {
      "/": "baga6ea4seaqoq6d6mb2h3rtvc5jnkcwc7cphgs2jxbnvwp3juhkmylvp67u3aey"
    },
    "PieceSize": 34359738368,
    "VerifiedDeal": true,
    "Client": "f01880278",
    "Provider": "f01972364",
    "Label": "bafybeieq6ynhp23ji6nsk6bgnl7tizrg6toxj23zijvnhxnpy3qqhrzw7q",
    "StartEpoch": 2493000,
    "EndEpoch": 3984840,
    "StoragePricePerEpoch": "0",
    "ProviderCollateral": "10639918249664389",
    "ClientCollateral": "0"
  },
  "State": {
    "SectorStartEpoch": 2465534,
    "LastUpdatedEpoch": 2561117,
    "SlashEpoch": -1,
    "VerifiedClaim": 2788665
  }
}
# lotus client retrieve --provider f01972364 bafybeieq6ynhp23ji6nsk6bgnl7tizrg6toxj23zijvnhxnpy3qqhrzw7q tmp
Recv 0 B, Paid 0 FIL, Open (New), 0s [1674922066658078673|0]
Recv 0 B, Paid 0 FIL, DealProposed (WaitForAcceptance), 2ms [1674922066658078673|0]
Recv 0 B, Paid 0 FIL, DealAccepted (Accepted), 8.458s [1674922066658078673|0]
Recv 0 B, Paid 0 FIL, PaymentChannelSkip (Ongoing), 8.459s [1674922066658078673|0]
Recv 120 B, Paid 0 FIL, BlocksReceived (Ongoing), 8.459s [1674922066658078673|120]
Recv 481 B, Paid 0 FIL, BlocksReceived (Ongoing), 8.459s [1674922066658078673|481]
Recv 761 B, Paid 0 FIL, BlocksReceived (Ongoing), 8.988s [1674922066658078673|761]
Recv 50.75 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 9.548s [1674922066658078673|51972]
^CERROR: Retrieval Timed Out

MinerID f01972376 random retrieval from this set

# lotus state get-deal 20630140
{
  "Proposal": {
    "PieceCID": {
      "/": "baga6ea4seaqnbtfl3u4pncsqjxfwj44onkryu6b3m6bzqvwnv3sz32ytuhe5kli"
    },
    "PieceSize": 34359738368,
    "VerifiedDeal": true,
    "Client": "f01880278",
    "Provider": "f01972376",
    "Label": "bafybeihmcpimebislocpf54xpcubtw7ph57cezmhuwf53amervryjqmrxi",
    "StartEpoch": 2504186,
    "EndEpoch": 3996026,
    "StoragePricePerEpoch": "0",
    "ProviderCollateral": "10784577865477392",
    "ClientCollateral": "0"
  },
  "State": {
    "SectorStartEpoch": 2475398,
    "LastUpdatedEpoch": 2561020,
    "SlashEpoch": -1,
    "VerifiedClaim": 3144896
  }
}
# lotus client retrieve --provider f01972376 bafybeihmcpimebislocpf54xpcubtw7ph57cezmhuwf53amervryjqmrxi tmp
Recv 0 B, Paid 0 FIL, Open (New), 0s [1674922066658078674|0]
Recv 0 B, Paid 0 FIL, DealProposed (WaitForAcceptance), 1ms [1674922066658078674|0]
Recv 0 B, Paid 0 FIL, DealAccepted (Accepted), 6s [1674922066658078674|0]
Recv 0 B, Paid 0 FIL, PaymentChannelSkip (Ongoing), 6.001s [1674922066658078674|0]
Recv 122 B, Paid 0 FIL, BlocksReceived (Ongoing), 6.002s [1674922066658078674|122]
Recv 251 B, Paid 0 FIL, BlocksReceived (Ongoing), 6.002s [1674922066658078674|251]
Recv 639 B, Paid 0 FIL, BlocksReceived (Ongoing), 6.42s [1674922066658078674|639]
Recv 50.63 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 6.421s [1674922066658078674|51850]
^CERROR: Retrieval Timed Out

MinerID f01156835 seems to have some issues and we will work with this SP to get it resolved.
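
For reading the `lotus state get-deal` output above, the epoch fields can be turned into a deal duration and the piece size into GiB. This sketch assumes Filecoin's 30-second epoch and uses a trimmed copy of the first deal's Proposal fields; it is only a reading aid, not part of the retrieval test itself.

```python
import json

EPOCH_SECONDS = 30  # Filecoin targets one epoch every 30 seconds

# Proposal fields copied (trimmed) from the first `lotus state get-deal` output above
deal = json.loads("""
{
  "Proposal": {
    "PieceSize": 34359738368,
    "VerifiedDeal": true,
    "Provider": "f01972364",
    "StartEpoch": 2493000,
    "EndEpoch": 3984840
  }
}
""")

proposal = deal["Proposal"]
duration_epochs = proposal["EndEpoch"] - proposal["StartEpoch"]
duration_days = duration_epochs * EPOCH_SECONDS / 86_400  # seconds per day
piece_size_gib = proposal["PieceSize"] / 2**30

print(f'{proposal["Provider"]}: {piece_size_gib:.0f} GiB piece, '
      f'{duration_days:.0f}-day deal')
```

Both deals shown span 1,491,840 epochs, which works out to 518 days for a 32 GiB piece.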

lvschouwen commented 1 year ago

Disclosure: the "Retrieval Timed Out" error appears because I pressed Ctrl+C; it is not my intention to download everything. This was just a test to see whether anything is received.

lvschouwen commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

⚠️ 33.57% of total deal sealed by f01208803 are duplicate data.

⚠️ 48.43% of total deal sealed by f01208189 are duplicate data.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01156975 | Melbourne, Victoria, AU (Anycast Global Backbone) | 59.94 TiB | 1.55% | 49.66 TiB | 17.15% |
| f01157271 | Melbourne, Victoria, AU (Anycast Global Backbone) | 47.46 TiB | 1.23% | 45.68 TiB | 3.75% |
| f01156901 | Melbourne, Victoria, AU (Anycast Global Backbone) | 47.33 TiB | 1.23% | 40.55 TiB | 14.33% |
| f01208632 | Melbourne, Victoria, AU (Anycast Global Backbone) | 45.26 TiB | 1.17% | 43.95 TiB | 2.90% |
| f01157018 | Melbourne, Victoria, AU (Anycast Global Backbone) | 44.28 TiB | 1.15% | 42.72 TiB | 3.53% |
| f01157027 | Melbourne, Victoria, AU (Anycast Global Backbone) | 39.09 TiB | 1.01% | 37.37 TiB | 4.40% |
| f01208803 | Melbourne, Victoria, AU (Anycast Global Backbone) | 37.80 TiB | 0.98% | 25.11 TiB | 33.57% |
| f01208189 | Melbourne, Victoria, AU (Anycast Global Backbone) | 37.72 TiB | 0.98% | 19.45 TiB | 48.43% |
| f01157249 | Melbourne, Victoria, AU (Anycast Global Backbone) | 31.42 TiB | 0.81% | 30.55 TiB | 2.78% |
| f01156835 | Melbourne, Victoria, AU (Anycast Global Backbone) | 17.63 TiB | 0.46% | 17.01 TiB | 3.54% |
| f01208154 | Melbourne, Victoria, AU (Anycast Global Backbone) | 13.20 TiB | 0.34% | 13.13 TiB | 0.47% |
| f01156538 | Melbourne, Victoria, AU (Anycast Global Backbone) | 7.63 TiB | 0.20% | 7.63 TiB | 0.00% |
| f022352 | Oslo, Oslo, NO (Blix Solutions AS) | 67.97 TiB | 1.76% | 66.91 TiB | 1.56% |
| f02000937 | Chengdu, Sichuan, CN (China Mobile Communications Group Co., Ltd.) | 384.00 GiB | 0.01% | 384.00 GiB | 0.00% |
| f01972376 (new) | Maywood Park, Oregon, US (Flexential Colorado Corp.) | 962.98 TiB | 24.95% | 962.32 TiB | 0.07% |
| f01972364 (new) | Maywood Park, Oregon, US (Flexential Colorado Corp.) | 926.84 TiB | 24.01% | 926.84 TiB | 0.00% |
| f01952350 | Maywood Park, Oregon, US (Flexential Colorado Corp.) | 238.13 TiB | 6.17% | 236.38 TiB | 0.73% |
| f01944347 | Maywood Park, Oregon, US (Flexential Colorado Corp.) | 226.63 TiB | 5.87% | 226.63 TiB | 0.00% |
| f01199430 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 582.29 TiB | 15.09% | 575.54 TiB | 1.16% |
| f01786387 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 197.76 TiB | 5.12% | 193.13 TiB | 2.34% |
| f01201327 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 134.84 TiB | 3.49% | 134.84 TiB | 0.00% |
| f01937642 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 64.63 TiB | 1.67% | 61.50 TiB | 4.84% |
| f01771403 | Heerhugowaard, North Holland, NL (Wijnand Schouten trading as Speedium) | 28.50 TiB | 0.74% | 28.50 TiB | 0.00% |

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

⚠️ 89.19% of deals are for data replicated across less than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 1.55 PiB | 1.55 PiB | 1 | 41.05% |
| 528.35 TiB | 1.04 PiB | 2 | 27.57% |
| 252.80 TiB | 794.20 TiB | 3 | 20.58% |
| 35.22 TiB | 149.22 TiB | 4 | 3.87% |
| 29.63 TiB | 162.91 TiB | 5 | 4.22% |
| 16.59 TiB | 105.16 TiB | 6 | 2.72% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data, which should not resolve to the same CID.

However, this is possible if all of the clients below use the same software to prepare the exact same dataset, or if they belong to a series of LDN applications for the same dataset.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Approvers |
| --- | --- | --- | --- | --- |
| f1z7jogzx4x42wtilzb4lu6iotlad5rptt2acbzpi | Speedium network | 44.17 TiB | 1,341 | 1 flyworker, 1 kernelogic, 4 MegTei, 2 psh0691, 3 Reiers, 3 s0nik42 |

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

thebeersmith commented 1 year ago

Just addressing the duplicate data on miners f01208803 and f01208189. Over the Xmas period, both these miners had issues while snapping. A small number of files had permission issues (i.e. could not be removed from the file system) and were not cleaned up from the ingest pipeline. While our ingest code normally performs a check to ensure a file has not been previously ingested, this check failed with these files. This has since been sorted.
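
To illustrate the kind of "already ingested" check described above, here is a purely hypothetical sketch. It is not the provider's ingest code: the `IngestLedger` class and its method are invented for illustration, and keying the check on the piece CID rather than on file clean-up is an assumption made here.

```python
class IngestLedger:
    """Hypothetical record of piece CIDs that have already been ingested."""

    def __init__(self):
        self._seen = set()

    def try_ingest(self, piece_cid: str) -> bool:
        """Return True the first time a piece CID is offered, False for repeats."""
        if piece_cid in self._seen:
            return False
        self._seen.add(piece_cid)
        return True


# Piece CIDs taken from the deal outputs earlier in this thread; the first repeats
cid_a = "baga6ea4seaqoq6d6mb2h3rtvc5jnkcwc7cphgs2jxbnvwp3juhkmylvp67u3aey"
cid_b = "baga6ea4seaqnbtfl3u4pncsqjxfwj44onkryu6b3m6bzqvwnv3sz32ytuhe5kli"

ledger = IngestLedger()
offered = [cid_a, cid_b, cid_a]
accepted = [cid for cid in offered if ledger.try_ingest(cid)]
print(f"accepted {len(accepted)} of {len(offered)} pieces")
```

Because the ledger does not depend on the source file being removed, a file left behind by a permission error would still be rejected the second time it is offered.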

herrehesse commented 1 year ago

Thank you for the explanation. Let's check the CID report in a week to see whether the percentages improve.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 7

Multisig Notary address

f02049625

Client address

f1mgnwoczfj25foxn4555wvwyak6rsynzy7z73azy

DataCap allocation requested

10.23GiB

Id

7d8fb6dc-5f8f-432a-b00f-8d3e98ae2558