filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] <Organization> - <Project Name> #2032

Closed chenkun1223 closed 11 months ago

chenkun1223 commented 1 year ago

Data Owner Name

Protein Data Bank 3D Structural Biology Data

What is your role related to the dataset

Storage provider filling out application on behalf of the data owner

Data Owner Country/Region

United States

Data Owner Industry

Information, Media & Telecommunications


Social Media


Total amount of DataCap being requested


Expected size of single dataset (one copy)


Number of replicas to store


Weekly allocation of DataCap requested


On-chain address for first allocation


Data Type of Application


Custom multisig


No response

Share a brief history of your project and organization


Is this project associated with other projects/ecosystem stakeholders?


If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

The "Protein Data Bank (PDB) archive" was established in 1971 as the first open-access digital data archive in biology. It is a collection of three-dimensional (3D) atomic-level structures of biological macromolecules (i.e., proteins, DNA, and RNA) and their complexes with one another and various small-molecule ligands (e.g., US FDA approved drugs, enzyme co-factors). For each PDB entry (unique identifier: 1abc or PDB_0000001abc) multiple data files contain information about the 3D atomic coordinates, sequences of biological macromolecules, information about any small molecules/ligands present in the entry, details about the structure-determination experiment, authors and publication information, experimental data, and the wwPDB validation report. Additional content stored in the archive includes documentation, summary reports, and software (among others). The PDB is a jointly-managed core archive of the Worldwide Protein Data Bank partnership [RCSB Protein Data Bank (RCSB PDB); Protein Data Bank in Europe (PDBe); Protein Data Bank Japan (PDBj); Electron Microscopy Data Bank (EMDB); and Biological Magnetic Resonance Bank (BMRB)].

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset


If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

aws s3 ls --no-sign-request s3://pdbsnapshots/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data


For how long do you plan to keep this dataset stored on Filecoin

1.5 to 2 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, North America

How will you be distributing your data to storage providers

HTTP or FTP server

How do you plan to choose storage providers

Slack, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

f01172521  (owner)

How do you plan to make deals to your storage providers


If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline


large-datacap-requests[bot] commented 1 year ago

Thanks for your request! :exclamation: We have found some problems in the information provided.

chenkun1223 commented 1 year ago

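Thanks for your request! ❗ We have found some problems in the information provided.

  • The address should start with f1, f2, f3, or f4. Please review the request and edit the body of the issue, providing all the required information.

Updated: f1nn3reuxn3pbwjynoainndamja2o46nan5ch7hlq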
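The bot's rule (the address should start with f1, f2, f3, or f4) can be sketched as a simple prefix check. This is only an illustration, not the bot's actual code: the function name `has_valid_prefix` is hypothetical, and a prefix match alone does not prove that an address is well formed or exists on chain.

```python
# Prefixes the bot message says an on-chain address should start with.
VALID_PREFIXES = ("f1", "f2", "f3", "f4")


def has_valid_prefix(address: str) -> bool:
    """Return True if the address starts with one of the accepted prefixes.

    Hypothetical helper: it checks the prefix only, not the checksum
    or the rest of the address encoding.
    """
    return address.strip().startswith(VALID_PREFIXES)


# The corrected address from this comment passes the prefix check.
print(has_valid_prefix("f1nn3reuxn3pbwjynoainndamja2o46nan5ch7hlq"))
# An ID-style address such as a storage provider ID does not.
print(has_valid_prefix("f01172521"))
```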

Sunnyiscoming commented 1 year ago

Expected size of single dataset (one copy) 32GB

How much data is in this dataset? Please modify this value.

f01172521 f01843178 f01877259 f02129771

Which are your nodes?

chenkun1223 commented 1 year ago

Expected size of single dataset (one copy) 32TB

How much data is in this dataset? Please modify this value.

f01172521 (owner) f01843178 f01877259 f02129771

chenkun1223 commented 1 year ago


Sunnyiscoming commented 1 year ago

Expected size of single dataset (one copy) × Number of replicas to store ≠ Total amount of DataCap being requested: 30.75TB × 10 ≠ 3PiB

Can you explain about that?
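The consistency check raised here is plain arithmetic: single-copy size times replica count should roughly equal the total requested. A minimal sketch follows, assuming all figures are in binary units (1 PiB = 1024 TiB) even though the thread mixes TB and TiB. The function names and the 5% tolerance are hypothetical, chosen only for illustration.

```python
TIB_PER_PIB = 1024  # binary units assumed; the thread mixes TB and TiB


def expected_total_tib(single_copy_tib: float, replicas: int) -> float:
    """Total DataCap implied by one copy's size and the replica count."""
    return single_copy_tib * replicas


def is_consistent(single_copy_tib: float, replicas: int,
                  requested_tib: float, tolerance: float = 0.05) -> bool:
    """True if the requested total is within `tolerance` of size * replicas.

    The 5% default tolerance is an assumption, not a Fil+ rule.
    """
    expected = expected_total_tib(single_copy_tib, replicas)
    return abs(requested_tib - expected) <= tolerance * expected


# Original request: 30.75 TiB per copy, 10 replicas, 3 PiB total.
print(expected_total_tib(30.75, 10))
print(is_consistent(30.75, 10, 3 * TIB_PER_PIB))
# Corrected total of 300 TB given later in the thread.
print(is_consistent(30.75, 10, 300))
```

With these inputs the implied total is about a tenth of the 3 PiB originally requested, which is the gap the reviewer is asking about.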

chenkun1223 commented 1 year ago

Expected size of single dataset (one copy) × Number of replicas to store ≠ Total amount of DataCap being requested: 30.75TB × 10 ≠ 3PiB

Can you explain about that?

Our calculation error has been corrected

Total amount of DataCap being requested 300TB

Sunnyiscoming commented 1 year ago

Can you introduce your organization? Could you send an email to with your official domain in order to confirm your identity? The email subject should include the issue ID #2032.

chenkun1223 commented 1 year ago

Send an email to

Sunnyiscoming commented 1 year ago

Please disclose the name of your organization and use your official domain to send the message.

chenkun1223 commented 1 year ago


Hello, we are a small team focused on Filecoin technology research, development, and services. We have not established a company or organization. We have managed some nodes and have also collaborated with individuals, teams, and small companies. We urgently need this batch of 30TB of data for node encapsulation. We also hope to have more professional data of this kind in the future, and we can provide external services such as data download and preview. I look forward to Filecoin's data ecosystem becoming better and better.

Sunnyiscoming commented 1 year ago

What percentage of datacap will your nodes store?

chenkun1223 commented 1 year ago


My node will store 10% of the upper limit data, and then exchange and cooperate with other FIL storage providers to store all the data

large-datacap-requests[bot] commented 1 year ago

Deleting comment

@Sunnyiscoming hasn't the permissions to post this comment.

Please, contact the assignee of this issue.

Sunnyiscoming commented 1 year ago

Datacap Request Trigger

Total DataCap requested


Expected weekly DataCap usage rate


Client address


large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address


Client address


DataCap allocation requested




github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

chenkun1223 commented 1 year ago

We are still preparing

AlanGreaterheat commented 1 year ago

Would love to see small to medium-sized SPs storing public datasets. I am willing to support this as the first round, and I will continue to focus on data dispersion and fast retrieval support.

sxxfuture-official commented 1 year ago

@chenkun1223 Please tell me the name of the organization applying for the current LDN, and the corresponding website. Also, what is your relationship with the agency?

AlanGreaterheat commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address




You can check the status of the message here:

chenkun1223 commented 1 year ago


Thank you very much. We will continue to work hard and hope that the Filecoin ecosystem will become better and better.

chenkun1223 commented 1 year ago

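Please tell me the name of the organization applying for the current LDN, and the corresponding website. Also, what is your relationship with the agency?

Hello, our organization is Handian Supercomputing Data, and this is our official website. I am the operations engineer of the company.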

sxxfuture-official commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address




You can check the status of the message here:

cryptowhizzard commented 1 year ago


chenkun1223 commented 11 months ago


large-datacap-requests[bot] commented 11 months ago

DataCap Allocation requested

Request number 2

Multisig Notary address


Client address


DataCap allocation requested




large-datacap-requests[bot] commented 11 months ago

Stats & Info for DataCap Allocation

Multisig Notary address


Client address


Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested


Total DataCap granted for client so far


Datacap to be granted to reach the total amount requested by the client (300TB)



| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 0 | 0 | 13.64TiB | NaN | 13.64TiB |

w1259980480 commented 11 months ago


filplus-checker-app[bot] commented 11 months ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 70% of total datacap - f02129771: 100.00%

⚠️ 1 storage providers have unknown IP location - f02129771

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.
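The two distribution warnings in this report (one provider sealing more than 70% of the DataCap, and deals whose data is replicated across fewer than 3 providers) can be illustrated with a small calculation. This is a sketch under assumptions, not the checker's actual implementation: the deal tuple layout `(piece_id, provider_id, size)`, the function names, and the sample piece IDs and sizes are all hypothetical. Only the provider ID and the 70% and 3-provider thresholds come from the report.

```python
from collections import defaultdict


def top_provider_share(deals):
    """Return (provider_id, share) for the provider holding the most data.

    `deals` is a non-empty list of (piece_id, provider_id, size) tuples.
    """
    totals = defaultdict(int)
    for _piece, provider, size in deals:
        totals[provider] += size
    provider = max(totals, key=totals.get)
    return provider, totals[provider] / sum(totals.values())


def low_replication_share(deals, min_providers=3):
    """Fraction of deal size whose piece is held by fewer than `min_providers`."""
    providers_by_piece = defaultdict(set)
    for piece, provider, _size in deals:
        providers_by_piece[piece].add(provider)
    total = sum(size for _piece, _provider, size in deals)
    low = sum(size for piece, _provider, size in deals
              if len(providers_by_piece[piece]) < min_providers)
    return low / total


# Hypothetical sample: every deal sits with a single provider.
deals = [("piece-a", "f02129771", 32), ("piece-b", "f02129771", 32)]
provider, share = top_provider_share(deals)
print(provider, share > 0.70)        # flags the 70% warning
print(low_replication_share(deals))  # share of under-replicated data
```

When every deal goes to one provider, as in the sample, both metrics come out at 100%, matching the pattern the report flags.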

joshua-ne commented 11 months ago


filplus-checker-app[bot] commented 11 months ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 70% of total datacap - f02129771: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.

zcfil commented 11 months ago


filplus-checker-app[bot] commented 11 months ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 70% of total datacap - f02129771: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.

github-actions[bot] commented 11 months ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 11 months ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

aggregation-and-compliance-bot[bot] commented 7 months ago

Client f02208521 does not follow the datacap usage rules. More info here. This application has been failing the requirements for 7 days. Please take appropriate action to fix the following DataCap usage problems.

| Criteria | Threshold | Reason |
| --- | --- | --- |
| Percent of used DataCap stored with top provider | < 75 | The percent of data from the client that is stored with their top provider is 100%. This should be less than 75%. |