filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] South-Western Institute For Astronomy Research #81

Closed liyunzhi-666 closed 1 year ago

liyunzhi-666 commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

South-Western Institute For Astronomy Research at Yunnan University (SWIFAR-YNU) was founded in September 2017. It serves as a major part of the University’s special discipline zone for astronomy, set up according to the “Yunnan University Development Plan for World-Class University” and “Yunnan University Development Plan for World-Class Astronomy Discipline”.  
By bringing in internationally renowned scientists as well as excellent young researchers, SWIFAR-YNU is dedicated to advancing fundamental research in astronomy. The Institute currently has eleven full-time faculty members, specializing in studies of the interstellar medium, stars and stellar clusters, the Milky Way and nearby galaxies, galaxies and galaxy clusters, the large-scale structure of the universe, and cosmology. The SWIFAR-YNU near-field cosmology research group is the leading scientific organizer and undertaker of the LAMOST Galactic Spectroscopic Surveys, whereas its deep-field cosmology research group spearheads the cosmological applications of weak-lensing effects and is actively engaged in a number of international collaborative projects such as the Euclid, KiDS, CFHT Stripe-82 and VOICE surveys. The group is also a key member of the first national cosmic microwave background project, AliCPT.  
We provide a large amount of astronomical data, including charts, analytical data and pictures. It is a rich resource for research institutions, universities, students and astronomy enthusiasts.

What is the primary source of funding for this project?

Our shareholders will support the project.
The Helmsman team will also support us. They are winners of the Slingshot competition, and their extensive data storage experience will help us.

What other projects/ecosystem stakeholders is this project associated with?

No other projects or ecosystem stakeholders are associated with this project.

Use-case details

Describe the data being stored onto Filecoin

We will upload some astronomical charts, analysis data and pictures to the Filecoin network.

Where was the data in this dataset sourced from?

It mainly comes from some data we have uploaded before. This is the address of the data website:
National astronomical data center:

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

Of course, like this link on the website:

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, this is a public website, and the data resources on it are public datasets. Anyone can access and download them.

What is the expected retrieval frequency for this data?

Once a month.

For how long do you plan to keep this dataset stored on Filecoin?

It will be permanent archival storage.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

We currently plan to make deals in China, but we do not rule out other Asian countries in the future.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

We mainly plan to use online storage deals. If a storage provider strongly recommends it, we do not rule out offline data transfer.
We expect to store 300 TiB per week.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

A storage provider with good credit will be our first consideration.
Secondly, we will also consider the location of the storage provider, transaction success rate, and network speed.

How will you be distributing deals across storage providers?

We will distribute deals equally.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have.
Reputable storage providers in the community can work with us.

large-datacap-requests[bot] commented 2 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

galen-mcandrew commented 2 years ago

Multisig Notary requested

Total DataCap requested


Expected weekly DataCap usage rate


large-datacap-requests[bot] commented 2 years ago

**Multisig created and sent to RKH** f01422999

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Multisig Notary address


Client address


DataCap allocation requested


liyunzhi-666 commented 2 years ago

It's been almost a month. There's no next audit yet. @galen-mcandrew

MegTei commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address


You can check the status of the message here:

AthSmith commented 2 years ago

For a scientific research dataset, it is better to obtain authorization from the Chinese government.

liyunzhi-666 commented 2 years ago

> For a scientific research dataset, it is better to obtain authorization from the Chinese government.

Why does a publicly accessible and downloadable data set require the consent of the Chinese government?

AthSmith commented 2 years ago

Scientific research datasets are sensitive and shared only conditionally in China. You may accidentally break the law if you don't get permission from the local government.

liyunzhi-666 commented 2 years ago

> Scientific research datasets are sensitive and shared only conditionally in China. You may accidentally break the law if you don't get permission from the local government.

The astronomical data mentioned in the application comes from the website named in the application, where the data source and usage are clearly stated. The website also clearly states that the relevant astronomical data can be downloaded and used. I think you are raising unreasonable questions without understanding the situation. Please review the data usage terms on the website before asking further questions.

ozhtdong commented 2 years ago

For well-known reasons, we don't recommend using any dataset belonging to the Chinese Government or to Chinese scientific research, whether or not it is public in China.

dannyob commented 2 years ago

@liyunzhi-666 do you have a page or quote from the NADC website that conveys an open copyright license for the data? That would be very useful!

s0nik42 commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address


You can check the status of the message here:

mr-spaghetti-code commented 2 years ago


It looks like you have received 150 TiB of DataCap to date but have spent less than 20% of it so far.

We would love to understand if there's anything holding you back. We are working hard to make the data onboarding process easier for clients like you and your feedback is very valuable. If you have a moment, please fill in this survey:

If you have any feedback or would like to consult with an expert, please let me know.


João Fiadeiro, Product Manager, Large Data Client Onboarding, Protocol Labs

metagates-dev commented 2 years ago

Hi @mr-spaghetti-code,
@liyunzhi-666 is our former colleague, and I am now in charge of this application. During the first two weeks we had problems with packaging, which caused a large number of failed orders. Over the past two days we have been testing the related functions, and packaging will continue once the tests succeed.

filplus-checker commented 1 year ago

DataCap and CID Checker Report[^1]

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

⚠️ f01611281 has sealed 65.73% of total DataCap.

⚠️ 35.62% of total deals sealed by f01611281 are duplicate data.

⚠️ f01828679 has sealed 34.27% of total DataCap.

⚠️ 30.43% of total deals sealed by f01828679 are duplicate data.

⚠️ All storage providers are located in the same region.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01611281 | Hong Kong, Central and Western, HK | 39.02 TiB | 65.73% | 25.12 TiB | 35.62% |
| f01828679 (new) | Hong Kong, Central and Western, HK | 20.34 TiB | 34.27% | 14.15 TiB | 30.43% |
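
The percentages in the provider table can be checked by hand. The following is a minimal sketch, not the checker's actual implementation: it assumes that "Percentage" is each provider's share of all sealed deals and that "Duplicate Deals" is the non-unique fraction of that provider's own sealed total. The `providers` dictionary and the `summarize` helper are hypothetical names introduced for illustration; the TiB figures are copied from the table.

```python
# TiB figures copied from the provider table in the checker report.
providers = {
    "f01611281": {"total": 39.02, "unique": 25.12},
    "f01828679": {"total": 20.34, "unique": 14.15},
}


def summarize(providers):
    """Return {provider: (share_pct, duplicate_pct)}, rounded to two decimals.

    Assumption: share = provider total / sum of all totals, and
    duplicate = (total - unique) / total for that provider.
    """
    grand_total = sum(p["total"] for p in providers.values())
    report = {}
    for name, p in providers.items():
        share = 100 * p["total"] / grand_total
        duplicate = 100 * (p["total"] - p["unique"]) / p["total"]
        report[name] = (round(share, 2), round(duplicate, 2))
    return report


if __name__ == "__main__":
    for name, (share, dup) in summarize(providers).items():
        print(f"{name}: {share:.2f}% of sealed deals, {dup:.2f}% duplicate")
```

Under these assumptions the sketch reproduces the reported 65.73% / 35.62% and 34.27% / 30.43% figures, which suggests the warnings are derived directly from the table's TiB columns.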

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

⚠️ 100.00% of deals are for data replicated across fewer than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 11.72 TiB | 14.66 TiB | 1 | 24.69% |
| 13.78 TiB | 44.70 TiB | 2 | 75.31% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually different applications own different data, which should not resolve to the same CID.

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

Since this is the 2nd allocation, the following restrictions have been relaxed:

⚠️ 35.62% of total deals sealed by f01611281 are duplicate data.

⚠️ 30.43% of total deals sealed by f01828679 are duplicate data.

⚠️ All storage providers are located in the same region.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01611281 | Hong Kong, Central and Western, HK | 39.02 TiB | 65.73% | 25.12 TiB | 35.62% |
| f01828679 (new) | Hong Kong, Central and Western, HK | 20.34 TiB | 34.27% | 14.15 TiB | 30.43% |

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

Since this is the 2nd allocation, the following restrictions have been relaxed:

✔️ Data replication looks healthy.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 11.72 TiB | 14.66 TiB | 1 | 24.69% |
| 13.78 TiB | 44.70 TiB | 2 | 75.31% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually different applications own different data, which should not resolve to the same CID.

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

data-programs commented 1 year ago

This user’s identity has been verified through