filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] SXX Future Data - Web3 Publicity Media #1293

Open XindyTan opened 1 year ago

XindyTan commented 1 year ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

SXX Future Data focuses on distributed big-data storage, disaster data protection, encryption algorithm research, encrypted data product applications, and research and development for Web3.0, the new-generation Internet. It provides government, enterprise and individual users with a product and service system built around data value, to meet the ever-expanding demand for mass data storage, management and application.

What is the primary source of funding for this project?

Income from customers and investment from partners; the rest goes to the business.

What other projects/ecosystem stakeholders is this project associated with?

No other stakeholders.

Use-case details

SXX is committed to real data storage and has set up a professional IPFS & Filecoin promotion working group of more than 10 people. We are committed to promoting IPFS and Filecoin in any way we can. The data to be stored this time is content translated into Chinese by SXX, such as official documents and video conferences published by PL or the Filecoin Foundation. SXX promoters also publish a large amount of new video content about IPFS and Filecoin across the major platforms. We are active on WeChat, Toutiao, Baidu, Bilibili, Twitter and other relevant platforms at home and abroad.

Where was the data in this dataset sourced from?

Video data related to promoting IPFS and Filecoin.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, our portfolio is open to users all over the world, and we hope that more people can appreciate our creations.

What is the expected retrieval frequency for this data?

About 10 or more times per year, though this frequency depends on how well later operations go, and we will continue to work hard in this area.

For how long do you plan to keep this dataset stored on Filecoin?

Permanently stored.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

Mainland China, Hong Kong, Singapore.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Online download plus mailing disks offline for data transfer.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

1. Choose reliable and highly reputable storage providers.
2. Select storage providers that can support offline data transfer.
3. Give preference to SPs that support fast retrieval.

How will you be distributing deals across storage providers?

Average distribution.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Our own company has sufficient project funds.
large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

simonkim0515 commented 1 year ago

Datacap Request Trigger

Total DataCap requested


Expected weekly DataCap usage rate


Client address


large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address


Client address


DataCap allocation requested




newwebgroup commented 1 year ago

The client contacted me through Slack about KYC.
1. Could you send an email to ?
2. Can you provide more detailed information about the other storage providers participating in this program? For example, can you list the SPs you have contacted so far?

sxxfuture-official commented 1 year ago

f01959805 f01964215 f01981603 f01741924 f01402814

psh0691 commented 1 year ago

Can you prove the amount of data you currently have? For example, with a screenshot of the data capacity.

XindyTan commented 1 year ago

The data we need to store on chain includes original footage, animated titles, media material, final films, and so on. The data types include video, pictures and office documents, and the total amount is about 600T. We will keep multiple copies of our data (about 6). As our company works on the Web3 ecosystem, there will be more video production in the future, and we will continue to store real data with FIL+. @psh0691


psh0691 commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address




You can check the status of the message here:

newwebgroup commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address




You can check the status of the message here:

XindyTan commented 1 year ago

Dear notaries @psh0691 @newwebgroup , thanks for your help in the first round!

herrehesse commented 1 year ago

Dear Applicant,

Due to the recent increase in erroneous Filecoin+ data, on behalf of the entire community, we feel compelled to look more closely at DataCap requests, to ensure that the overall value of the Filecoin network and the Filecoin+ program increases and that the program is not abused.

Please answer the questions below as comprehensively as possible.

Customer data

We expect that for the onboarding of customers with the scale of an LDN there would have been at least multiple email and perhaps several chat conversations preceding it. A single email with an agreement does not qualify here.

Should this be solely for acquiring DataCap, this is of course out of the question. The customer must have a legitimate reason for wanting to use the Filecoin+ program, which is intended as a program to store useful and public datasets on the network.

(As an intermediate solution Filecoin offers the FIL-E program or the website for business datasets that do not meet the requirements for a Filecoin+ dataset)

Files and Processing

Hopefully you understand the caution the overall community has for onboarding the wrong data. We understand the increased need for Filecoin+, however, we must not allow the program to be misused. Everything depends on a valuable and useful network, let's do our best to make this happen. Together.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address


Client address


DataCap allocation requested




large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address


Client address


Last two approvers

newwebgroup & psh0691

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested


Total DataCap granted for client so far


Datacap to be granted to reach the total amount requested by the client (3.5PiB)



| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 1183 | 4 | 50TiB | 33.81 | 11.28TiB |
filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01771695 | Wuhan, Hubei, CN (CHINA UNICOM China169 Backbone) | 10.09 TiB | 28.87% | 10.09 TiB | 0.00% |
| f01964215 | Shenzhen, Guangdong, CN | 6.13 TiB | 17.52% | 6.13 TiB | 0.00% |
| f01753456 | Hong Kong, Central and Western, HK | 12.50 TiB | 35.75% | 12.50 TiB | 0.00% |
| f01732345 (new) | Hong Kong, Central and Western, HK | 6.25 TiB | 17.87% | 6.25 TiB | 0.00% |
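
The Percentage column appears to be each provider's share of the total sealed deal size. The following is a minimal Python sketch under that assumption (the checker's own implementation is not shown in this thread; `share_percent` is a hypothetical helper). It reproduces the reported figures to within the rounding of the displayed TiB values.

```python
# Sealed deal sizes in TiB, copied from the provider table above.
sealed_tib = {
    "f01771695": 10.09,
    "f01964215": 6.13,
    "f01753456": 12.50,
    "f01732345": 6.25,
}


def share_percent(sizes):
    """Return each provider's share of the total sealed size, in percent."""
    total = sum(sizes.values())
    return {sp: 100.0 * size / total for sp, size in sizes.items()}


shares = share_percent(sealed_tib)
for sp, pct in shares.items():
    print(f"{sp}: {pct:.2f}%")
```

Small differences from the report in the second decimal place are expected, because the report presumably works from exact byte counts rather than rounded TiB values.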

Provider Distribution

Deal Data Replication

The table below shows how each unique piece of data is replicated across storage providers.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Data replication looks healthy.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 1.13 TiB | 1.13 TiB | 1 | 3.22% |
| 3.22 TiB | 6.44 TiB | 2 | 18.41% |
| 5.22 TiB | 15.66 TiB | 3 | 44.77% |
| 2.94 TiB | 11.75 TiB | 4 | 33.60% |
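
The rows of the replication table are consistent with Total Deals Made being the unique data size multiplied by the number of providers, and Deal Percentage being that total's share of all deals. The sketch below checks this arithmetic in Python; that reading of the columns is an assumption, and `replication_summary` is a hypothetical helper, not part of the checker.

```python
# (unique data size in TiB, number of providers holding it), from the table above.
rows = [(1.13, 1), (3.22, 2), (5.22, 3), (2.94, 4)]


def replication_summary(rows):
    """Return (total_tib, percent_of_all_deals) for each replication row."""
    totals = [size * providers for size, providers in rows]
    grand_total = sum(totals)
    return [(total, 100.0 * total / grand_total) for total in totals]


summary = replication_summary(rows)
for (size, providers), (total, pct) in zip(rows, summary):
    print(f"{size:.2f} TiB x {providers} providers = {total:.2f} TiB ({pct:.2f}%)")
```

The results match the report to within rounding of the displayed values (for example, 2.94 x 4 gives 11.76 TiB against the reported 11.75 TiB).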

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data and should not resolve to the same CID.

However, this could be possible if all below clients use same software to prepare for the exact same dataset or they belong to a series of LDN applications for the same dataset.

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

cryptowhizzard commented 1 year ago

Good morning,

I tried retrieving content here. Initially I got some retrievals working; however, it went downhill within a few minutes, and after that I only got "key not found" errors. I'm not sure whether that is something local here or SP-related. Can someone else test retrievals?

lotus client retrieve --provider f01771695 uAXASIGYUEhwHs0YeNPeV4MTjQSYKleElbrfZaJf0zmnkme30
2023-01-14T10:51:40.636Z WARN rpc go-jsonrpc@v0.1.8/client.go:548 unmarshaling failed {"message": "{\"Err\":\"exhausted 5 attempts but failed to open stream, err: failed to dial 12D3KooWGjG3NDWERwRNb87YLZrxBWjhAqxqyKRH8q7pUdhazxVa:\n * [/ip4/] dial tcp4\u003e119.36.32.147:7001: i/o timeout\",\"Root\":null,\"Piece\":null,\"Size\":0,\"MinPrice\":\"\u003cnil\u003e\",\"UnsealPrice\":\"\u003cnil\u003e\",\"PricePerByte\":\"\u003cnil\u003e\",\"PaymentInterval\":0,\"PaymentIntervalIncrease\":0,\"Miner\":\"f01771695\",\"MinerPeer\":{\"Address\":\"f01771695\",\"ID\":\"12D3KooWGjG3NDWERwRNb87YLZrxBWjhAqxqyKRH8q7pUdhazxVa\",\"PieceCID\":null}}"}
ERROR: RPC client error: unmarshaling result: failed to parse big string: '"\u003cnil\u003e"'

sxxfuture-official commented 1 year ago

Yes, we are trying to fix the retrieval problem, and we are working hard to optimize it. There will be frequent software iterative updates in the near future, and satisfactory results will be achieved soon.

cryptowhizzard commented 1 year ago

That is good to hear. Whenever you are ready, please let us know so we can evaluate your LDN again.

sxxfuture-official commented 1 year ago

@cryptowhizzard The retrieval repair work has reached a first milestone and is ready for testing. If any deal cannot be retrieved, please let us know promptly and we will fix it as soon as possible. SPs f01771695 and f01964215 are ready; SPs f01753456 and f01732345 are expected to be ready next week.

lotus client retrieve --provider f01771695 uAXASIGYUEhwHs0YeNPeV4MTjQSYKleElbrfZaJf0zmnkme30
Recv 0 B, Paid 0 FIL, Open (New), 0s
Recv 0 B, Paid 0 FIL, DealProposed (WaitForAcceptance), 3ms
Recv 0 B, Paid 0 FIL, DealAccepted (Accepted), 29ms
Recv 0 B, Paid 0 FIL, PaymentChannelSkip (Ongoing), 29ms
Recv 12.54 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 235ms
Recv 506.6 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 243ms
Recv 518.6 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 244ms
Recv 1.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 257ms
Recv 2.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 266ms
Recv 3.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 280ms
Recv 4.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 292ms
Recv 5.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 300ms
Recv 6.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 307ms
Recv 7.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 318ms
Recv 8.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 330ms
Recv 9.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 338ms
Recv 10.51 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 350ms
Recv 11.51 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 360ms

lotus client retrieve --provider f01964215 uAXASIGYUEhwHs0YeNPeV4MTjQSYKleElbrfZaJf0zmnkme30
Recv 0 B, Paid 0 FIL, Open (New), 0s
Recv 0 B, Paid 0 FIL, DealProposed (WaitForAcceptance), 2ms
Recv 0 B, Paid 0 FIL, DealAccepted (Accepted), 16.666s
Recv 0 B, Paid 0 FIL, PaymentChannelSkip (Ongoing), 16.666s
Recv 12.54 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 17.592s
Recv 506.6 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 19.877s
Recv 518.6 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 20.725s
Recv 1.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 23.262s
Recv 2.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 25.9s
Recv 3.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 27.947s
Recv 4.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 30.255s
Recv 5.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 33.61s
Recv 6.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 36.739s
Recv 7.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 38.535s
Recv 8.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 40.884s
Recv 9.506 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 42.895s
Recv 10.51 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 45.094s
Recv 11.51 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 48.115s
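
For anyone comparing the two providers, the last progress entry of each log gives a rough end-to-end rate. The Python sketch below parses a progress entry in the format shown in the logs above and divides bytes received by elapsed time. The regular expression and the helper name `mib_per_second` are assumptions made for illustration; they are not part of the lotus tooling.

```python
import re

# Matches entries such as:
#   Recv 11.51 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 360ms
PROGRESS = re.compile(
    r"Recv ([\d.]+) (B|KiB|MiB|GiB), Paid [^,]*, [^,]*, ([\d.]+)(ms|s)$"
)

UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}
UNIT_SECONDS = {"ms": 0.001, "s": 1.0}


def mib_per_second(entry):
    """Average rate in MiB/s for one progress entry, measured from t=0."""
    match = PROGRESS.search(entry.strip())
    if match is None:
        raise ValueError(f"unrecognised progress entry: {entry!r}")
    amount, size_unit, elapsed, time_unit = match.groups()
    received = float(amount) * UNIT_BYTES[size_unit]
    seconds = float(elapsed) * UNIT_SECONDS[time_unit]
    if seconds == 0:
        raise ValueError("elapsed time is zero")
    return received / seconds / 1024**2


# Final entries from the two logs above.
fast = mib_per_second("Recv 11.51 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 360ms")
slow = mib_per_second("Recv 11.51 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 48.115s")
print(f"f01771695: {fast:.2f} MiB/s, f01964215: {slow:.3f} MiB/s")
```

Note that the elapsed time includes deal negotiation (about 16.7 s before acceptance for f01964215), so these are end-to-end averages rather than transfer line rates.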

liyunzhi-666 commented 1 year ago


filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01964215 | Shenzhen, Guangdong, CN | 6.13 TiB | 12.90% | 6.13 TiB | 0.00% |
| f01771695 | Hong Kong, Central and Western, HK | 10.09 TiB | 21.26% | 10.09 TiB | 0.00% |
| f01753456 | Hong Kong, Central and Western, HK | 12.50 TiB | 26.33% | 12.50 TiB | 0.00% |
| f01777788 | Hong Kong, Central and Western, HK | 12.50 TiB | 26.33% | 12.50 TiB | 0.00% |
| f01732345 (new) | Hong Kong, Central and Western, HK | 6.25 TiB | 13.17% | 6.25 TiB | 0.00% |

Provider Distribution

Deal Data Replication

The table below shows how each unique piece of data is replicated across storage providers.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Data replication looks healthy.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 1.13 TiB | 2.25 TiB | 2 | 4.74% |
| 3.22 TiB | 9.66 TiB | 3 | 20.34% |
| 5.22 TiB | 20.88 TiB | 4 | 43.98% |
| 2.94 TiB | 14.69 TiB | 5 | 30.94% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data and should not resolve to the same CID.

However, this could be possible if all below clients use same software to prepare for the exact same dataset or they belong to a series of LDN applications for the same dataset.

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

liyunzhi-666 commented 1 year ago

Hi @SXXFuture-Xindy, I tried retrieving after receiving your request in Slack, but so far only one retrieval has succeeded; the remaining three did not.

sxxfuture-official commented 1 year ago
root@lotus-23 : lotus client query-ask f01964215
Ask: f01964215
Price per GiB: 0 FIL
Verified Price per GiB: 0 FIL
Max Piece size: 32 GiB
Min Piece size: 56 KiB

@liyunzhi-666 f01964215 is resolved. SPs f01753456 and f01732345 are expected to be completed this week.

liyunzhi-666 commented 1 year ago

f01964215 still doesn't work for me. When everything is ready, please ping me here.

NDLABS-Leo commented 1 year ago


filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01964215 | Shenzhen, Guangdong, CN | 6.13 TiB | 12.90% | 6.13 TiB | 0.00% |
| f01771695 | Hong Kong, Central and Western, HK | 10.09 TiB | 21.26% | 10.09 TiB | 0.00% |
| f01753456 | Hong Kong, Central and Western, HK | 12.50 TiB | 26.33% | 12.50 TiB | 0.00% |
| f01777788 | Hong Kong, Central and Western, HK | 12.50 TiB | 26.33% | 12.50 TiB | 0.00% |
| f01732345 (new) | Hong Kong, Central and Western, HK | 6.25 TiB | 13.17% | 6.25 TiB | 0.00% |

Provider Distribution

Deal Data Replication

The table below shows how each unique piece of data is replicated across storage providers.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Data replication looks healthy.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 1.13 TiB | 2.25 TiB | 2 | 4.74% |
| 3.22 TiB | 9.66 TiB | 3 | 20.34% |
| 5.22 TiB | 20.88 TiB | 4 | 43.98% |
| 2.94 TiB | 14.69 TiB | 5 | 30.94% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data and should not resolve to the same CID.

However, this could be possible if all below clients use same software to prepare for the exact same dataset or they belong to a series of LDN applications for the same dataset.

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

NDLABS-Leo commented 1 year ago

The connectivity of the nodes has been tested, and it seems to be normal. Please ask community members to perform retrieval tests on the files.

XindyTan commented 1 year ago

Thanks to @NDLABS-OFFICE for testing our project. Next, I will find more notaries to test file retrieval.

herrehesse commented 1 year ago

@NDLABS-OFFICE Can you give me the company names of the nodes above? @sxxfuture-official can you explain why everything is stored in the same location?

sxxfuture-official commented 1 year ago

Let me answer these questions. @NDLABS-OFFICE is just a notary who helped us test. The SPs are from SXXFuture, Toploong and WealegerTec.

The regional distribution is indeed too concentrated at present. We are aware of this, so we have made a specific plan to add SPs in North America and Singapore in the next round of sealing.

NDLABS-Leo commented 1 year ago

@herrehesse The company behind each node cannot be verified. The previously claimed nodes could be found on Filfox, but the company information for the nodes is now hidden.

herrehesse commented 1 year ago

@sxxfuture-official Thanks for your answer. I'd like to see your specified plan! Also can you help me with:

SP | City | Business Name

SXXFuture, Toploong, WealegerTec --> Who are they, and which SPs are theirs?


FilFox data alone is not enough; we need business names and contact information to perform due diligence when things are not in order.

sxxfuture-official commented 1 year ago

@herrehesse Ok, I'll put together the form with the details, and I'll need your signature too.

herrehesse commented 1 year ago

Thank you.

sxxfuture-official commented 1 year ago

@herrehesse I have sent the information to 'Hidde Hoogland' on Slack. Please check it and support our project, thank you.

herrehesse commented 1 year ago

@sxxfuture-official SP business names received in Slack PM. Thank you for the transparency.

Looking forward to retrieval / connection results and if they are OK I am willing to support.

Keep the good work going, thanks for your cooperation to both.

kernelogic commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address




You can check the status of the message here:

kernelogic commented 1 year ago

Based on the communication history above and SXX's presentation in the community, I am willing to support.

herrehesse commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address




You can check the status of the message here:

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 3

Multisig Notary address


Client address


DataCap allocation requested




large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address


Client address


Last two approvers

cryptowhizzard & kernelogic

Rule to calculate the allocation request amount

200% of weekly dc amount requested

DataCap allocation requested


Total DataCap granted for client so far


Datacap to be granted to reach the total amount requested by the client (3.5PiB)



| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 1583 | 5 | 100TiB | 25.27 | 544GiB |
filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

Since this is the 4th allocation, the following restrictions have been relaxed:

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01964215 | Shenzhen, Guangdong, CN | 6.13 TiB | 12.90% | 6.13 TiB | 0.00% |
| f01771695 | Hong Kong, Central and Western, HK | 10.09 TiB | 21.26% | 10.09 TiB | 0.00% |
| f01753456 | Hong Kong, Central and Western, HK | 12.50 TiB | 26.33% | 12.50 TiB | 0.00% |
| f01777788 | Hong Kong, Central and Western, HK | 12.50 TiB | 26.33% | 12.50 TiB | 0.00% |
| f01732345 (new) | Hong Kong, Central and Western, HK | 6.25 TiB | 13.17% | 6.25 TiB | 0.00% |

Provider Distribution

Deal Data Replication

The table below shows how each unique piece of data is replicated across storage providers.

Since this is the 4th allocation, the following restrictions have been relaxed:

✔️ Data replication looks healthy.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 1.13 TiB | 2.25 TiB | 2 | 4.74% |
| 3.22 TiB | 9.66 TiB | 3 | 20.34% |
| 5.22 TiB | 20.88 TiB | 4 | 43.98% |
| 2.94 TiB | 14.69 TiB | 5 | 30.94% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data and should not resolve to the same CID.

However, this could be possible if all below clients use same software to prepare for the exact same dataset or they belong to a series of LDN applications for the same dataset.[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

NiwanDao commented 1 year ago

The report looks good. What is the maximum number of copies you plan to store within a city?

sxxfuture-official commented 1 year ago

@xingjitansuo This round we will add other countries and regions. 80% of the data stored in the same region will be kept below 5 replicas.

Alex11801 commented 1 year ago

The information for this project is very clean and straightforward. I am willing to support it.

Alex11801 commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network




Datacap Allocated


Signer Address




You can check the status of the message here:

herrehesse commented 1 year ago

@sxxfuture-official what do you mean by "80% of the data stored in the same region will be kept below 5 replicas." ?

sxxfuture-official commented 1 year ago

If I removed the words "80% of", can you understand?

herrehesse commented 1 year ago

Yes, but 5 replicas in the same region is normally not something we would accept. We always look for proper distribution across the globe. Why is this different for you?

herrehesse commented 1 year ago

I would suggest something like:

2 x Mainland China
2 x Hong Kong
2 x Singapore

(and VPN users must do their due diligence on proving their location)
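
A per-region cap like the one suggested here can be checked mechanically. The sketch below, in Python, counts providers per region using the locations from the most recent checker report in this thread and flags any region above a cap of 2. Treating each provider as holding one replica, the two-letter region codes, and the helper name `regions_over_cap` are all assumptions made for illustration.

```python
from collections import Counter

# Provider regions as listed in the most recent checker report in this thread.
provider_region = {
    "f01964215": "CN",
    "f01771695": "HK",
    "f01753456": "HK",
    "f01777788": "HK",
    "f01732345": "HK",
}


def regions_over_cap(provider_region, max_per_region):
    """Return {region: provider_count} for regions exceeding the cap."""
    counts = Counter(provider_region.values())
    return {region: n for region, n in counts.items() if n > max_per_region}


# Cap of 2 per region, following the "2 x" suggestion above.
over = regions_over_cap(provider_region, max_per_region=2)
print(over)
```

With the current provider list, only Hong Kong exceeds the cap, which matches the concentration concern raised earlier in the thread.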

sxxfuture-official commented 1 year ago

I mean below 5, not 5