filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] <Jiaxing Yangtze Delta Region Blockchain Technology Research Institute> - <Stan> #523

Closed roookie0823 closed 1 year ago

roookie0823 commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Jiaxing Yangtze Delta Region Blockchain Technology Research Institute was established by the team of academician Whitfield Diffie, a Turing Award winner, jointly with Yangtze Delta Region Institute of Tsinghua University, Zhejiang and Jiaxing Science and Technology City.
In response to the carbon peaking and carbon neutrality goals, the institute has developed and launched the “Stan” app, which supports an industry-chain business integrating carbon information services, carbon trading information services, carbon sink development and financial consulting. “Stan” aggregates market prices to help users follow real-time carbon market trends and allocate their carbon assets scientifically; it selects high-quality industry knowledge, digs deeply into industry information and integrates timely and comprehensive policy updates so that users can quickly grasp the essentials of the industry; and it provides a premium aggregation service that quickly matches buyers and sellers of carbon assets and reduces their transaction costs.

What is the primary source of funding for this project?

The funding comes from the Institute itself, and is obtained by undertaking research projects in the field of carbon peaking and carbon neutrality, writing industry research reports, and providing technical consultants for carbon related trading platforms.

What other projects/ecosystem stakeholders is this project associated with?

None. We are the only entity engaged in this project.

Use-case details

Describe the data being stored onto Filecoin

The platform data mainly includes policy documents in carbon peaking and carbon neutrality field, carbon sink industry report, real-time trading price of carbon asset, market trend, historical trading data, etc.

Where was the data in this dataset sourced from?

The raw data comes directly from the institutions that have these data or the mirrors they support, such as policy-issuing departments, reputable news organizations, scientific research institutions in the field of carbon sink, trading platforms of major exchanges, etc.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

https://app.stanwap.com/mobile/Share/articleDetail/1556
https://app.stanwap.com/mobile/Share/articleDetail/1557
https://github.com/roookie0823/-.git

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

We confirm that the data is publicly available

What is the expected retrieval frequency for this data?

We will be sampling retrievability weekly. Based on the "carbon peak, carbon neutral" trend, it is expected that the demand for these data from researchers and developers will be on the rise in the coming years.
Therefore, it is expected that the data will be retrieved somewhat frequently, 1 - 2 times per month.

For how long do you plan to keep this dataset stored on Filecoin?

Forever. We want to be able to store it all the time.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

China

How will you be distributing your data to storage providers? Is there an offline data transfer process?

The data is currently stored in our institute’s private cloud. Because the amount of data is very large, we would like to transfer it offline.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

First, we will ensure that the content is distributed in as decentralized a way as possible. It will be stored on at least 8 SP nodes. The more decentralized the storage, the more secure our data will be.
Second, we will require SPs to provide fast retrieval capabilities and to have the work reviewed once storage is complete.

How will you be distributing deals across storage providers?

We will do our best to find service providers, prioritizing those that are high-quality, stable and secure, and then allocate deals among them according to the allocation rules.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, the institute currently undertakes a number of subject studies and industry research reports and has sufficient funding to start making deals.
large-datacap-requests[bot] commented 2 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Fenbushi-Filecoin commented 2 years ago

We would like to support the application. Could you provide us with more details on the dataset?

roookie0823 commented 2 years ago

We would like to support the application. Could you provide us with more details on the dataset?

The platform data mainly includes policy documents in the field of carbon peaking and carbon neutrality, carbon sink industry research reports, real-time carbon trading prices, market trends, historical trading data, etc.

The data sources consist of the following three parts:

  1. As a research institute, we use crawler software to aggregate national and local carbon-related data, including policy documents, news events and videos (about 150 TB of data), to support our research in the field of carbon neutrality and to prepare feasibility reports for local governments.
  2. As a carbon trading broker, we will build an "Alibaba"-style platform for carbon trading, for which we need to review the corporate information, qualification documents, corporate videos and trading logs of both sides (about 100 TB of data), to help companies complete trades in a safe and credible manner.
  3. We will build an analysis and judgment platform for carbon trading, for which we need to download open source data from research institutions such as the State Forestry Administration, Tsinghua University, Central University of Finance and Economics, QuantData and other institutions that we cooperate with; information on 1.4 million enterprises (an average of about 100 MB per enterprise, plus 20 TB of open source data from institutions, about 150 TB in total); and 100 urban forestry greening distribution maps and other information (about 100 TB of data in total), to build a carbon trading model for analyzing carbon emissions and carbon trading trends.

In total we have about 500 TB of data and will send 10 copies to be stored on Filecoin.
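The size estimate above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the quoted figures are decimal terabytes (TB) and converts the result to binary pebibytes (PiB), the unit family DataCap is quoted in elsewhere in this thread. The dictionary labels are shorthand for the three parts listed above, not names used by the applicant.

```python
# Figures quoted in the comment above, assumed to be decimal terabytes (TB).
parts_tb = {
    "crawled policy documents, news and videos": 150,
    "trading-platform enterprise records": 100,
    "enterprise information plus institutional open data": 150,
    "urban forestry maps and other information": 100,
}

unique_tb = sum(parts_tb.values())
copies = 10
stored_tb = unique_tb * copies

# Convert decimal TB to binary PiB.
TB = 10**12
PIB = 2**50
stored_pib = stored_tb * TB / PIB

# Cross-check of part 3: 1.4 million enterprises at about 100 MB each.
enterprise_tb = 1_400_000 * 100 / 1_000_000  # MB -> TB

print(f"unique data: {unique_tb} TB")
print(f"with {copies} copies: {stored_tb} TB = {stored_pib:.2f} PiB")
print(f"enterprise records alone: {enterprise_tb:.0f} TB")
```

Under these assumptions the four quoted sizes add up to the stated 500 TB, and ten copies come to 5,000 TB, or roughly 4.4 PiB.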

raghavrmadya commented 2 years ago

@sunnyiscoming please take a look

Sunnyiscoming commented 2 years ago

1. As you mentioned above, you have three parts of data, but the only samples provided are a few articles related to the first part. Can you provide more data samples for the following two parts?

As a carbon trading broker, we will build the "Alibaba" platform in the field of carbon trading, during which we need to review the corporate information, qualification documents, corporate video and trading log of both sides (about 100 TB of data), to help companies complete trading in a safe and credible manner. We will build an analysis and judgment platform for carbon trading, during which we need to download open source data from research institutions, such as State Forestry Administration, Tsinghua University, Central University of Finance and Economics, QuantData and other institutions that we cooperate with, 1.4 million enterprises' information (Average of about 100MB per enterprise, plus 20TB of open source data from institutions, a total of about 150TB), 100 urban forestry greening distribution maps and other information (about 100 TB of data in total), to build a carbon trading model for analysis of carbon emissions and carbon trading trends.

  2. You said you will be sampling retrievability weekly and that, based on the "carbon peak, carbon neutral" trend, demand for these data from researchers and developers is expected to rise in the coming years, so the data will be retrieved somewhat frequently, 1 - 2 times per month. Have you chosen a platform for this purpose?
  3. Can you provide more detailed information about storage provider distribution, for example a list of the SPs you have contacted so far?

raghavrmadya commented 2 years ago

@roookie0823 , please provide answers by next Friday to keep this application active.

Thanks. Also please send an email stating you have applied for LDN using an official email plus a business license to filplus-app-review@fil.org

Thanks.

roookie0823 commented 2 years ago

@raghavrmadya We have sent the LDN application email with the official email address and attached the business license, please check it.

@Sunnyiscoming The data we store is all taken from publicly available information, here are the answers to those three questions:

  1. Data Samples

Enterprise Information in the Trading Platform

Carbon Peaking and Carbon Neutrality Analysis Information

  2. Data Platform: The following are some of the data source platforms we have chosen: Ministry of Ecology and Environment of the People's Republic of China, National Energy Administration, Xinhuanet, National Development and Reform Commission, and Hainan Green Finance Institute.

  3. Contacted SPs: The following are the SPs we have contacted so far: f0123535, f0521831, f01160668, f0179096, f01319368, f01203111

Sunnyiscoming commented 2 years ago

According to https://filrep.io/, these storage providers are unreachable. Why did you choose them?

roookie0823 commented 2 years ago

@Sunnyiscoming

The SPs we contacted offline earlier inadvertently gave us some problematic IDs. We got back in touch with them, and here are the new IDs they gave us: f0149768, f024014, f096974, f098706, f0107995, f020385

We haven't finalized our list of SPs yet, but the SPs we contact in the future should all be of the same type.

raghavrmadya commented 2 years ago

Can you provide more information on your relationship with Tsinghua? I cannot see any mention of you here - https://www.tsinghua-zj.edu.cn/

roookie0823 commented 2 years ago

@raghavrmadya There are some problems with the official website 'https://www.tsinghua-zj.edu.cn/', so please open this link instead: 'https://www.tsinghua-zj.edu.cn/technology/professional/qu-kuai-lian-ji-shu-yan-jiu-yuan-3'. We have also recorded a video about it.

Please download the archive at the link below to view more supporting documents that explain our relationship with Tsinghua, including the contract for the establishment of the Institute, the document for the name change of the Institute and the video recording of the official website. https://datacap.oss-cn-hangzhou.aliyuncs.com/Supporting%20documents%20for%20our%20relationship%20with%20Tsinghua.rar

Sunnyiscoming commented 2 years ago

@raghavrmadya The information provided confirms that Jiaxing Jiahe Blockchain Technology Research Institute was permitted to change its name to Jiaxing Yangtze Delta Region Blockchain Technology Research Institute, and that one of the founding parties of Jiaxing Yangtze Delta Region Blockchain Technology Research Institute is the Yangtze Delta Region Institute of Tsinghua University, Zhejiang.

raghavrmadya commented 2 years ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

100TiB

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

DataCap allocation requested

50TiB

Fenbushi-Filecoin commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacedbioj3cept6o42dbzzh3ra5mke2qr6npsxotnw2turnidvesie4o

Address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Datacap Allocated

50.00TiB

Signer Address

f1yqydpmqb5en262jpottko2kd65msajax7fi4rmq

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedbioj3cept6o42dbzzh3ra5mke2qr6npsxotnw2turnidvesie4o

xinaxu commented 2 years ago

I can support the 1st tranche.

xinaxu commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedrltbv4hstmk5l3xraasxwvig6w6oqa3atpqw3lmnzp457p2ytx2

Address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Datacap Allocated

50.00TiB

Signer Address

f1k3ysofkrrmqcot6fkx4wnezpczlltpirmrpsgui

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedrltbv4hstmk5l3xraasxwvig6w6oqa3atpqw3lmnzp457p2ytx2

BDE-io commented 2 years ago

@roookie0823 Hi! Great to see you have gotten approval for DataCap. If you are looking for storage providers to store these data or have any questions, please visit #bigdata-exchange on Filecoin Slack or reply here.

We have strong demand from a diverse group of SPs, who are actively looking to onboard more data.

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

DataCap allocation requested

100TiB

large-datacap-requests[bot] commented 2 years ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Last two approvers

xinaxu & Fenbushi-Filecoin

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

100TiB

Total DataCap granted for client so far

50TiB

Datacap to be granted to reach the total amount requested by the client (5 PiB)

4.95PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 611 | 3 | 50TiB | 51.88 | 9.15TiB |

flyworker commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecmw5jnrjxkhkg2pkpeen746ozhtcsqnfqipg5p53ybn3fp5l5s6s

Address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Datacap Allocated

100.00TiB

Signer Address

f1hlubjsdkv4wmsdadihloxgwrz3j3ernf6i3cbpy

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecmw5jnrjxkhkg2pkpeen746ozhtcsqnfqipg5p53ybn3fp5l5s6s

kernelogic commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedwsitjjtfbpy7vp25u27lrdsswi2yrp6czuw6wz6vd44bv3yxb3a

Address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Datacap Allocated

100.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedwsitjjtfbpy7vp25u27lrdsswi2yrp6czuw6wz6vd44bv3yxb3a

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Request number 3

Multisig Notary address

f02049625

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

DataCap allocation requested

200TiB

large-datacap-requests[bot] commented 2 years ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Last two approvers

kernelogic & flyworker

Rule to calculate the allocation request amount

200% of weekly dc amount requested

DataCap allocation requested

200TiB

Total DataCap granted for client so far

150TiB

Datacap to be granted to reach the total amount requested by the client (5 PiB)

4.85PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 3835 | 5 | 100TiB | 25.11 | 23.84TiB |

Fenbushi-Filecoin commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebibueccn4et6xdpg5rdv3bk34msczehmupn7ashrrh2nll5hpgrw

Address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Datacap Allocated

200.00TiB

Signer Address

f1yqydpmqb5en262jpottko2kd65msajax7fi4rmq

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebibueccn4et6xdpg5rdv3bk34msczehmupn7ashrrh2nll5hpgrw

NDLABS-Leo commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecv7gladvrzyvk2iis4ce54o4gz72ujsdryi3efxc7xes7dnnop3y

Address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Datacap Allocated

200.00TiB

Signer Address

f1yayfsv6whu3rheviucvventj3y6t542xfpb47ei

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecv7gladvrzyvk2iis4ce54o4gz72ujsdryi3efxc7xes7dnnop3y

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Request number 4

Multisig Notary address

f02049625

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

DataCap allocation requested

400TiB

large-datacap-requests[bot] commented 2 years ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Last two approvers

not found & Fenbushi-Filecoin

Rule to calculate the allocation request amount

400% of weekly dc amount requested

DataCap allocation requested

400TiB

Total DataCap granted for client so far

350TiB

Datacap to be granted to reach the total amount requested by the client (5 PiB)

4.65PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 9651 | 6 | 200TiB | 23.73 | 11.21TiB |

kernelogic commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceawm4rjj3d5hirrzrwsvm6dfij56lichhx7sk23iiiyz2m5po5tys

Address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Datacap Allocated

400.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceawm4rjj3d5hirrzrwsvm6dfij56lichhx7sk23iiiyz2m5po5tys

psh0691 commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacea4p7yonzwe4lnwsmupxozlyl22wtb7qqqrpkr4te7s5efes4ly4k

Address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Datacap Allocated

400.00TiB

Signer Address

f1qdko4jg25vo35qmyvcrw4ak4fmuu3f5rif2kc7i

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacea4p7yonzwe4lnwsmupxozlyl22wtb7qqqrpkr4te7s5efes4ly4k

cbtan21 commented 1 year ago

@roookie0823 Hi, while doing a routine check of on-chain deal status for LDN #391, which was transacted on BDE, we noticed that the Filecoin address associated with this LDN has re-proposed the same pieceCIDs that were previously proposed by #391. I would like to seek clarification on this. Thanks.

cc @dkkapur @raghavrmadya Re-proposed pieceCIDs.csv

raghavrmadya commented 1 year ago

@roookie0823, please respond to @cbtan21's comment in order to proceed. Until then, notaries are requested to stop signing further allocations.

roookie0823 commented 1 year ago

@raghavrmadya @cbtan21 Hi, we take this matter very seriously. After an internal investigation, we found that an employee, through carelessness, mistakenly sent another dataset that we had previously stored; we have many datasets stored. We have made internal corrections, optimized our workflow and trained our staff to prevent such cases from happening in the future.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 5

Multisig Notary address

f02049625

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

DataCap allocation requested

800TiB

Id

f1e1a4ae-dbea-43cb-a811-fa0963094872

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Last two approvers

psh0691 & kernelogic

Rule to calculate the allocation request amount

800% of weekly dc amount requested

DataCap allocation requested

800TiB

Total DataCap granted for client so far

750TiB

Datacap to be granted to reach the total amount requested by the client (5 PiB)

4.26PiB
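The "remaining" figures the bot reports across this thread can be reproduced with simple arithmetic. The sketch below is an illustration only: it assumes 1 PiB = 1024 TiB, and it assumes the bot truncates rather than rounds to two decimals. Both assumptions are inferred from the numbers in this thread, not taken from any documentation of the bot.

```python
import math

TIB_PER_PIB = 1024                      # assumption: binary units throughout
total_requested_tib = 5 * TIB_PER_PIB   # 5 PiB requested in total
tranches_tib = [50, 100, 200, 400]      # allocations granted so far in this thread


def truncate2(x: float) -> float:
    """Truncate (not round) to two decimal places."""
    return math.floor(x * 100) / 100


granted_tib = 0
remaining_pib = []
for tranche in tranches_tib:
    granted_tib += tranche
    remaining = (total_requested_tib - granted_tib) / TIB_PER_PIB
    remaining_pib.append(truncate2(remaining))
    print(f"granted {granted_tib:>4} TiB -> remaining {remaining_pib[-1]:.2f} PiB")
```

With 750 TiB granted, the untruncated value is about 4.268 PiB, which is consistent with the 4.26PiB shown above only if the figure is truncated.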

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 20948 | 10 | 400TiB | 16.85 | 98.81TiB |

Fenbushi-Filecoin commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceadqevrauhk4x222xfo3e6dunbrc6ftrh2mmir3jzadlvvwb7emty

Address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Datacap Allocated

800.00TiB

Signer Address

f1yqydpmqb5en262jpottko2kd65msajax7fi4rmq

Id

f1e1a4ae-dbea-43cb-a811-fa0963094872

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceadqevrauhk4x222xfo3e6dunbrc6ftrh2mmir3jzadlvvwb7emty

filplus-checker commented 1 year ago

DataCap and CID Checker Report[^1]

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

⚠️ f01082888 has unknown IP location.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01082888 | Unknown | 142.56 TiB | 19.05% | 142.56 TiB | 0.00% |
| f0663311 (new) | Shanghai, Shanghai, CN | 141.84 TiB | 18.95% | 141.84 TiB | 0.00% |
| f01926686 (new) | Hangzhou, Zhejiang, CN | 135.44 TiB | 18.09% | 135.16 TiB | 0.21% |
| f0114153 (new) | Zhongshan, Guangdong, CN | 98.56 TiB | 13.17% | 98.53 TiB | 0.03% |
| f0148494 (new) | Hong Kong, Central and Western, HK | 96.56 TiB | 12.90% | 96.31 TiB | 0.26% |
| f07919 | Hong Kong, Central and Western, HK | 94.78 TiB | 12.66% | 94.78 TiB | 0.00% |
| f01878201 | Hangzhou, Zhejiang, CN | 29.19 TiB | 3.90% | 29.00 TiB | 0.64% |
| f01282328 | Wuhan, Hubei, CN | 9.38 TiB | 1.25% | 9.38 TiB | 0.00% |
| f01966377 (new) | Yinchuan, Ningxia Hui Autonomous Region, CN | 192.00 GiB | 0.03% | 192.00 GiB | 0.00% |
| f0870354 (new) | Beijing, Beijing, CN | 32.00 GiB | 0.00% | 32.00 GiB | 0.00% |

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

✔️ Data replication looks healthy.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 76.09 TiB | 76.09 TiB | 1 | 10.17% |
| 48.19 TiB | 96.38 TiB | 2 | 12.88% |
| 32.00 GiB | 96.00 GiB | 3 | 0.01% |
| 6.84 TiB | 27.38 TiB | 4 | 3.66% |
| 42.63 TiB | 213.16 TiB | 5 | 28.48% |
| 24.16 TiB | 144.97 TiB | 6 | 19.37% |
| 19.56 TiB | 137.19 TiB | 7 | 18.33% |
| 6.50 TiB | 52.16 TiB | 8 | 6.97% |
| 96.00 GiB | 1.13 TiB | 9 | 0.15% |
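The replication table can be summarized with a short calculation. This sketch assumes that the "Deal Percentage" column is each row's share of the total deal volume and that 1 TiB = 1024 GiB; both are inferred from the numbers above rather than stated by the checker. The values are copied from the table, so results agree with the reported percentages only to within rounding.

```python
GIB_PER_TIB = 1024

# (unique data in TiB, total deals in TiB, number of providers), copied from the table.
rows = [
    (76.09, 76.09, 1),
    (48.19, 96.38, 2),
    (32 / GIB_PER_TIB, 96 / GIB_PER_TIB, 3),
    (6.84, 27.38, 4),
    (42.63, 213.16, 5),
    (24.16, 144.97, 6),
    (19.56, 137.19, 7),
    (6.50, 52.16, 8),
    (96 / GIB_PER_TIB, 1.13, 9),
]
reported_pct = [10.17, 12.88, 0.01, 3.66, 28.48, 19.37, 18.33, 6.97, 0.15]

total_unique = sum(unique for unique, _, _ in rows)
total_deals = sum(deals for _, deals, _ in rows)

# Share of total deal volume per replication level.
deal_pct = [100 * deals / total_deals for _, deals, _ in rows]

# Average number of stored copies per unit of unique data.
avg_copies = total_deals / total_unique

for (_, _, providers), pct, reported in zip(rows, deal_pct, reported_pct):
    print(f"{providers} provider(s): {pct:.2f}% computed, {reported:.2f}% reported")
print(f"unique: {total_unique:.2f} TiB, deals: {total_deals:.2f} TiB")
print(f"average copies per unique byte: {avg_copies:.2f}")
```

On these figures, roughly 224 TiB of unique data accounts for roughly 749 TiB of deals, an average of about 3.3 copies.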

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data, which should not resolve to the same CID.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Verifier |
| --- | --- | --- | --- | --- |
| f1fkh47gdwclmovclfz2kvorhhdofvgr5y7fjvkmi | BigData Exchange | 448.56 TiB | 3,194 | LDN v3 multisig |
| f1bstbq5bi72kyovhh7zoo2f6l6uivsjz4ey5dnqq | FilSwan | 168.00 TiB | 2,446 | LDN v3 multisig |
| f1o54sve7ede7im4caux3ug7lsyjmbue7ss3zzl6y | FilSwan | 113.47 TiB | 1,246 | LDN v3 multisig |
| f1r3d25hl2y7rqlsu2mgczdethy4qqjmkfdlmibfq | NEXRAD - FilSwan | 21.81 TiB | 698 | LDN v3 multisig |
| f3td7znsz2q6laexewfqogczo74xyiwgbysuf6i75k2wrdqwzyvln7wvzmxaz3jvqzskwxqerdwaetdvuejama | Unknown | 10.25 TiB | 297 | Unknown |
| f3wzua4wihcouehv5datojpgalyapegvdfkhkqamxdtr4tzcsn7kylkoaapmmxisbp3tzekgijb32lrswf5t5q | Ninth Heaven Guild | 1.00 TiB | 32 | Neo Ge |
| f1toz5izxdse43peqyd7zktmyqilvhf6u72z74gfq | Starboard Networks | 288.00 GiB | 9 | Steven Li |

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

steven004 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebgvnzyqzem5e4w27iewyom53zk264xj4gz2co2qed74rizvqilmq

Address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Datacap Allocated

800.00TiB

Signer Address

f1w2vyp4w6df44gbh4vxqle4w65zfrfnwhrl3hojy

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebgvnzyqzem5e4w27iewyom53zk264xj4gz2co2qed74rizvqilmq

large-datacap-requests[bot] commented 1 year ago

We have found some problems in the information provided in the Approved Comment:

We could not find the Filecoin address in the information provided in the comment.
We could not find the Datacap allocated in the information provided in the comment.

Please, take a look at the comment and edit the body of the comment providing all the required information.

steven004 commented 1 year ago

As requested by the applicant, I am supporting this round of the application, after checking the storage distribution and in order to encourage public data storage.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 6

Multisig Notary address

f02049625

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

DataCap allocation requested

800TiB

Id

a29e99f9-3e28-46bd-9085-77f91dfaf8fa

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f3wn64jznjqmr3s3qt4tfwynxo6k73wdxkhd6wo2l4rwpjdyyddirah2ugfaibvlzz2asaj3ehsp7bwc2zvnia

Last two approvers

Fenbushi-Filecoin & psh0691

Rule to calculate the allocation request amount

800% of weekly dc amount requested

DataCap allocation requested

800TiB

Total DataCap granted for client so far

750TiB

Datacap to be granted to reach the total amount requested by the client (5 PiB)

4.26PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 24227 | 10 | 800TiB | 19.09 | 0B |

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The table below shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f0870354 (new) | Beijing, Beijing, CN / China Unicom Beijing Province Network | 32.00 GiB | 0.00% | 32.00 GiB | 0.00% |
| f01282328 | Wuhan, Hubei, CN / CHINA UNICOM China169 Backbone | 9.38 TiB | 1.25% | 9.38 TiB | 0.00% |
| f07919 | Hong Kong, Central and Western, HK / China Unicom Global | 94.78 TiB | 12.66% | 94.78 TiB | 0.00% |
| f01966377 (new) | Yinchuan, Ningxia Hui Autonomous Region, CN / CHINANET-BACKBONE | 192.00 GiB | 0.03% | 192.00 GiB | 0.00% |
| f0114153 (new) | Zhongshan, Guangdong, CN / CT-HuNan-Changsha-IDC | 98.56 TiB | 13.17% | 98.53 TiB | 0.03% |
| f01082888 | Shanghai, Shanghai, CN / Hangzhou Alibaba Advertising Co.,Ltd. | 142.56 TiB | 19.05% | 142.56 TiB | 0.00% |
| f0663311 (new) | Shanghai, Shanghai, CN / Hangzhou Alibaba Advertising Co.,Ltd. | 141.84 TiB | 18.95% | 141.84 TiB | 0.00% |
| f01878201 | Hangzhou, Zhejiang, CN / Hangzhou Alibaba Advertising Co.,Ltd. | 29.19 TiB | 3.90% | 29.00 TiB | 0.64% |
| f01926686 (new) | Hangzhou, Zhejiang, CN / Jiangxi Jiujiang IDC | 135.44 TiB | 18.09% | 135.16 TiB | 0.21% |
| f0148494 (new) | Hong Kong, Central and Western, HK / Tencent Building, Kejizhongyi Avenue | 96.56 TiB | 12.90% | 96.31 TiB | 0.26% |
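One way to read the provider table is to aggregate the percentage column by country code and by network operator. The sketch below does that with the values copied from the table; the grouping itself is an illustration and is not something the checker reports. No threshold from the program rules is assumed here.

```python
from collections import defaultdict

# (provider, country code, network operator, percentage of sealed deals), copied from the table.
providers = [
    ("f0870354", "CN", "China Unicom Beijing Province Network", 0.00),
    ("f01282328", "CN", "CHINA UNICOM China169 Backbone", 1.25),
    ("f07919", "HK", "China Unicom Global", 12.66),
    ("f01966377", "CN", "CHINANET-BACKBONE", 0.03),
    ("f0114153", "CN", "CT-HuNan-Changsha-IDC", 13.17),
    ("f01082888", "CN", "Hangzhou Alibaba Advertising Co.,Ltd.", 19.05),
    ("f0663311", "CN", "Hangzhou Alibaba Advertising Co.,Ltd.", 18.95),
    ("f01878201", "CN", "Hangzhou Alibaba Advertising Co.,Ltd.", 3.90),
    ("f01926686", "CN", "Jiangxi Jiujiang IDC", 18.09),
    ("f0148494", "HK", "Tencent Building, Kejizhongyi Avenue", 12.90),
]

by_country = defaultdict(float)
by_network = defaultdict(float)
for _, country, network, pct in providers:
    by_country[country] += pct
    by_network[network] += pct

for country, pct in sorted(by_country.items(), key=lambda kv: -kv[1]):
    print(f"{country}: {pct:.2f}%")
for network, pct in sorted(by_network.items(), key=lambda kv: -kv[1]):
    print(f"{network}: {pct:.2f}%")
```

Grouping by the operator string is crude, since the three China Unicom entries are spelled differently and are counted separately here.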

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

✔️ Data replication looks healthy.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 76.09 TiB | 76.09 TiB | 1 | 10.17% |
| 48.19 TiB | 96.38 TiB | 2 | 12.88% |
| 32.00 GiB | 96.00 GiB | 3 | 0.01% |
| 6.84 TiB | 27.38 TiB | 4 | 3.66% |
| 42.63 TiB | 213.16 TiB | 5 | 28.48% |
| 24.16 TiB | 144.97 TiB | 6 | 19.37% |
| 19.56 TiB | 137.19 TiB | 7 | 18.33% |
| 6.50 TiB | 52.16 TiB | 8 | 6.97% |
| 96.00 GiB | 1.13 TiB | 9 | 0.15% |

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data, which should not resolve to the same CID.

However, this could be possible if all of the clients below use the same software to prepare the exact same dataset, or if they belong to a series of LDN applications for the same dataset.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Approvers |
| --- | --- | --- | --- | --- |
| f1fkh47gdwclmovclfz2kvorhhdofvgr5y7fjvkmi | BigData Exchange | 448.56 TiB | 3,194 | 11ane-1, 6kernelogic, 1liyunzhi-666, 1MetaWaveInfo, 3newwebgroup, 1stcouldlisa, 1xinaxu |
| f1bstbq5bi72kyovhh7zoo2f6l6uivsjz4ey5dnqq | FilSwan | 168.00 TiB | 2,446 | 3cryptowhizzard, 1IreneYoung, 7kernelogic, 2liyunzhi-666, 1psh0691 |
| f1o54sve7ede7im4caux3ug7lsyjmbue7ss3zzl6y | FilSwan | 113.47 TiB | 1,246 | 3cryptowhizzard, 3IreneYoung, 1jamerduhgamer, 1Joss-Hua, 9kernelogic, 2liyunzhi-666, 1xingjitansuo |
| f1r3d25hl2y7rqlsu2mgczdethy4qqjmkfdlmibfq | NEXRAD - FilSwan | 21.81 TiB | 698 | 1cryptowhizzard, 2IreneYoung, 1jamerduhgamer, 5kernelogic, 1liyunzhi-666, 1Reiers, 1xingjitansuo |
| f3td7znsz2q6laexewfqogczo74xyiwgbysuf6i75k2wrdqwzyvln7wvzmxaz3jvqzskwxqerdwaetdvuejama | Unknown | 10.25 TiB | 297 | Unknown |
| f3wzua4wihcouehv5datojpgalyapegvdfkhkqamxdtr4tzcsn7kylkoaapmmxisbp3tzekgijb32lrswf5t5q | Ninth Heaven Guild | 1.00 TiB | 32 | |
| f1toz5izxdse43peqyd7zktmyqilvhf6u72z74gfq | Starboard Networks | 288.00 GiB | 9 | |
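Why a shared CID points to shared data can be illustrated with a toy example. Everything in this sketch is hypothetical: the client names and payloads are made up, and SHA-256 is used only as a stand-in for content addressing; it is not how Filecoin derives piece CIDs. The point is simply that identical bytes produce identical identifiers, so the same identifier appearing under two clients suggests the same prepared data.

```python
import hashlib
from collections import defaultdict


def toy_cid(data: bytes) -> str:
    """Stand-in content identifier: a shortened SHA-256 digest of the bytes."""
    return hashlib.sha256(data).hexdigest()[:16]


# Hypothetical deals: (client, identifier of the data that was stored).
deals = [
    ("client-A", toy_cid(b"weather radar archive, part 1")),
    ("client-A", toy_cid(b"weather radar archive, part 2")),
    ("client-B", toy_cid(b"carbon policy documents, part 1")),
    ("client-B", toy_cid(b"weather radar archive, part 1")),  # same bytes as client-A
]

# Collect the set of clients seen for each identifier.
clients_by_cid = defaultdict(set)
for client, cid in deals:
    clients_by_cid[cid].add(client)

# Identifiers that appear under more than one client.
shared = {cid: sorted(clients) for cid, clients in clients_by_cid.items() if len(clients) > 1}

for cid, clients in shared.items():
    print(f"{cid} is shared by {', '.join(clients)}")
```

A real check would run the same grouping over on-chain deal records instead of made-up payloads.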

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

herrehesse commented 1 year ago

@roookie0823 can you explain to me why heavy CID sharing that looks fraudulent has been observed, and why all of the data is stored on the same continent?

Both are against the rules of the program and would require a solution before the next tranche.

herrehesse commented 1 year ago

@roookie0823 can you also explain to me why you would require 5 PiB of datacap for the specified data:

"The platform data mainly includes policy documents in carbon peaking and carbon neutrality field, carbon sink industry report, real-time trading price of carbon asset, market trend, historical trading data, etc."

Looking at the nature of the data, it is impossible that it needs such a large amount. Please provide proof of size requirements.

@raghavrmadya @Sunnyiscoming

cryptowhizzard commented 1 year ago

Hello,

There is clearly something wrong, as the data of this application is also shared with the Smithsonian access data AND NEXRAD data. Those datasets are incompatible and cannot have matches.

This means either that you have sold your datacap or that you have not built any dataset yourself and have been using other parties' CAR files to make deals.

Both are against FIL+ rules. Notaries should hold off on signing until proper clarification has been given and a remedy has been provided for the damage done.

roookie0823 commented 1 year ago

@roookie0823 can you explain to me why heavy CID sharing has been observed that looks fraudulent, and why all of the data is stored on the same continent?

Both are against the rules of the program and would require a solution before the next tranche.

Regarding the issue of identical CIDs, we fixed it as soon as we found it. Our processed datasets are available for download via a public IP, so if someone obtains our links, we cannot control how other people use them. As for the data being stored on the same continent, the rules only prohibit storing with SPs in the same server room; storing on the same continent is allowed, and we also want all of the data to be stored in Asia.

roookie0823 commented 1 year ago

@roookie0823 can you also explain to me why you would require 5 PiB of DataCap for the specified data:

"The platform data mainly includes policy documents in carbon peaking and carbon neutrality field, carbon sink industry report, real-time trading price of carbon asset, market trend, historical trading data, etc."

Looking at the nature of the data, it is impossible that you need such a large amount. Please provide proof of the size requirements.

@raghavrmadya @Sunnyiscoming

We have replied with the composition of our data in the previous comment, please refer to the following description:

"The platform data mainly includes policy documents in the field of carbon peaking and carbon neutrality, carbon sink industry research reports, real-time carbon trading prices, market trends, historical trading data, etc.

The data sources consist of the following three parts:

1. As a research institute, we need to use crawler software to collect national and local carbon-related data, including policy documents, news events, videos, etc. (about 150 TB of data), to support our research in the field of carbon neutrality and to produce feasibility reports for local governments.
2. As a carbon trading broker, we will build the "Alibaba" platform of the carbon trading field, during which we need to review the corporate information, qualification documents, corporate videos and trading logs of both sides (about 100 TB of data), to help companies complete trades in a safe and credible manner.
3. We will build an analysis and judgment platform for carbon trading, during which we need to download open source data from research institutions that we cooperate with, such as the State Forestry Administration, Tsinghua University, Central University of Finance and Economics, QuantData and others. This includes information on 1.4 million enterprises (an average of about 100 MB per enterprise, plus 20 TB of open source data from institutions, a total of about 150 TB), as well as 100 urban forestry greening distribution maps and other information (about 100 TB of data in total), to build a carbon trading model for analyzing carbon emissions and carbon trading trends.

In total we have about 500 TB of data and will send 10 copies to be stored on Filecoin."
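The size estimate above can be checked with a few lines of arithmetic. The following is only a rough sketch using the figures quoted in this comment; it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), which the comment does not specify, and all variable names are illustrative.

```python
# Figures as stated in the comment above.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes).
crawled_tb = 150                 # part 1: crawled policy documents, news, videos
broker_tb = 100                  # part 2: corporate info, documents, trading logs
enterprises = 1_400_000          # part 3: number of enterprises
mb_per_enterprise = 100          # part 3: average size per enterprise
institution_open_data_tb = 20    # part 3: open source data from institutions
stated_enterprise_tb = 150       # part 3: total as stated ("about 150 TB")
maps_tb = 100                    # part 3: greening distribution maps and other info
replicas = 10                    # copies to be stored on Filecoin

# Recompute the enterprise figure from its components.
computed_enterprise_tb = (
    enterprises * mb_per_enterprise / 1_000_000 + institution_open_data_tb
)

# Sum the parts using the stated (rounded) enterprise figure.
total_stated_tb = crawled_tb + broker_tb + stated_enterprise_tb + maps_tb
total_with_replicas_tb = total_stated_tb * replicas

# Convert the replicated total to PiB (1 PiB = 2**50 bytes).
total_pib = total_with_replicas_tb * 10**12 / 2**50

print(f"enterprise data, recomputed: {computed_enterprise_tb:.0f} TB")
print(f"dataset total, as stated:    {total_stated_tb} TB")
print(f"with {replicas} replicas:           {total_with_replicas_tb} TB")
print(f"in binary units:             {total_pib:.2f} PiB")
```

Under these assumptions the enterprise component recomputes to 160 TB rather than the rounded 150 TB, and the replicated total of 5,000 TB corresponds to roughly 4.44 PiB.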

roookie0823 commented 1 year ago

Hello,

There is clearly something wrong, as the data of this application is also shared with the Smithsonian access data AND the NEXRAD data. Those datasets are incompatible and cannot have matches.

This either means that you have sold your DataCap, or that you have not built any dataset yourself and have been using other parties' CAR files to make deals with.

Both of the above are against Fil+ rules. Notaries should hold off on signing until a proper clarification has been given, along with a remedy to restore the damage done.

Question 1: Regarding the CIDs being the same as those of other clients, we found this earlier and dealt with it immediately, and it has not occurred since. In addition, our processed datasets are available for download via a public IP, so if someone obtains our download link, we cannot control how others use our datasets.

Question 2: Regarding the issues you suspect, we do not have such problems, and they will not occur in the future.

cryptowhizzard commented 1 year ago

Regarding the issue of identical CIDs, we fixed it as soon as we found it. Our processed datasets are available for download via a public IP, so if someone obtains our links, we cannot control how other people use them. As for the data being stored on the same continent, the rules only prohibit storing with SPs in the same server room; storing on the same continent is allowed, and we also want all of the data to be stored in Asia.

Hold my beer. Are you telling me that FilSwan and BDE are the ones committing the fraud, and not you?

I can ask BDE and FilSwan to join the conversation, if you like?

To confirm, you are telling me that this application:

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/391

Has used the data you built in their Smithsonian dataset without your permission, right???