filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application]Beijing Wanjie Data Technology Co., LTD #404

Closed WanjieData closed 1 year ago

WanjieData commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Beijing Wanjie Data Technology Co., LTD is a big data asset operator focusing on big data and artificial intelligence services. By integrating core data resources from aviation, railways, telecommunications operators, finance, consumer, and other fields, it provides professional data products and services to public security, government, banking, and other industries.

In public security, the company is committed to building cross-industry data integration products that help public security customers improve their business management capabilities and develop new business management directions with big data as a tool. Taking "one standard and three real" data management as the entry point, the company provides the public security department with standardized products such as floating population insight, floating population access, and floating population analysis. In addition, based on external data such as operators, large traffic, and cameras, it builds research and judgment models across multiple security business dimensions to improve the ability of public security organs to identify high-risk events and groups.

At the government level, the company is actively exploring data applications. It now works with governments in Hainan, Yunnan, Hebei, Henan, and many other provinces on big data governance and on data services such as accurate identification of the poor, helping government departments improve their operational management ability and performance.

The company is committed to integrating accurate data of various industries, providing data business application services for government, public security and large state-owned enterprises, driving data application with business, and innovating business with data application.  

The company has an experienced technical team whose members come from first-class Internet big data companies and have years of development experience and data modeling ability. The core research and development team has more than 20 people. Team members have long been engaged in data mining research, data analysis and processing, and data model establishment and verification, with strong data acquisition, tool product, and business insight abilities. Since its establishment, the company has quickly launched a series of products. Its professional service team and efficient product capability have attracted a number of excellent partners such as Xinhuanet, as well as attention and investment from several well-known venture capital funds, and the company has completed several rounds of financing.

Wanjie Data is a national high-tech enterprise and a Zhongguancun high-tech enterprise.

- It has passed ISO9001 and ISO27001 certification  
- It has passed safety and other security certifications
- It owns more than ten software copyrights, including a data asset management platform, a big data visualization platform, and artificial intelligence and machine science technology and applications in government, finance, tourism, e-commerce, transportation, and other fields.
- NVIDIA Jetson Global Eco-recommendation partner
- NVIDIA NPN member unit
- NVIDIA Nano Dev Kit Authorized Distributor
- NVIDIA EGX edge computing platform pioneer

What is the primary source of funding for this project?

Investors and business revenues.

What other projects/ecosystem stakeholders is this project associated with?

None.

Use-case details

Describe the data being stored onto Filecoin

Business data generated during the operation of the company, including but not limited to product introduction, product service, and service plan data.
The data we intend to store will be desensitized.

Where was the data in this dataset sourced from?

The data is obtained from the data provided by users and the analysis data generated during service processes. 

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

https://github.com/WanjieData/wanjiedata/commit/42db0f846c227958dc8c3e9d9d0bf79326007d5b

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

This is a public dataset.

What is the expected retrieval frequency for this data?

Once a year.

For how long do you plan to keep this dataset stored on Filecoin?

At least 2 years.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

Asia.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Both online and offline.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We will contact SPs, and we also welcome SPs contacting us.
Data will be stored with multiple SPs to ensure distributed storage and data security.
We will periodically retrieve the data to ensure that it is safely stored.

How will you be distributing deals across storage providers?

We will choose more stable and secure storage suppliers first, and then allocate according to the allocation rules on this basis.  We hope that more storage providers can contact us and provide us with storage services. 

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes. 
large-datacap-requests[bot] commented 2 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Kakkouii commented 2 years ago

According to Google Translate, you're cooperating with governments and involved in public security. Those data are highly sensitive and belong to personal privacy. Is that fine to store them publicly?

image
WanjieData commented 2 years ago

Hi @EGGRICE02, the data we intend to store on the Filecoin network is not related to government. We are aware that some data cannot be stored on the Filecoin network, because we have to comply with laws and regulations before we store anything. We intend to store other business data on the Filecoin network, including but not limited to product descriptions, product services, and service plans, which are publicly available and do not involve privacy.

Destore2023 commented 2 years ago

Thanks for your application, could you mail to filplus@fil.org and CC partner@bytebase.cn with your official domain mailbox in order to confirm your identity?

WanjieData commented 2 years ago

Thanks for your application, could you mail to filplus@fil.org and CC partner@bytebase.cn with your official domain mailbox in order to confirm your identity?

I've sent the email, please check it.

Destore2023 commented 2 years ago

image

WanjieData commented 2 years ago

image

Yes, sir.

Sunnyiscoming commented 2 years ago

@WanjieData Can you provide more data samples?

WanjieData commented 2 years ago

@WanjieData Can you provide more data samples?

Here are some additional samples, please check it. https://github.com/WanjieData/wanjiedata/blob/main/Supplementary%20document.zip

galen-mcandrew commented 2 years ago

Datacap Request Trigger

Total DataCap requested

2PiB

Expected weekly DataCap usage rate

100TiB

Client address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

DataCap allocation requested

50TiB

MRJAVAZHAO commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecfxlz72xruw5faoeth4cxeeydzrh4pca4ykm4yqagbh47zfrhudu

Address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

Datacap Allocated

50.00TiB

Signer Address

f14gme3f52prtyzk6pblogrdd6b6ivp4swc6qmesi

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecfxlz72xruw5faoeth4cxeeydzrh4pca4ykm4yqagbh47zfrhudu

PluskitOfficial commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecol7kacjc3yrg3u3b4ntskluvz2gnzva32gsh3afilmg5pv2rvma

Address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

Datacap Allocated

50.00TiB

Signer Address

f1tgnlhtcmhwipfm7thsftxhn5k52velyjlazpvka

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecol7kacjc3yrg3u3b4ntskluvz2gnzva32gsh3afilmg5pv2rvma

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

DataCap allocation requested

100TiB

large-datacap-requests[bot] commented 2 years ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

Last two approvers

PluskitOfficial & MRJAVAZHAO

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

100TiB

Total DataCap granted for client so far

50TiB

Datacap to be granted to reach the total amount requested by the client (2 PiB)

1.95PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 1114 | 2 | 50TiB | 53.59 | 2.21TiB |

MRJAVAZHAO commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceabbxacqj6b3ttdnrnmfwnjpueadc6hsaibjqasdnwf3rzwq23g24

Address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

Datacap Allocated

100.00TiB

Signer Address

f14gme3f52prtyzk6pblogrdd6b6ivp4swc6qmesi

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceabbxacqj6b3ttdnrnmfwnjpueadc6hsaibjqasdnwf3rzwq23g24

MatrixStorage commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceadj7gkqlvqbnvzgdelz5k6cq5oacz2bj5zbp2dwtbxznpntrlmie

Address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

Datacap Allocated

100.00TiB

Signer Address

f1tbxqwjxfyv7swsdin4einirlsfquv3vnmlapley

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceadj7gkqlvqbnvzgdelz5k6cq5oacz2bj5zbp2dwtbxznpntrlmie

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Request number 3

Multisig Notary address

f02049625

Client address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

DataCap allocation requested

200TiB

large-datacap-requests[bot] commented 2 years ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

Last two approvers

MatrixStorage & MRJAVAZHAO

Rule to calculate the allocation request amount

200% of weekly dc amount requested

DataCap allocation requested

200TiB

Total DataCap granted for client so far

150TiB

Datacap to be granted to reach the total amount requested by the client (2 PiB)

1.85PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 4683 | 5 | 100TiB | 30.79 | 3.18TiB |

PluskitOfficial commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceb3y4tt25nvlcyxilk2dft6o3pponf3476xnkxuddjo64vahewzc2

Address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

Datacap Allocated

200.00TiB

Signer Address

f1tgnlhtcmhwipfm7thsftxhn5k52velyjlazpvka

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceb3y4tt25nvlcyxilk2dft6o3pponf3476xnkxuddjo64vahewzc2

MRJAVAZHAO commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacectwny2yxbx5cic5pxzrqx6os6hgqwevb65vnh557rjfy2lup3tmg

Address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

Datacap Allocated

200.00TiB

Signer Address

f14gme3f52prtyzk6pblogrdd6b6ivp4swc6qmesi

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacectwny2yxbx5cic5pxzrqx6os6hgqwevb65vnh557rjfy2lup3tmg

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Request number 4

Multisig Notary address

f02049625

Client address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

DataCap allocation requested

400TiB

large-datacap-requests[bot] commented 2 years ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

Last two approvers

MRJAVAZHAO & PluskitOfficial

Rule to calculate the allocation request amount

400% of weekly dc amount requested

DataCap allocation requested

400TiB

Total DataCap granted for client so far

350TiB

Datacap to be granted to reach the total amount requested by the client (2 PiB)

1.65PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 11331 | 7 | 200TiB | 20.33 | 416GiB |

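
The allocation figures in the bot comments above follow a simple pattern, which can be sketched as below. This is an illustrative reconstruction, not the bot's actual code. The 1x, 2x, and 4x multipliers come from the "Rule to calculate the allocation request amount" fields. The 0.5x multiplier for the first request is inferred from the 50TiB first tranche against the 100TiB weekly rate. Truncating the remaining amount to two decimals is an assumption, chosen only because it reproduces the 1.95PiB, 1.85PiB, and 1.65PiB values reported in this thread. All function names are hypothetical.

```python
import math

# Values taken from the thread.
TOTAL_REQUESTED_PIB = 2      # "Total DataCap requested"
WEEKLY_RATE_TIB = 100        # "Expected weekly DataCap usage rate"
TIB_PER_PIB = 1024

# Request number -> multiple of the weekly rate.
# 0.5 for request 1 is inferred from the 50TiB first tranche (assumption).
MULTIPLIERS = {1: 0.5, 2: 1.0, 3: 2.0, 4: 4.0}


def tranche_tib(request_number: int) -> float:
    """Tranche size in TiB for the given request number."""
    return MULTIPLIERS[request_number] * WEEKLY_RATE_TIB


def granted_before_tib(request_number: int) -> float:
    """TiB granted before this request, assuming every earlier tranche was approved."""
    return sum(tranche_tib(n) for n in range(1, request_number))


def remaining_pib(granted_tib: float) -> float:
    """PiB still to be granted, truncated to two decimals (assumed rounding)."""
    remaining_tib = TOTAL_REQUESTED_PIB * TIB_PER_PIB - granted_tib
    return math.floor(remaining_tib / TIB_PER_PIB * 100) / 100


if __name__ == "__main__":
    for n in sorted(MULTIPLIERS):
        granted = granted_before_tib(n)
        print(
            f"request {n}: tranche {tranche_tib(n):.0f}TiB, "
            f"granted so far {granted:.0f}TiB, "
            f"remaining {remaining_pib(granted):.2f}PiB"
        )
```

Under these assumptions, requests 2 to 4 reproduce the "Total DataCap granted for client so far" values of 50TiB, 150TiB, and 350TiB shown in the three Stats & Info comments.
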
BlockMakeronline commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebvh3i2utxgfmerpicy6xqm3xj6a6lj6qhgx2wjuma2b6533j6pd4

Address

f1elncewt3sh356aop52uvcappblxmf6asbmhxlya

Datacap Allocated

400.00TiB

Signer Address

f1o3twrcpwjtpcd4q36lpq4qmy2qfbgtyy5h6tsty

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebvh3i2utxgfmerpicy6xqm3xj6a6lj6qhgx2wjuma2b6533j6pd4

filplus-checker commented 1 year ago

DataCap and CID Checker Report[^1]

If this is the first time a provider takes a verified deal, it will be marked as new.

For most DataCap applications, the restrictions below should apply.

⚠️ 34.94% of total deal sealed by f01938721 are duplicate data.

⚠️ 58.49% of total deal sealed by f01919535 are duplicate data.

⚠️ 41.15% of total deal sealed by f01851482 are duplicate data.

⚠️ 39.55% of total deal sealed by f01852023 are duplicate data.

⚠️ 44.40% of total deal sealed by f01852325 are duplicate data.

⚠️ 43.28% of total deal sealed by f01852664 are duplicate data.

⚠️ 41.33% of total deal sealed by f01852677 are duplicate data.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01938721 | Hong Kong, Central and Western, HK | 72.00 TiB | 20.60% | 46.84 TiB | 34.94% |
| f01919535 | Wuhan, Hubei, CN | 65.72 TiB | 18.80% | 27.28 TiB | 58.49% |
| f01851482 | Busan, Busan, KR | 59.84 TiB | 17.12% | 35.22 TiB | 41.15% |
| f01852023 | Busan, Busan, KR | 59.19 TiB | 16.93% | 35.78 TiB | 39.55% |
| f01852325 | Hong Kong, Central and Western, HK | 36.53 TiB | 10.45% | 20.31 TiB | 44.40% |
| f01852664 | Singapore, Singapore, SG | 34.16 TiB | 9.77% | 19.38 TiB | 43.28% |
| f01852677 | Morrisville, North Carolina, US | 22.16 TiB | 6.34% | 13.00 TiB | 41.33% |
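
The per-provider figures above appear to be internally consistent: the duplicate percentage matches one minus unique data over total sealed, and the share percentage matches each provider's total over the sum of all totals. The sketch below recomputes both from the published numbers. It is not the checker's own code. The formulas are an assumption that happens to agree with the report to within rounding, and the function names are hypothetical.

```python
# Figures copied from the checker table in this thread, in TiB.
# provider -> (total deals sealed, unique data, reported duplicate %, reported share %)
PROVIDERS = {
    "f01938721": (72.00, 46.84, 34.94, 20.60),
    "f01919535": (65.72, 27.28, 58.49, 18.80),
    "f01851482": (59.84, 35.22, 41.15, 17.12),
    "f01852023": (59.19, 35.78, 39.55, 16.93),
    "f01852325": (36.53, 20.31, 44.40, 10.45),
    "f01852664": (34.16, 19.38, 43.28, 9.77),
    "f01852677": (22.16, 13.00, 41.33, 6.34),
}


def duplicate_pct(total_tib: float, unique_tib: float) -> float:
    """Share of sealed data that is not unique, in percent (assumed formula)."""
    return (1 - unique_tib / total_tib) * 100


def share_pct(total_tib: float, grand_total_tib: float) -> float:
    """Provider's share of all sealed deals, in percent (assumed formula)."""
    return total_tib / grand_total_tib * 100


if __name__ == "__main__":
    grand_total = sum(total for total, _, _, _ in PROVIDERS.values())
    for provider, (total, unique, rep_dup, rep_share) in PROVIDERS.items():
        print(
            f"{provider}: duplicate {duplicate_pct(total, unique):.2f}% "
            f"(reported {rep_dup:.2f}%), "
            f"share {share_pct(total, grand_total):.2f}% "
            f"(reported {rep_share:.2f}%)"
        )
```

Small differences in the last digit are expected, because the table values are already rounded to two decimals.
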

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

⚠️ 100.00% of deals are for data replicated across fewer than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 132.94 TiB | 243.00 TiB | 1 | 69.51% |
| 31.50 TiB | 103.75 TiB | 2 | 29.68% |
| 640.00 GiB | 2.84 TiB | 3 | 0.81% |
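
The replication warning can be checked against the table with a short calculation. The sketch below assumes that "Deal Percentage" is each row's deal volume divided by the total deal volume, and that the warning sums the rows whose provider count is under the threshold of 4. Both are assumptions that agree with the reported numbers, not a description of the checker's implementation, and the names are hypothetical.

```python
# Rows copied from the replication table in this thread.
# (total deals made in TiB, number of providers, reported deal %)
ROWS = [
    (243.00, 1, 69.51),
    (103.75, 2, 29.68),
    (2.84, 3, 0.81),
]

MIN_PROVIDERS = 4  # threshold named in the warning line


def deal_pct(total_tib: float, grand_total_tib: float) -> float:
    """Row's share of all deal volume, in percent (assumed formula)."""
    return total_tib / grand_total_tib * 100


def under_replicated_pct(rows, min_providers: int = MIN_PROVIDERS) -> float:
    """Percent of deal volume stored with fewer than min_providers providers."""
    grand_total = sum(total for total, _, _ in rows)
    low = sum(total for total, providers, _ in rows if providers < min_providers)
    return low / grand_total * 100


if __name__ == "__main__":
    grand_total = sum(total for total, _, _ in ROWS)
    for total, providers, reported in ROWS:
        print(
            f"{providers} provider(s): {deal_pct(total, grand_total):.2f}% "
            f"(reported {reported:.2f}%)"
        )
    print(f"under-replicated: {under_replicated_pct(ROWS):.2f}%")
```

Because every row has a provider count of 1, 2, or 3, all deal volume falls under the threshold, which matches the 100.00% warning.
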

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Usually, different applications own different data, which should not resolve to the same CID.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Verifier |
| --- | --- | --- | --- | --- |
| f17tvb3ejs3ev6owqmkpzomtewgrd6v2sofv7upma | Beijing Lexun Technology Co., LTD | 128.66 TiB | 1,446 | LDN v3 multisig |
| f3w4wlayytfmsay6gu5phhij5r4yyx7t4xxrlosgotlmqg5eih3co5atsra4h2pe4qd2d6c76bvhj6nwim7lgq | GMWBR INC | 81.22 TiB | 1,422 | LDN v3 multisig |
| f3woqxpu6ekmj43nmpcv7j2pgu6lejxtzgxpzl6f2vrueoqlzjntakyhdkghymyffbzfbsio6dvfmy643x4y7q | RICH ST PETE LLC | 48.03 TiB | 736 | LDN v3 multisig |
| f1cuboogcwais57dljrpeltoy6ja2itb7wvwmrl3q | Penglaiju | 38.94 TiB | 599 | LDN v3 multisig |
| f1piaz4nodpwdemrfqak5jlfg5ois2onzmjv6fkki | Chengdu Yundianshang Technology Co., LTD | 34.00 TiB | 610 | LDN v3 multisig |
| f14uxcyaoab3qhn42kaquqysga6f6zfry3x4nk3ca | China Tianying INC. | 32.47 TiB | 641 | LDN v3 multisig |
| f3uils5cdx3ezyzszjjfnulugknbdsanmqtisd7x7xkfcljdnshp4jspnrgxpldt5b4aafuz4q4rkebpjykeha | Qingyun Education Fund | 19.34 TiB | 281 | LDN v3 multisig |
| f1pediuk4kncwp4qxawlope7hzfmd2ran35w54o7y | Qistone Information technology | 5.88 TiB | 21 | LDN v3 multisig |
| f14nyld75bnvr2y3ca4ew7vxmwp4tuwytqwggthcy | Kimoc | 4.41 TiB | 104 | LDN v3 multisig |
| f3qvn7f5u4z5w5pqx3htckp4jcn5dvgmebq6qkqczxdxkicwokf75tt5hbwtrjwldz2sjiyq752ajcn3nd5tgq | Beijing Haishi Hengtong Technology Co., LTD | 4.13 TiB | 86 | LDN v3 multisig |
| f3q7ablez3jqkcjukwbzaql7lmbx4ldouu66nexpdcfvu6kgho3v6gricckt77cgr46tdre2l4zmvha7bsu7qq | MatrixStorage | 416.00 GiB | 9 | LDN # 72 |
| f3xffjctbyy7zigopfa3za5ha3pvv4z3xfghlw7kwvyeuabkg4lzsgfwhnghwkvmmvi6yso6k52hq3ca6ckveq | Chengdu Digital Media Industry Base Co., Ltd. | 64.00 GiB | 2 | LDN v3 multisig |

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

pwrepo commented 1 year ago

Is this complete?

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!