filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Kuaixue Education - Dataset3(3/3) #1363

Closed Acc5com closed 1 year ago

Acc5com commented 1 year ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Accounting School is an online training platform operated by Shenzhen Kuaixue Education Software Technology Co., Ltd. Founded in 2014, it is committed to providing high-quality training in accounting practice, tax practice, CPA, intermediate accounting, primary accounting, and related subjects for accounting practitioners. Kuaixue Education has a team of more than 500 teachers and has developed practical courses covering 150+ industries and 10,000+ class hours. It is a leading online practical accounting training institution in China. Located in Shenzhen Nanshan Science and Technology Park, it is a technology- and product-driven education and training institution with more than 3 million online students.

What is the primary source of funding for this project?

We fund the project ourselves.

What other projects/ecosystem stakeholders is this project associated with?

None; the project is associated only with our own organization.

Use-case details

Describe the data being stored onto Filecoin

A portion of the course videos from our platform.

Where was the data in this dataset sourced from?

Course data:
Real account practice: real account practice in more than 150 industries, such as industry, commerce and trade, hotel and catering, construction and real estate, software, finance, import and export, administrative units, and public institutions;
Zero-foundation introduction: basic accounting, financial regulations, computerization, career planning, introduction to accounting, introduction to tax law, cashier, entry, voucher binding, office software;
I want to learn cashier: introduction to theory and practical operation;
Manual true account: bookkeeping vouchers, journals, sub-ledgers, general ledger, account summary, and financial statements;
Tax practice: introduction to theory, replacing business tax with value-added tax, income tax, national and local tax declaration, tax accounting, tax planning, export tax rebate, Golden Tax Phase III, invoice management, etc.;
Financial management: financial analysis, financial management, bank loans, CFO;
Professional title examination: junior accountant, intermediate accountant, certified public accountant;
Financial software: Kingdee, UFIDA, Suda, Kuai Zhang, general ledger;
Excel School: Excel foundation, function formulas, pivot tables, accounting applications, financial modeling, practical operation explanation, VBA, charts, accounting e-forms, etc.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

Yes, we have a large number of free, openly available training courses.
In the future, once Filecoin Plus opens applications to enterprise data, we will store more of our paid courses on Filecoin.

https://www.acc5.com/course/course_13335/learn/lesson_74929/
https://www.acc5.com/course/course_13545/learn/lesson_77063/
https://www.acc5.com/course/course_12656/learn/lesson_66671/
https://www.acc5.com/course/course_11017/learn/lesson_43899/
https://www.acc5.com/practice/course/cat_0/list-0-3-1000-0/1.html

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, there are no restrictions. Anyone who wants to learn accounting can search for and access it.

What is the expected retrieval frequency for this data?

At present, it serves only as a backup, to be retrieved if our private cloud data is lost.

For how long do you plan to keep this dataset stored on Filecoin?

We hope it will last forever.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

We hope to make deals in China, Singapore, Vietnam, and Australia.

Storage providers (SPs) we have worked with:
f01128206
f01859603
f01901765
f01901351
f01917009
f01915287
f01660795
f01923786
f01923787
f01885088
f01901379
f01870888
f01412203
f01909429
f01907545
f01922865
f01937995
f01945688
f01922893
f01949183
f01929568
f01954294
f01901773
f01940076

How will you be distributing your data to storage providers? Is there an offline data transfer process?

The data is currently stored in our enterprise private cloud. Because the volume of data is very large, we hope to transfer it offline.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We are not miners; we learned about Filecoin only after friends introduced it to us. We were very happy to learn about Fil+, and we have carefully read the Filecoin Plus storage rules.
First, we guarantee that the content will be distributed in as decentralized a way as possible. It will be stored with at least 8 storage providers.
The more decentralized the storage, the more secure our data is.
Second, we will ask the storage providers that hold the data to support fast retrieval, and we will review their work after storage is complete.

How will you be distributing deals across storage providers?

We will choose offline transfer wherever possible.
Our enterprise private cloud must keep serving regular users; online transfer would consume too many of its resources.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have sufficient funds. However, we do not yet know how to split the files into a uniform format as the rules require, and we will continue to learn.

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Acc5com commented 1 year ago

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/512 https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1117

simonkim0515 commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

500TiB

Client address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

DataCap allocation requested

250TiB

Id

090f7122-532a-441b-8270-a52b79e14b28

xiaoyuaiheshui commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceccd65vrbptijn5zlfwzzaz53ijuudn7t7673zypvrwgesrkjqiaw

Address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

Datacap Allocated

250.00TiB

Signer Address

f122qmy25wdtt5mxd77kndiq7z5x2n3iwiuz2wdsa

Id

090f7122-532a-441b-8270-a52b79e14b28

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceccd65vrbptijn5zlfwzzaz53ijuudn7t7673zypvrwgesrkjqiaw

1ane-1 commented 1 year ago

I checked the historical records, and they all comply with the rules.

1ane-1 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceb6eyvr7p5vrwrjbm4v26bkx3kgusunfqkxbskkvbhkoyip3rulvo

Address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

Datacap Allocated

250.00TiB

Signer Address

f1mdk7s2vntzm6hu35yuo6vjubtrpfnb2awhgvrri

Id

090f7122-532a-441b-8270-a52b79e14b28

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceb6eyvr7p5vrwrjbm4v26bkx3kgusunfqkxbskkvbhkoyip3rulvo

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

DataCap allocation requested

500TiB

Id

ac960305-4c0c-41ca-b957-f3e9db15e5e8

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

Last two approvers

1ane-1 & jggapp

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

500TiB

Total DataCap granted for client so far

4.68PiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

325.12TiB
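
The rule above is percentage arithmetic on the expected weekly usage from the request trigger (500TiB). A minimal sketch, assuming the bot simply multiplies the weekly rate by the rule's percentage; the function name is hypothetical and this is not the bot's actual code:

```python
def allocation_for_rule(weekly_tib: float, rule_percent: float) -> float:
    """Return the tranche size in TiB for a rule such as '100% of weekly dc amount'."""
    return weekly_tib * rule_percent / 100.0


# Expected weekly DataCap usage stated in the request trigger.
WEEKLY_TIB = 500.0

print(allocation_for_rule(WEEKLY_TIB, 100))  # rule shown for request number 2
print(allocation_for_rule(WEEKLY_TIB, 200))  # rule shown for request number 3
```

Under that assumption the two rules in this thread give 500 TiB and 1000 TiB, which matches the amounts the bot requested.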

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 156157 | 16 | 250TiB | 18.91 | 5.11GiB |

kernelogic commented 1 year ago

Looks pretty good in the sense of distribution. Willing to support.

kernelogic commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebblg2oekti5o7jcutaqtfhbihc4brdvdvpqdb6523kuzvjnafsia

Address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

Datacap Allocated

500.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

ac960305-4c0c-41ca-b957-f3e9db15e5e8

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebblg2oekti5o7jcutaqtfhbihc4brdvdvpqdb6523kuzvjnafsia

stcloudlisa commented 1 year ago

I reviewed the client's request, and the distribution looks good, so I approve it.

stcloudlisa commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecjvjbh5lvqgvizzobhp55tu4po7a66cd3skvi265qo2elsptrpde

Address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

Datacap Allocated

500.00TiB

Signer Address

f1jvvltduw35u6inn5tr4nfualyd42bh3vjtylgci

Id

ac960305-4c0c-41ca-b957-f3e9db15e5e8

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecjvjbh5lvqgvizzobhp55tu4po7a66cd3skvi265qo2elsptrpde

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 3

Multisig Notary address

f02049625

Client address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

DataCap allocation requested

1000.0TiB

Id

e8eadda0-668e-43f4-a3ed-9191417be182

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

Last two approvers

1LISA2 & kernelogic

Rule to calculate the allocation request amount

200% of weekly dc amount requested

DataCap allocation requested

1000.0TiB

Total DataCap granted for client so far

4.92PiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

75.12TiB
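
The remaining figure is consistent with the total requested minus the amount granted so far, once units are converted. A rough sketch, assuming binary units (1 PiB = 1024 TiB) and that the bot shortens the granted total to two decimals; the helper name is illustrative only:

```python
TIB_PER_PIB = 1024  # assumption: binary units, 1 PiB = 1024 TiB


def implied_granted_pib(total_requested_pib: float, remaining_tib: float) -> float:
    """Back out the granted amount (PiB) from the total requested and the remainder."""
    total_tib = total_requested_pib * TIB_PER_PIB
    return (total_tib - remaining_tib) / TIB_PER_PIB


# Remaining amounts reported by the bot for request numbers 2 and 3.
for remaining in (325.12, 75.12):
    granted = implied_granted_pib(5, remaining)
    print(f"{remaining} TiB remaining -> {granted:.4f} PiB granted")
```

Both results agree with the 4.68PiB and 4.92PiB shown by the bot to within 0.01 PiB.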

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 163624 | 16 | 500TiB | 18.05 | 5.11GiB |

psh0691 commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacearledtgd4kcyscbej2jxxqvkr2lgamn2zp4t57giq3ehwexrpyog

Address

f1t3buz7oqz4fktpthqe43vauhzlnuztpgm3iyhbi

Datacap Allocated

1000.00TiB

Signer Address

f1qdko4jg25vo35qmyvcrw4ak4fmuu3f5rif2kc7i

Id

e8eadda0-668e-43f4-a3ed-9191417be182

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacearledtgd4kcyscbej2jxxqvkr2lgamn2zp4t57giq3ehwexrpyog

filplus-checker commented 1 year ago

DataCap and CID Checker Report[^1]

If this is the first time a provider has taken a verified deal, it is marked as new.

For most DataCap applications, the restrictions below should apply.

⚠️ 66.60% of total deal sealed by f01922893 are duplicate data.

⚠️ 73.78% of total deal sealed by f01922865 are duplicate data.

⚠️ 72.99% of total deal sealed by f01915287 are duplicate data.

⚠️ 79.64% of total deal sealed by f01901773 are duplicate data.

⚠️ 68.68% of total deal sealed by f01945688 are duplicate data.

⚠️ 79.47% of total deal sealed by f01970674 are duplicate data.

⚠️ 37.24% of total deal sealed by f01972356 are duplicate data.

⚠️ 32.45% of total deal sealed by f01955186 are duplicate data.

⚠️ 64.16% of total deal sealed by f01949183 are duplicate data.

⚠️ 75.74% of total deal sealed by f01970213 are duplicate data.

⚠️ 49.33% of total deal sealed by f01929568 are duplicate data.

⚠️ 78.23% of total deal sealed by f01967672 are duplicate data.

⚠️ 20.82% of total deal sealed by f01901351 are duplicate data.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f01922893 (new) | Hanoi, Hanoi, VN | 860.88 TiB | 16.64% | 287.53 TiB | 66.60% |
| f01922865 | Ho Chi Minh City, Ho Chi Minh, VN | 806.28 TiB | 15.59% | 211.38 TiB | 73.78% |
| f01915287 | Hanoi, Hanoi, VN | 788.25 TiB | 15.24% | 212.88 TiB | 72.99% |
| f01901773 | Hong Kong, Central and Western, HK | 759.00 TiB | 14.67% | 154.50 TiB | 79.64% |
| f01945688 (new) | Wuxi, Jiangsu, CN | 495.69 TiB | 9.58% | 155.25 TiB | 68.68% |
| f01970674 | Sydney, New South Wales, AU | 260.91 TiB | 5.04% | 53.56 TiB | 79.47% |
| f01972356 | Maywood Park, Oregon, US | 246.69 TiB | 4.77% | 154.81 TiB | 37.24% |
| f01955186 | Hong Kong, Central and Western, HK | 229.09 TiB | 4.43% | 154.75 TiB | 32.45% |
| f01949183 | Maywood Park, Oregon, US | 174.38 TiB | 3.37% | 62.50 TiB | 64.16% |
| f01970213 | Hong Kong, Central and Western, HK | 128.81 TiB | 2.49% | 31.25 TiB | 75.74% |
| f01929568 | Stockholm, Stockholm, SE | 123.16 TiB | 2.38% | 62.41 TiB | 49.33% |
| f01870888 | Chengdu, Sichuan, CN | 94.38 TiB | 1.82% | 81.88 TiB | 13.25% |
| f01954294 (new) | Shenzhen, Guangdong, CN | 79.75 TiB | 1.54% | 79.75 TiB | 0.00% |
| f01967672 | Hong Kong, Central and Western, HK | 71.78 TiB | 1.39% | 15.63 TiB | 78.23% |
| f01940076 | Hong Kong, Central and Western, HK | 34.97 TiB | 0.68% | 34.97 TiB | 0.00% |
| f01901351 | Chengdu, Sichuan, CN | 18.31 TiB | 0.35% | 14.50 TiB | 20.82% |
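
The Duplicate Deals column appears to be derived from the two size columns: it is the share of a provider's sealed volume that is not unique data. This is an inference from the numbers in the table, not a documented formula, and the function name below is hypothetical. A short sketch that recomputes a few rows:

```python
# (provider, total deals sealed in TiB, unique data in TiB, reported duplicate %)
# Values copied from the checker table above.
ROWS = [
    ("f01922893", 860.88, 287.53, 66.60),
    ("f01922865", 806.28, 211.38, 73.78),
    ("f01901773", 759.00, 154.50, 79.64),
    ("f01954294", 79.75, 79.75, 0.00),
    ("f01901351", 18.31, 14.50, 20.82),
]


def duplicate_percent(total_tib: float, unique_tib: float) -> float:
    """Share of sealed deal volume that is not unique data, in percent."""
    return (total_tib - unique_tib) / total_tib * 100.0


for provider, total, unique, reported in ROWS:
    computed = duplicate_percent(total, unique)
    print(f"{provider}: computed {computed:.2f}%, reported {reported:.2f}%")
```

The recomputed values match the reported ones to within a few hundredths of a percent; the small differences are what you would expect from the sizes being rounded to two decimals.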

Provider Distribution

Deal Data Replication

The table below shows how much unique data is replicated across storage providers.

⚠️ 48.45% of deals are for data replicated across fewer than 4 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 733.94 TiB | 1.80 PiB | 1 | 35.60% |
| 133.50 TiB | 659.44 TiB | 2 | 12.75% |
| 448.00 GiB | 5.09 TiB | 3 | 0.10% |
| 35.31 TiB | 569.25 TiB | 4 | 11.01% |
| 93.34 TiB | 1.55 PiB | 5 | 30.72% |
| 26.22 TiB | 508.22 TiB | 6 | 9.83% |
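
The warning figure can be reproduced by summing the Deal Percentage of the rows with one, two, or three providers. A small sketch under that reading of the table; the function name is illustrative:

```python
# (number of providers holding the data, deal percentage) from the table above
REPLICATION = [(1, 35.60), (2, 12.75), (3, 0.10), (4, 11.01), (5, 30.72), (6, 9.83)]


def share_below(min_providers: int, rows=REPLICATION) -> float:
    """Sum the deal percentage for data held by fewer than `min_providers` providers."""
    return round(sum(pct for n, pct in rows if n < min_providers), 2)


print(share_below(4))
```

With a cutoff of 4 providers the sum is 35.60 + 12.75 + 0.10, which is the 48.45% quoted in the warning.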

Replication Distribution

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients. Different applications usually own different data, so their deals should not resolve to the same CID.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Verifier |
| --- | --- | --- | --- | --- |
| f16ioghg3qy36f6572viouwv4dqow5ejpolo4kodi | Shenzhen kuaixue Education Development Co., Ltd | 5.06 PiB | 12,627 | LDN v3 multisig |
| f1tb2hrxk5eaeewcesnid6xmvfkklfdrxsjr5k6iy | Yisainuo | 603.31 TiB | 6,770 | LDN v3 multisig |
| f1unz5yuxgui573lpuh2wyxpsx5ahvw5farqb7hji | AOLIGEI | 300.22 TiB | 6,870 | LDN v3 multisig |
| f1axtj4vjlsf5buf2qcn4kkqrpdcgrqk4exxxsury | CIIC Education Technology (Shenzhen) Co., Ltd. | 148.53 TiB | 470 | LDN v3 multisig |
| f3qebbkqspq4w6deouaubtngt4bmaada76uqs3omy3tki6hoeocpgxyplknev5u3oi5e7xnltobrvgxnpa3qga | codex8080 - Slingshot Restore | 50.47 TiB | 1,615 | LDN v3 multisig |
| f126k3tkdwfaqpflgcclkiwhqxhh73ebqqazwgcoy | New Web Group | 29.53 TiB | 943 | LDN v3 multisig |
| f3v7x4a2aapgx6o2r477tenoin3u5oadaeqyd7kjdsitykvf4ok7vq2utcyh34lmu5u7oybs25ff6s4dbudpma | LeoCheung - Slingshot Restore | 22.97 TiB | 735 | LDN v3 multisig |
| f1m6bvbcawgrxcvyk2tzayt5mplgqs255swwnw6dq | Shanghai Tangzhi Technology Development Co., Ltd. | 14.97 TiB | 468 | LDN v3 multisig |
| f1x7wsqpj6waymzzfqmu4hh32tyc4pbbqnpwy2ucq | Glif auto verified | 32.00 GiB | 1 | Jonathan Schwartz |
| f15m6qoqdsh7fn7l5amegshhvwi4gl5alb2eeuz2y | Worldkan | 32.00 GiB | 1 | LDN v3 multisig |
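
CID sharing of this kind can be detected by grouping deals by CID and looking for CIDs that appear under more than one client. The sketch below illustrates the idea only; it is not the checker's implementation, and the client and CID identifiers are hypothetical placeholders rather than real deal data:

```python
from collections import defaultdict


def shared_cids(deals):
    """Map each CID to its clients, keeping only CIDs used by two or more clients."""
    clients_by_cid = defaultdict(set)
    for client, cid in deals:
        clients_by_cid[cid].add(client)
    return {cid: clients for cid, clients in clients_by_cid.items() if len(clients) > 1}


# Hypothetical (client, CID) pairs, for illustration only.
deals = [
    ("client-a", "cid-1"),
    ("client-a", "cid-2"),
    ("client-b", "cid-2"),
    ("client-c", "cid-3"),
]
print(shared_cids(deals))
```

Here only `cid-2` is reported, because it is the one CID stored on behalf of two different clients.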

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

herrehesse commented 1 year ago

@Acc5com All of your datacap requests are fraudulent with PiB's of duplicates. No notary should sign this.

⚠️ 66.60% of total deal sealed by f01922893 are duplicate data.
⚠️ 73.78% of total deal sealed by f01922865 are duplicate data.
⚠️ 72.99% of total deal sealed by f01915287 are duplicate data.
⚠️ 79.64% of total deal sealed by f01901773 are duplicate data.
⚠️ 68.68% of total deal sealed by f01945688 are duplicate data.
⚠️ 79.47% of total deal sealed by f01970674 are duplicate data.
⚠️ 37.24% of total deal sealed by f01972356 are duplicate data.
⚠️ 32.45% of total deal sealed by f01955186 are duplicate data.
⚠️ 64.16% of total deal sealed by f01949183 are duplicate data.
⚠️ 75.74% of total deal sealed by f01970213 are duplicate data.
⚠️ 49.33% of total deal sealed by f01929568 are duplicate data.
⚠️ 78.23% of total deal sealed by f01967672 are duplicate data.
⚠️ 20.82% of total deal sealed by f01901351 are duplicate data.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

aggregation-and-compliance-bot[bot] commented 11 months ago

Client f01943532 does not follow the datacap usage rules. More info here. This application has been failing the requirements for 7 days. Please take appropriate action to fix the following DataCap usage problems.

| Criteria | Threshold | Reason |
| --- | --- | --- |
| Cid Checker score | > 25% | The client has a CID checker score of 5%. This should be greater than 25%. To find out more about CID checker score please look at this issue: https://github.com/filecoin-project/notary-governance/issues/986 |
| Shared data percent | < 20% | 59.96% of the client's data is shared with other clients. This should be less than 20%. |
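
The two criteria reduce to simple threshold comparisons. A minimal sketch, assuming the thresholds apply exactly as written in the table (score strictly greater than 25%, shared data strictly less than 20%); the function is hypothetical and does not reflect how the bot is actually implemented:

```python
def failed_criteria(cid_checker_score: float, shared_data_percent: float) -> list[str]:
    """Return the names of the criteria from the bot's table that are not met."""
    failures = []
    if not cid_checker_score > 25:  # threshold: CID checker score > 25%
        failures.append("Cid Checker score")
    if not shared_data_percent < 20:  # threshold: shared data < 20%
        failures.append("Shared data percent")
    return failures


# Values reported for this client: score 5%, shared data 59.96%.
print(failed_criteria(cid_checker_score=5, shared_data_percent=59.96))
```

With the values reported for this client, both criteria fail, which is consistent with the two rows the bot lists.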