filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] <Ji Xia Technology Hong Kong Limited> - <CoinSummer Video> #1008

Closed maxvint closed 1 year ago

maxvint commented 1 year ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

CoinSummer Video produces videos on the cryptocurrency industry: industry analysis, regulation, trend prediction, trading strategy, fundamental and technical analysis, and real-time tracking of the latest cryptocurrency developments.

What is the primary source of funding for this project?

Own funds and revenue of the company.

What other projects/ecosystem stakeholders is this project associated with?

None

Use-case details

Describe the data being stored onto Filecoin

The data stored on Filecoin consists of videos and pictures from our crypto video library. We have translated over 5PiB of crypto videos since 2019, covering cryptocurrency market insights, crypto market observation, bitcoin analysis, trading strategies, crypto stories, etc.

Where was the data in this dataset sourced from?

Mainly from our strategic partners.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

https://drive.google.com/drive/folders/1AO7tBpfVf8xERDKGSTkGYRf5azWAhlyQ?usp=sharing

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes

What is the expected retrieval frequency for this data?

3 - 5 times every 540 days.

For how long do you plan to keep this dataset stored on Filecoin?

Store for at least 540 days.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

North America and Asia.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

We will send CAR files to storage providers offline for deal making.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We will check their overall capacity and working hours. We need storage providers that we can work with for a long time.

How will you be distributing deals across storage providers?

We will allocate DataCap according to each miner's preference, such as online or offline transfer.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have enough funding and resources to start making deals.

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

raghavrmadya commented 1 year ago

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/650

You already have this open. What is your relationship with each organization?

maxvint commented 1 year ago

I am the tech leader of both Ji Xia Technology and Xuetianxia Technology. I have been the tech leader of Xuetianxia since 2016, responsible for the company's technical architecture and storage service platform development. Since 2019, I have also served as the tech leader of Ji Xia Technology, focusing mainly on Filecoin-related business. As a Filecoin miner, I have participated in Testnet, Space Race, Slingshot, and Mainnet. As a Filecoin builder, I am the founder and core maintainer of Filecoin Community China (https://github.com/filecoin-project/community-china) on behalf of Ji Xia Technology.

raghavrmadya commented 1 year ago

Thanks for clarifying.

raghavrmadya commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

100TiB

Client address

f1rw54uifx5lhmprpu5ovcxilc56j2pqm5w336djy

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1rw54uifx5lhmprpu5ovcxilc56j2pqm5w336djy

DataCap allocation requested

50TiB

Joss-Hua commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceddykc5hm3fivfyvvehvseadspsflzlno7m6vngxqefwx6zpobemq

Address

f1rw54uifx5lhmprpu5ovcxilc56j2pqm5w336djy

Datacap Allocated

50.00TiB

Signer Address

f1tfg54zzscugttejv336vivknmsnzzmyudp3t7wi

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceddykc5hm3fivfyvvehvseadspsflzlno7m6vngxqefwx6zpobemq

psh0691 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecfyoqrls333dwwc4cut2lh2a4dlm5oxya3e5mijof57o56ov472i

Address

f1rw54uifx5lhmprpu5ovcxilc56j2pqm5w336djy

Datacap Allocated

50.00TiB

Signer Address

f1qdko4jg25vo35qmyvcrw4ak4fmuu3f5rif2kc7i

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecfyoqrls333dwwc4cut2lh2a4dlm5oxya3e5mijof57o56ov472i

BDE-io commented 1 year ago

@yuwenhui Hi! Great to see you have gotten approval for DataCap. If you are looking for storage providers to store these data or have any questions, please visit #bigdata-exchange on Filecoin Slack or reply here.

We have strong demand from a diverse group of SPs, who are actively looking to onboard more data.

maxvint commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f02006691 (new) | Hong Kong, Central and Western, HK; China Unicom Global | 11.94 TiB | 43.51% | 11.94 TiB | 0.00% |
| f01991416 (new) | Hong Kong, Central and Western, HK; China Unicom Global | 11.88 TiB | 43.28% | 11.88 TiB | 0.00% |
| f02012032 | Hong Kong, Central and Western, HK; China Unicom Global | 2.63 TiB | 9.57% | 2.63 TiB | 0.00% |
| f01824405 | Hangzhou, Zhejiang, CN; Sichuan Chuanxn IDC | 1.00 TiB | 3.64% | 1.00 TiB | 0.00% |
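Editor's sketch: the "Percentage" column above appears to be each provider's share of the total sealed deal size. The snippet below recomputes those shares from the rounded TiB figures in the table. The function name `provider_shares` is hypothetical, and the formula is inferred from the numbers rather than taken from the checker's source, so small rounding differences against the report are expected.

```python
# Sealed deal sizes in TiB, copied from the report table above.
sealed_tib = {
    "f02006691": 11.94,
    "f01991416": 11.88,
    "f02012032": 2.63,
    "f01824405": 1.00,
}

def provider_shares(sealed):
    """Return each provider's share of the total sealed size, in percent."""
    total = sum(sealed.values())
    return {provider: 100 * size / total for provider, size in sealed.items()}

shares = provider_shares(sealed_tib)
for provider, pct in shares.items():
    print(f"{provider}: {pct:.2f}%")
```

Because the inputs are already rounded to two decimals, the recomputed shares can differ from the reported ones in the last digit.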

Provider Distribution

Deal Data Replication

The below table shows how much unique data is replicated across storage providers.

Since this is the 3rd allocation, the following restrictions have been relaxed:

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 27.31 TiB | 27.31 TiB | 1 | 99.54% |
| 64.00 GiB | 128.00 GiB | 2 | 0.46% |
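Editor's sketch: the warning above follows from the table if "Deal Percentage" is each row's "Total Deals Made" divided by the sum of that column. That reading is an assumption inferred from the figures, and the names `deal_percentages` and `share_below` are hypothetical. Under that assumption, every row here has fewer than 3 providers, so the combined share is 100%.

```python
GIB_PER_TIB = 1024

# Rows copied from the table above: (total deals made in TiB, number of providers).
rows = [
    (27.31, 1),
    (128.00 / GIB_PER_TIB, 2),
]

def deal_percentages(rows):
    """Return (providers, percent of total deal size) for each row."""
    total = sum(size for size, _ in rows)
    return [(providers, 100 * size / total) for size, providers in rows]

def share_below(rows, min_providers=3):
    """Percent of deal size stored with fewer than min_providers providers."""
    return sum(pct for providers, pct in deal_percentages(rows)
               if providers < min_providers)

for providers, pct in deal_percentages(rows):
    print(f"{providers} provider(s): {pct:.2f}%")
print(f"fewer than 3 providers: {share_below(rows):.2f}%")
```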

Replication Distribution

Deal Data Shared with other Clients

The below table shows how much unique data is shared with other clients. Usually different applications own different data and should not resolve to the same CID.

However, this could be possible if all below clients use same software to prepare for the exact same dataset or they belong to a series of LDN applications for the same dataset.

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f1rw54uifx5lhmprpu5ovcxilc56j2pqm5w336djy

DataCap allocation requested

100TiB

Id

0c8843db-26d6-4bfd-b5d6-ef334ca10304

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1rw54uifx5lhmprpu5ovcxilc56j2pqm5w336djy

Last two approvers

psh0691 & Joss-Hua

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

100TiB

Total DataCap granted for client so far

50TiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

4.95PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 1200 | 5 | 50TiB | 32.07 | 12.32TiB |
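Editor's sketch: the figures in this comment can be checked with simple unit arithmetic. The snippet assumes binary units (1 PiB = 1024 TiB) and reads the rule "100% of weekly dc amount requested" as a fraction applied to the 100TiB weekly rate. Both helper names are hypothetical and this is not the bot's actual code.

```python
TIB_PER_PIB = 1024

def remaining_to_grant_pib(total_requested_pib, granted_tib):
    """DataCap still to be granted, in PiB, after what has been granted so far."""
    return (total_requested_pib * TIB_PER_PIB - granted_tib) / TIB_PER_PIB

def next_allocation_tib(weekly_rate_tib, rule_fraction):
    """Allocation size as a fraction of the requested weekly usage rate."""
    return weekly_rate_tib * rule_fraction

# Values from the thread: 5PiB requested in total, 50TiB granted so far,
# 100TiB weekly rate, rule of 100% for this request.
print(f"{remaining_to_grant_pib(5, 50):.2f}PiB")
print(f"{next_allocation_tib(100, 1.0):.0f}TiB")
```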

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

Storage Provider Distribution

The below table shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider takes verified deal, it will be marked as new.

For most of the datacap application, below restrictions should apply.

Since this is the 3rd allocation, the following restrictions have been relaxed:

✔️ Storage provider distribution looks healthy.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f02006691 (new) | Hong Kong, Central and Western, HK; China Unicom Global | 11.94 TiB | 32.51% | 11.94 TiB | 0.00% |
| f02012032 | Hong Kong, Central and Western, HK; China Unicom Global | 11.91 TiB | 32.43% | 11.91 TiB | 0.00% |
| f01991416 (new) | Hong Kong, Central and Western, HK; China Unicom Global | 11.88 TiB | 32.34% | 11.88 TiB | 0.00% |
| f01824405 | Hangzhou, Zhejiang, CN; Sichuan Chuanxn IDC | 1.00 TiB | 2.72% | 1.00 TiB | 0.00% |

Provider Distribution

Deal Data Replication

The below table shows how much unique data is replicated across storage providers.

Since this is the 3rd allocation, the following restrictions have been relaxed:

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 36.47 TiB | 36.47 TiB | 1 | 99.32% |
| 128.00 GiB | 256.00 GiB | 2 | 0.68% |

Replication Distribution

Deal Data Shared with other Clients

The below table shows how much unique data is shared with other clients. Usually different applications own different data and should not resolve to the same CID.

However, this could be possible if all below clients use same software to prepare for the exact same dataset or they belong to a series of LDN applications for the same dataset.[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

maxvint commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

cryptowhizzard commented 1 year ago

Hi,

I would love to move forward with this application but retrieval is not working.

Feb 15 16:40:04 proposals dealscanner-f01925027-f02014107: Error: Failed to retrieve content with candidate miner f02014107: data transfer failed: unconfirmed block transfer

maxvint commented 1 year ago

Hello @cryptowhizzard

Thanks for your check on our LDN application. The retrieval is working very well now. Please take a look.

lotus state get-deal 25177285
lotus client retrieve --provider f02014107 bafybeicdjc5t244dc3lonj6xazc3g3cqijgvculghzu7fsswomhgiqmipe ~/data/bafybeicdjc5t244dc3lonj6xazc3g3cqijgvculghzu7fsswomhgiqmipe.tar
Recv 0 B, Paid 0 FIL, Open (New), 1ms [1673063481417153285|0]
Recv 0 B, Paid 0 FIL, DealProposed (WaitForAcceptance), 22ms [1673063481417153285|0]
Recv 0 B, Paid 0 FIL, DealAccepted (Accepted), 154ms [1673063481417153285|0]
Recv 0 B, Paid 0 FIL, PaymentChannelSkip (Ongoing), 166ms [1673063481417153285|0]
Recv 8.623 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 56.769s [1673063481417153285|8830]
Recv 20.94 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 56.866s [1673063481417153285|21440]
Recv 1.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 59.89s [1673063481417153285|1070016]
Recv 2.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m2.187s [1673063481417153285|2118592]
Recv 3.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m4.533s [1673063481417153285|3167168]
Recv 4.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m6.898s [1673063481417153285|4215744]
Recv 5.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m9.032s [1673063481417153285|5264320]
Recv 6.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m10.953s [1673063481417153285|6312896]
Recv 7.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m13.247s [1673063481417153285|7361472]
Recv 8.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m15.808s [1673063481417153285|8410048]
Recv 9.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m17.931s [1673063481417153285|9458624]
Recv 10.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m20.26s [1673063481417153285|10507200]
Recv 11.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m22.463s [1673063481417153285|11555776]
Recv 12.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m24.403s [1673063481417153285|12604352]
Recv 13.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m27.106s [1673063481417153285|13652928]
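Editor's sketch: a log like the one above can be turned into a rough throughput figure by reading the elapsed time and the byte counter at the end of each line. The line layout is inferred only from the output pasted in this comment, so the regular expressions are assumptions about that format, and the names `parse_duration` and `throughput_bytes_per_s` are hypothetical.

```python
import re

# Matches e.g. "Recv 1.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 59.89s [167...|1070016]"
LINE_RE = re.compile(r"^Recv .*?, (?P<elapsed>[0-9hms.]+) \[\d+\|(?P<bytes>\d+)\]$")
# Duration parts such as "1m", "27.106s", "154ms" ("ms" must be tried before "m").
DUR_RE = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}

def parse_duration(text):
    """Convert a duration such as '1m27.106s' to seconds."""
    return sum(float(value) * UNIT_SECONDS[unit] for value, unit in DUR_RE.findall(text))

def throughput_bytes_per_s(lines):
    """Average rate between the first and last lines that report received bytes."""
    samples = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and int(match["bytes"]) > 0:
            samples.append((parse_duration(match["elapsed"]), int(match["bytes"])))
    if len(samples) < 2 or samples[-1][0] == samples[0][0]:
        return None
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    return (b1 - b0) / (t1 - t0)

log = [
    "Recv 0 B, Paid 0 FIL, DealAccepted (Accepted), 154ms [1673063481417153285|0]",
    "Recv 8.623 KiB, Paid 0 FIL, BlocksReceived (Ongoing), 56.769s [1673063481417153285|8830]",
    "Recv 13.02 MiB, Paid 0 FIL, BlocksReceived (Ongoing), 1m27.106s [1673063481417153285|13652928]",
]
rate = throughput_bytes_per_s(log)
print(f"{rate / 2**20:.2f} MiB/s")
```

The rate is measured only over the span where blocks are arriving, so the roughly 56 seconds before the first block is not counted.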

cryptowhizzard commented 1 year ago

It seems you are storing duplicate files.

jdupes f01925027-f02012032-22225260-baga6ea4seaqlwb5kk36apyu7v4yvaaczazshoobb2nrqmuahhilx7hvqt2cxmoi.car1/
Scanning: 205 files, 1 items (in 1 specified)
f01925027-f02012032-22225260-baga6ea4seaqlwb5kk36apyu7v4yvaaczazshoobb2nrqmuahhilx7hvqt2cxmoi.car1/4bacd1d0b80d46b9a4ebfd52c3acf8f8.mp4
f01925027-f02012032-22225260-baga6ea4seaqlwb5kk36apyu7v4yvaaczazshoobb2nrqmuahhilx7hvqt2cxmoi.car1/8859970218254a6fa04e4817bc3778a8.mp4

f01925027-f02012032-22225260-baga6ea4seaqlwb5kk36apyu7v4yvaaczazshoobb2nrqmuahhilx7hvqt2cxmoi.car1/98311301d8554be89dbc4730adae3152.mp4
f01925027-f02012032-22225260-baga6ea4seaqlwb5kk36apyu7v4yvaaczazshoobb2nrqmuahhilx7hvqt2cxmoi.car1/5785794995e44361bb60ba1bd4f3f5de.mp4

f01925027-f02012032-22225260-baga6ea4seaqlwb5kk36apyu7v4yvaaczazshoobb2nrqmuahhilx7hvqt2cxmoi.car1/7572f49742f04ae9ade4ae8b3394f264.mp4
f01925027-f02012032-22225260-baga6ea4seaqlwb5kk36apyu7v4yvaaczazshoobb2nrqmuahhilx7hvqt2cxmoi.car1/fa24a5e4fa89478097b7ec361588c39f.mp4

Please explain?
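Editor's sketch: for readers without jdupes, a similar check on an unpacked CAR directory can be approximated by grouping files by size and then by SHA-256 digest. This is a rough analogue under that assumption, not a description of how jdupes works internally, and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large videos do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(root):
    """Return groups of paths under root whose contents are identical."""
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_duplicates` on the extracted directory returns one list per set of identical files, comparable to the blank-line-separated sets in the jdupes output above.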

maxvint commented 1 year ago

Hi, thanks for checking our LDN application. After a careful check, we found a very small amount of duplicate data in the original data we provided to our SPs. We have since de-duplicated all the data to ensure that this will not happen again. We also hope the community will continue to provide oversight.

cryptowhizzard commented 1 year ago

Can you elaborate how much the very small amount is? I have tried multiple retrievals and every retrieval has the same duplicate files?

maxvint commented 1 year ago

@cryptowhizzard About 3% of the total data. Because we have already de-duplicated all the data, I am sorry that I cannot give an exact amount.

maxvint commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

maxvint commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

data-programs commented 1 year ago

KYC

This user’s identity has been verified through filplus.storage

large-datacap-requests[bot] commented 9 months ago

Thanks for your request! :exclamation: We have found some problems in the information provided.

We could not find Organization Name field in the information provided

We could not find Website / Social Media field in the information provided

We could not find Total amount of DataCap being requested (between 500 TiB and 5 PiB) field in the information provided

We could not find Weekly allocation of DataCap requested (usually between 1-100TiB) field in the information provided

We could not find On-chain address for first allocation field in the information provided

We could not find Data Type of Application field in the information provided

Please, take a look at the request and edit the body of the issue providing all the required information.