filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] <Zetacube Inc> - <Common Crawl February 2020> #1991

Closed hooonc closed 12 months ago

hooonc commented 1 year ago

Data Owner Name

Common Crawl

What is your role related to the dataset

Data Preparer

Data Owner Country/Region

United States

Data Owner Industry

Other

Website

https://commoncrawl.org/2020/03/february-2020-crawl-archive-now-available/

Social Media

none

Total amount of DataCap being requested

1PiB

Weekly allocation of DataCap requested

100TiB

On-chain address for first allocation

f1pkpa3h4suh77mi3guur25r5u4twkere5r35fjii

Data Type of Application

Slingshot

Custom multisig

Share a brief history of your project and organization

Zetacube Inc is an 'ultra-high-capacity decentralized storage hyper server solution company' for the Web 3.0 era, established to research and commercialize the IPFS storage system, an innovative technology for safely storing huge volumes of data in the fourth industrial age.
Zetacube Inc is also a participant in Slingshot V3, as both a data preparer (DP) and a storage provider (SP).

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

The crawl archive for February 2020 is now available! It contains 2.6 billion web pages or 240 TiB of uncompressed content, crawled between February 16th and 29th. It includes page captures of 1 billion URLs unknown in any of our prior crawl archives.

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

singularity

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://commoncrawl.org/2020/03/february-2020-crawl-archive-now-available/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Monthly

For how long do you plan to keep this dataset stored on Filecoin

2 to 3 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, Africa, North America, South America, Europe, Australia (continent)

How will you be distributing your data to storage providers

HTTP or FTP server, IPFS, Shipping hard drives

How do you plan to choose storage providers

Slack, Big Data Exchange, Partners

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

f0838467 - IPFSKDC-Incheon
f01227505 - IPFSKDC-Incheon
f01942239 - IPFSKDC-Incheon
f01695888 - IPFSKDC-Gwangju
f01823110 - IPFSKDC-Jeonju

How do you plan to make deals to your storage providers

Boost client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Sunnyiscoming commented 1 year ago

Do you have a website or social media? Which nodes are yours?

hooonc commented 1 year ago

@Sunnyiscoming Hi, my company website is https://www.zetacube.net/ and our node is f01103160. It is not included in the SP list.

Sunnyiscoming commented 1 year ago

Could you send an email to filplus-app-review@fil.org from your official domain in order to confirm your identity? The email subject should include the issue ID #1991.

hooonc commented 1 year ago

@Sunnyiscoming I just emailed filplus-app-review@fil.org.

Sunnyiscoming commented 1 year ago

[image] There is no more than 100 TB of data.

Sunnyiscoming commented 1 year ago

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/2003

hooonc commented 1 year ago

[image] There is no more than 100 TB of data.

This dataset is the one we claimed on Slingshot. Caro said DPs should file LDN applications for their claimed datasets in order to request DataCap.

On Slingshot, you can see that a large number of data sets do not exceed 100 TiB.

hooonc commented 1 year ago

#2003

[image: Screenshot 2023-05-21 060421]

My team claimed three datasets on Slingshot, and #2003 is one of them.

Sunnyiscoming commented 1 year ago

Total amount of DataCap being requested 3PiB

This is not reasonable. Please check it again.

hooonc commented 1 year ago

Total amount of DataCap being requested 3PiB

This is not reasonable. Please check it again.

I changed the total amount of DataCap requested to 1PiB. Thank you.

Sunnyiscoming commented 1 year ago

Datacap Request Trigger

Total DataCap requested

1PiB

Expected weekly DataCap usage rate

100TiB

Client address

f1pkpa3h4suh77mi3guur25r5u4twkere5r35fjii

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1pkpa3h4suh77mi3guur25r5u4twkere5r35fjii

DataCap allocation requested

50TiB

Id

d0eaed85-f595-4d10-ac20-48b857adb5ca

GaryGJG commented 1 year ago

Support this little beginning.

GaryGJG commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebm3blaiy3e3h6u5eoccmbzwugo7dym4blmo7ko4yrhwob6wpzs4u

Address

f1pkpa3h4suh77mi3guur25r5u4twkere5r35fjii

Datacap Allocated

50.00TiB

Signer Address

f1zffqhxwq2rrg7rtot6lmkl6hb2xyrrseawprzsq

Id

d0eaed85-f595-4d10-ac20-48b857adb5ca

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebm3blaiy3e3h6u5eoccmbzwugo7dym4blmo7ko4yrhwob6wpzs4u

hooonc commented 1 year ago

@GaryGJG I appreciate your assistance.

laurarenpanda commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecbrznmf4kgsxilliqjcze4tjrrcgvxzls3tvriga4e7f3xhfpob4

Address

f1pkpa3h4suh77mi3guur25r5u4twkere5r35fjii

Datacap Allocated

50.00TiB

Signer Address

f1bp3tzp536edm7dodldceekzbsx7zcy7hdfg6uzq

Id

d0eaed85-f595-4d10-ac20-48b857adb5ca

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecbrznmf4kgsxilliqjcze4tjrrcgvxzls3tvriga4e7f3xhfpob4

Sunnyiscoming commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

hooonc commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

hooonc commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

psh0691 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01227505: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.
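
The three warnings in the report above each come down to a simple aggregate over the client's deals. The following is a minimal Python sketch of that kind of calculation, not the checker bot's actual implementation: the `Deal` record, the `checker_warnings` function, and the region labels are hypothetical, and the default thresholds (90% share, 2 replicas) are taken only from the wording of this report.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Deal:
    provider: str    # storage provider ID, e.g. "f01227505"
    region: str      # provider location (hypothetical label)
    piece_cid: str   # identifier of the stored piece
    size_tib: float  # DataCap consumed by the deal


def checker_warnings(deals, max_share=0.90, min_replicas=2):
    """Return warning strings similar in spirit to the report above."""
    total = sum(d.size_tib for d in deals)
    if total == 0:
        return ["No active deals found for this client."]
    warnings = []

    # Share of total DataCap sealed by each provider.
    by_provider = defaultdict(float)
    for d in deals:
        by_provider[d.provider] += d.size_tib
    for provider, size in sorted(by_provider.items()):
        share = size / total
        if share > max_share:
            warnings.append(f"{provider} sealed {share:.2%} of total datacap")

    # Geographic concentration.
    if len({d.region for d in deals}) == 1:
        warnings.append("All storage providers are located in the same region.")

    # DataCap in pieces held by fewer than min_replicas distinct providers.
    providers_per_piece = defaultdict(set)
    for d in deals:
        providers_per_piece[d.piece_cid].add(d.provider)
    under = sum(
        d.size_tib
        for d in deals
        if len(providers_per_piece[d.piece_cid]) < min_replicas
    )
    if under > 0:
        warnings.append(
            f"{under / total:.2%} of deals are for data replicated "
            f"across less than {min_replicas} storage providers."
        )
    return warnings
```

With every deal sealed by a single provider, as in this application, all three warnings fire. Spreading each piece across at least two providers in different regions clears them.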

cryptowhizzard commented 1 year ago

@HoonChoi-Zetacube

Why did you allocate 100% of the datacap to your own miners?

What part of the FIL+ rules did you not understand?

psh0691 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01227505: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.

hooonc commented 1 year ago

@cryptowhizzard @psh0691 We were new to onboarding, so there were a lot of unknowns. We thought that we needed to finish uploading to one node before we could upload to another, so we were onboarding only one node. We will start onboarding other regions in Korea early next week by selecting two nodes from the node list. In addition, the Protocol Labs Foundation has introduced us to several SPs in other regions, and we are currently negotiating data storage fees with them. We will update the node list as soon as everything is finalized.

cryptowhizzard commented 1 year ago

Well, how about you fix retrieval for your machine first?

[image: Screenshot 2023-07-06 at 15 16 06]

hooonc commented 1 year ago

@cryptowhizzard A failure occurred while changing the software settings. We will make sure that the correction is completed by the end of the day.

psh0691 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f01227505: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 2 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f1pkpa3h4suh77mi3guur25r5u4twkere5r35fjii

DataCap allocation requested

100TiB

Id

c0ab6e55-abfa-4b84-ada2-3820cd306326

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f1pkpa3h4suh77mi3guur25r5u4twkere5r35fjii

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

100TiB

Total DataCap granted for client so far

50TiB

Datacap to be granted to reach the total amount requested by the client (1PiB)

974TiB
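
The figures above follow from simple unit arithmetic. Here is a minimal Python sketch, assuming binary units (1 PiB = 1024 TiB); the function names are hypothetical, and capping the next tranche at the remaining amount is an assumption for illustration, not a documented rule of the bot.

```python
TIB_PER_PIB = 1024  # binary units: 1 PiB = 1024 TiB


def remaining_to_grant_tib(total_requested_pib, granted_so_far_tib):
    """DataCap still to be granted to reach the total requested, in TiB."""
    return total_requested_pib * TIB_PER_PIB - granted_so_far_tib


def next_allocation_tib(weekly_tib, fraction_of_weekly, remaining_tib):
    """Next tranche: a fraction of the weekly amount, capped at what remains.

    The cap is an assumption made for this sketch.
    """
    return min(weekly_tib * fraction_of_weekly, remaining_tib)


remaining = remaining_to_grant_tib(1, 50)
print(remaining)                                  # 974
print(next_allocation_tib(100, 1.0, remaining))   # 100.0
```

This matches the values in this comment: 1 PiB requested with 50TiB already granted leaves 974TiB, and 100% of the 100TiB weekly amount gives the 100TiB second request.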

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 1151 | 1 | 50TiB | 100 | 12.56TiB |

hooonc commented 1 year ago

We are copying and transferring data to SPs in other regions. We will start storing it as soon as the copy is finished.

cryptowhizzard commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 70% of total datacap - f01227505: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

spaceT9 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 70% of total datacap - f01227505: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

hooonc commented 1 year ago

We have confirmed overseas SPs and will start onboarding their equipment as soon as the necessary preparations are completed.

spaceT9 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 70% of total datacap - f01227505: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!
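
The stale-bot messages in this thread imply a two-step timeline: a Stale label after 10 days without a response, then closure 4 days later, 14 days in total. A minimal Python sketch of that schedule follows; the constant names, the function name, and the example date are hypothetical, and the actual bot configuration is not shown in this thread.

```python
from datetime import date, timedelta

STALE_AFTER_DAYS = 10        # days without a response before the Stale label
CLOSE_AFTER_STALE_DAYS = 4   # further days before the issue is closed


def stale_schedule(last_activity):
    """Return (date the Stale label is applied, date the issue is closed)."""
    stale_on = last_activity + timedelta(days=STALE_AFTER_DAYS)
    close_on = stale_on + timedelta(days=CLOSE_AFTER_STALE_DAYS)
    return stale_on, close_on


# Example with a made-up last-activity date.
stale_on, close_on = stale_schedule(date(2023, 7, 1))
print(stale_on, close_on)  # 2023-07-11 2023-07-15
```

Any comment on the issue resets the last-activity date, which is why commenting keeps the application open.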

clriesco commented 1 year ago

Removed stale label and reopened issue :)

hooonc commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 70% of total datacap - f01227505: 100.00%

⚠️ All storage providers are located in the same region.

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across less than 3 storage providers.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

-- Commented by Stale Bot.