filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] <FogMeta Lab> - <End of Term Web Archive Datasets> #1600

Open hengdingy opened 1 year ago

hengdingy commented 1 year ago

Data Owner Name

FogMeta Lab

Data Owner Country/Region

China

Data Owner Industry

Web3 / Crypto

Website

https://fogmeta.com

Social Media

Twitter: https://twitter.com/FogMeta
GitHub: https://github.com/FogMeta

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

500TiB

On-chain address for first allocation

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

Custom multisig

Identifier

No response

Share a brief history of your project and organization

FogMeta Lab's research spans multiple levels, from system technology, infrastructure, and middleware to services and solutions. It covers future systems, network technology and business, distributed systems and management, information management, and interactive and innovative services. Drawing on industry perspectives and practice, FogMeta also addresses business complexity through operations optimization and other technologies.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

"The End of Term Web Archive (EOT) captures and saves U.S. Government websites at the end of presidential administrations. The EOT has thus far preserved websites from administration changes in 2008, 2012, 2016, and 2020. Data from these web crawls have been made openly available in several formats in this dataset."

Source: https://registry.opendata.aws/eot-web-archive/

Where was the data currently stored in this dataset sourced from

AWS Cloud

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

IPFS, lotus, graphsplit, others/custom tool

If you answered "other/custom tool" in the previous question, enter the details here

We would also like to use the Swan Client tool (https://github.com/filswan/go-swan-client#Graphsplit) to prepare the dataset.

Please share a sample of the data

https://eotarchive.org/data/data-2008/
https://eotarchive.org/data/data-2012/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Monthly

For how long do you plan to keep this dataset stored on Filecoin

2 to 3 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China, Africa, North America, South America, Europe, Australia (continent), Antarctica

How will you be distributing your data to storage providers

Cloud storage (i.e. S3), HTTP or FTP server, IPFS, Shipping hard drives, Others

How do you plan to choose storage providers

Slack, Partners, Others

If you answered "Others" in the previous question, what is the tool or platform you plan to use

We also plan to use the FilSwan platform (https://filswan.com/) to choose storage providers who meet our requirements.

If you already have a list of storage providers to work with, fill out their names and provider IDs below

The storage providers we'd like to work with are presented below. Some of them are from the FilSwan platform.
f03624
f010088
f02301
f08399
f02401
f0187709
f01163272
f01402814
f01072221
f0240185
f0143858
f01390330
f01225882
f0717969
f03223
f01395673
f01786736
f0836160
f032824
f01443744
f01871352
f01907556
f01946551
f02012951
f01970630

How do you plan to make deals to your storage providers

Boost client, Lotus client, Others/custom tool

If you answered "Others/custom tool" in the previous question, enter the details here

https://github.com/filswan/go-swan-client

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Sunnyiscoming commented 1 year ago

Waiting for process in #1598 #984

cryptowhizzard commented 1 year ago

Dear applicant,

Thank you for applying for DataCap. As a Filecoin Fil+ notary, I am screening your application and conducting due diligence.

I would like you to fill out this form to provide us with the information we need to make an educated decision on whether to support your LDN request.

Thanks!

simonkim0515 commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

500TiB

Client address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f01858410

Client address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

DataCap allocation requested

250TiB

Id

65eef581-9b45-443c-ab24-07b350762e4f

hengdingy commented 1 year ago

> Dear applicant,
>
> Thank you for applying for DataCap. As a Filecoin Fil+ notary, I am screening your application and conducting due diligence.
>
> I would like you to fill out this form to provide us with the information we need to make an educated decision on whether to support your LDN request.
>
> Thanks!

@cryptowhizzard Already filled. Thanks.

kernelogic commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacearlneafhb6pbw2z6gf7c57bhehp4hckkjkkenkcgtsw3smvbqix6

Address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

Datacap Allocated

250.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

65eef581-9b45-443c-ab24-07b350762e4f

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacearlneafhb6pbw2z6gf7c57bhehp4hckkjkkenkcgtsw3smvbqix6

cryptowhizzard commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedins7hkwx6uoctznhyd3tarm3qr7tfowa6kb3od7dwahigo643ju

Address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

Datacap Allocated

250.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

65eef581-9b45-443c-ab24-07b350762e4f

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedins7hkwx6uoctznhyd3tarm3qr7tfowa6kb3od7dwahigo643ju

cryptowhizzard commented 1 year ago

KYC / Data onboarding plan done. Ready to start!

Looking forward to the first milestone!

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

DataCap allocation requested

500TiB

Id

b907eecc-056c-4366-a6be-59ae8b0c962a

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

Last two approvers

cryptowhizzard & kernelogic

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

500TiB

Total DataCap granted for client so far

250TiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

4.75PiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 2519 | 6 | 250TiB | 31.62 | 58.10TiB |
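
The figures in this bot comment can be cross-checked with simple arithmetic. The sketch below is not the bot's code. It assumes binary units (1 PiB = 1024 TiB), uses only numbers quoted in this thread, and the helper names are hypothetical.

```python
TIB_PER_PIB = 1024  # assumption: binary units, 1 PiB = 1024 TiB


def remaining_to_grant_tib(total_pib: float, granted_tib: float) -> float:
    """DataCap still to be granted, in TiB."""
    return total_pib * TIB_PER_PIB - granted_tib


def used_fraction(allocated_tib: float, remaining_tib: float) -> float:
    """Share of the previous allocation already spent on deals."""
    return (allocated_tib - remaining_tib) / allocated_tib


# 5 PiB requested in total, 250 TiB granted so far.
left_tib = remaining_to_grant_tib(total_pib=5, granted_tib=250)
left_pib = left_tib / TIB_PER_PIB

# 250 TiB previously allocated, 58.10 TiB of it still unspent.
used = used_fraction(allocated_tib=250, remaining_tib=58.10)

print(f"{left_tib:.0f} TiB ({left_pib:.3f} PiB) still to grant")
print(f"{used:.1%} of the first allocation used")
```

Under these assumptions about 4870 TiB remain to be granted, which is close to the 4.75PiB reported above (the small gap is probably a rounding or unit-conversion difference), and roughly three quarters of the first allocation had been used when the second request was opened.
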

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

Normalnoise commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

zcfil commented 1 year ago

I randomly checked one SP and found that its data could not be retrieved online. Please check. [image]

hengdingy commented 1 year ago

@zcfil All storage providers are from the FilSwan platform. Under the current design of the FilSwan platform, not every node is guaranteed to support retrieval, so we will send more copies to storage providers and direct more deals to the higher-scoring SPs.

nj-steve commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

nj-steve commented 1 year ago

@hengdingy Looks good! How much data have you prepared?

hengdingy commented 1 year ago

> @hengdingy Looks good! How much data have you prepared?

@nj-steve We have prepared about 200T of data.

nj-steve commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecqat6e4qbcjxng4f2mwj5vx3afnasqfj64agadzn3g3pf433brd4

Address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

Datacap Allocated

500.00TiB

Signer Address

f1xx6555qijma7igpnjspyvdunc4vfxkawnpqy5ii

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecqat6e4qbcjxng4f2mwj5vx3afnasqfj64agadzn3g3pf433brd4

Bitrise0111 commented 1 year ago

The checker report and retrieval are both healthy

Bitrise0111 commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebkipk26nict53x5ovkwgjhqi24fygke5x5rtyr7pn3tfyua5mwos

Address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

Datacap Allocated

500.00TiB

Signer Address

f1nknj7ayq4o43czrtdoauggtwl43fbqatmqis3yy

Id

b907eecc-056c-4366-a6be-59ae8b0c962a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebkipk26nict53x5ovkwgjhqi24fygke5x5rtyr7pn3tfyua5mwos

Normalnoise commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 3

Multisig Notary address

f02049625

Client address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

DataCap allocation requested

1000.0TiB

Id

3c77d92d-f275-4e59-bac2-90ffe210c908

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

Rule to calculate the allocation request amount

200% of weekly dc amount requested

DataCap allocation requested

1000.0TiB

Total DataCap granted for client so far

454747.4YiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

-5.49B

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 12911 | 25 | 500TiB | 27.23 | 125.23TiB |
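
The "Total DataCap granted for client so far" (454747.4YiB) and "to be granted" (-5.49B) values in this comment look like a formatting or unit fault in the bot output, since they conflict with the two approvals recorded earlier in this thread (250 TiB and 500 TiB). The sketch below estimates the expected values. It is not the bot's logic: it assumes binary units and that each tranche equals the weekly rate times the multiplier named in the "Rule" field. The 0.5 multiplier for the first tranche is inferred from the 250 TiB allocation and is not stated in the thread.

```python
TIB_PER_PIB = 1024           # assumption: binary units
WEEKLY_TIB = 500             # "Weekly allocation of DataCap requested"
TOTAL_TIB = 5 * TIB_PER_PIB  # 5 PiB requested in total

# Multiplier per request. 1.0 and 2.0 come from the "Rule" fields in this
# thread; 0.5 is inferred from the 250 TiB first allocation.
MULTIPLIERS = [0.5, 1.0, 2.0]

tranches_tib = [WEEKLY_TIB * m for m in MULTIPLIERS]

# Requests 1 and 2 were approved before this comment was posted.
granted_tib = sum(tranches_tib[:2])
to_grant_tib = TOTAL_TIB - granted_tib

print(tranches_tib)
print(f"granted so far: {granted_tib:.0f} TiB")
print(f"still to grant: {to_grant_tib:.0f} TiB "
      f"({to_grant_tib / TIB_PER_PIB:.2f} PiB)")
```

Under these assumptions roughly 750 TiB had been granted and about 4.27 PiB remained when request 3 was opened, so the two figures above should be read with caution.
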

Normalnoise commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

zcfil commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

herrehesse commented 1 year ago

CID report looks very healthy. Distribution is OK. I will ask @cryptowhizzard to do retrieval testing; if that is in order, I am willing to support the next round.

kernelogic commented 1 year ago

CID report LGTM

kernelogic commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceakkjhqbe2eqyhxano36j6qom5fbknuhdndt6nhhix6xg4kyzrqgm

Address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

Datacap Allocated

1000.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceakkjhqbe2eqyhxano36j6qom5fbknuhdndt6nhhix6xg4kyzrqgm

zcfil commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

zcfil commented 1 year ago

image

zcfil commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebvnuyebz4pq7r5xxrwfsz4t4rsnhgztt5sxwvk5vnowzigqgfsvu

Address

f1o6rcx5wky2qy54kd6l6l5zj36uq7ahhl2dt7xba

Datacap Allocated

1000.00TiB

Signer Address

f1cjzbiy5xd4ehera4wmbz63pd5ku4oo7g52cldga

Id

3c77d92d-f275-4e59-bac2-90ffe210c908

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebvnuyebz4pq7r5xxrwfsz4t4rsnhgztt5sxwvk5vnowzigqgfsvu

zcfil commented 1 year ago

This application passed my review, and I will continue to follow it in future rounds.

hengdingy commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the full report.

Chris00618 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

hengdingy commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval Dashboard. Click here to view the Retrieval report.

Normalnoise commented 1 year ago

checker:manualTrigger