filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Outercore - Delta Instance Three #1854

Closed corinne-antonia closed 1 year ago

corinne-antonia commented 1 year ago

Data Owner Name

Outercore - Network Growth - Engineering

Data Owner Country/Region

United States

Data Owner Industry

Other

Website

https://fw.services

Social Media

https://twitter.com/OutercoreEng
https://twitter.com/Estuary_Tech

Total amount of DataCap being requested

2PiB

Weekly allocation of DataCap requested

200TiB

On-chain address for first allocation

f1z5ykhx7qi2jeukwlve3mu2zbcuzrtewhc3iajmi

Data Type of Application

None

Custom multisig

Identifier

No response

Share a brief history of your project and organization

Root website: https://fw.services
Project website: https://delta.store

We are the Engineering team at Protocol Labs ➝ Outercore ➝ Engineering/Network Growth. Our goal is to develop tooling that the entire Filecoin ecosystem can use to onboard data to the Filecoin Network.

This is the third application for our tool Delta. The first one is located here: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/160. The second one is here: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1602

#### Our mission

We want an internet where resources are owned and shared by everyone. Everyone shares consensus over a distributed ledger. Data is verifiably stored for the end user and is more fault tolerant. There is no central authority of control and no single point of failure, and the security promise only improves as the network continues to grow. Just imagine all of the certifications that can be automatically generated for end users by a new network like this. This is how you achieve storage as a human right for anyone in the world.

We're going to prove that we can get here by building the tools that can support onboarding 10 PiBs a day of data to the Filecoin Network. But we're also not going to forget about retrievability and helping the Filecoin Network actually become usable.

#### Delta

Our solution to archival and cold storage use cases.

Use ∆ Delta to upload all of your useful public data to Filecoin storage providers. Delta is a straightforward Filecoin storage deal-making tool that manages deals and does not do anything else. Its sole purpose is to help Storage Providers fill capacity through either online or offline methods. It is written in Go and designed to pair well with bare-metal infrastructure.

Is this project associated with other projects/ecosystem stakeholders?

Yes

If answered yes, what are the other projects/ecosystem stakeholders

Protocol Labs -> Outercore -> Network Growth

Describe the data being stored onto Filecoin

Any useful datasets that the Network Growth team has listed in their client tracker.

We also have our own PM going after clients such as the Cancer Imaging Archive, University Data, Climate Data, AI Data, Healthcare data, and so much more. Current companies/organizations we have been in talks with to onboard their data include:
- Neuromorpho.org
- Radiant Earth Foundation
- Cal-Adapt Analytics Engine
- The GDELT Project
- DANDI (Neuroimaging data from MIT)
- Abstrakta.org
- World Ethical Data Forum
- The WCRP
- PUDL
- Artizyou
- Animoca Brands
- CXIDB
- WebRecorder working w/ SUCHO (Ukraine digital preservation)
- Blockdaemon
- STRENDA, Beilstein-Institut
- Digital Corpora
- Archival Community from Starling Lab
- ClubNFT

Note that all of these are open data.

The total amount of data held by the organizations listed is over 350 petabytes.

We have contacted over 300 organizations.

You can track our progress and see a breakdown of each company/organization here: https://docs.google.com/spreadsheets/d/1OnqXfA8FTDSDFbSJ5mBi6rPhmQKFGGd5n9RBTK0KHew/edit?usp=sharing

Where was the data currently stored in this dataset sourced from

Other

If you answered "Other" in the previous question, enter the details here

Differs from client to client

* University hard drives
* Backup services
* Cloud

How do you plan to prepare the dataset

others/custom tool

If you answered "other/custom tool" in the previous question, enter the details here

Our new tool Delta, which we plan to release to the whole ecosystem.

Learn more at https://delta.store

Please share a sample of the data

Some examples below, all open data:

Radiant Earth Data: https://mlhub.earth/
Neuromorpho: https://neuromorpho.org/byspecies.jsp#top
Cal-Adapt Analytics Engine: https://cal-adapt.org/data/download/
WebRecorder: https://webrecorder.net/

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

Delta is not focused on retrieval; it's focused on storage onboarding. We are going to help all SPs onboard all of their data.

Estuary is focused on retrieval.

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

More than 3 years

In which geographies do you plan on making storage deals

North America

How will you be distributing your data to storage providers

Others

How do you plan to choose storage providers

Others

If you answered "Others" in the previous question, what is the tool or platform you plan to use

We will be developing our tool Delta

Learn more at https://delta.store

If you already have a list of storage providers to work with, fill out their names and provider IDs below

We have developed our own list at https://data.storage.market but we will work with everyone/anyone. We actively work with storage providers and we have a storage provider on our team.

We plan to start with the 93 providers listed below, and we will work with more.

➝ https://data.storage.market/api/providers/f0840770
➝ https://data.storage.market/api/providers/f01624021
➝ https://data.storage.market/api/providers/f01806491
➝ https://data.storage.market/api/providers/f08399
➝ https://data.storage.market/api/providers/f01790264
➝ https://data.storage.market/api/providers/f0875769
➝ https://data.storage.market/api/providers/f01035680
➝ https://data.storage.market/api/providers/f033356
➝ https://data.storage.market/api/providers/f01683871
➝ https://data.storage.market/api/providers/f03488
➝ https://data.storage.market/api/providers/f030379
➝ https://data.storage.market/api/providers/f01466075
➝ https://data.storage.market/api/providers/f02301
➝ https://data.storage.market/api/providers/f010479
➝ https://data.storage.market/api/providers/f010088
➝ https://data.storage.market/api/providers/f0773157
➝ https://data.storage.market/api/providers/f0717969
➝ https://data.storage.market/api/providers/f0461791
➝ https://data.storage.market/api/providers/f01746964
➝ https://data.storage.market/api/providers/f01059489
➝ https://data.storage.market/api/providers/f023467
➝ https://data.storage.market/api/providers/f01392893
➝ https://data.storage.market/api/providers/f01736668
➝ https://data.storage.market/api/providers/f022352
➝ https://data.storage.market/api/providers/f01199430
➝ https://data.storage.market/api/providers/f09848
➝ https://data.storage.market/api/providers/f01222595
➝ https://data.storage.market/api/providers/f01794610
➝ https://data.storage.market/api/providers/f01443744
➝ https://data.storage.market/api/providers/f01199442
➝ https://data.storage.market/api/providers/f02401
➝ https://data.storage.market/api/providers/f01402814
➝ https://data.storage.market/api/providers/f0104671
➝ https://data.storage.market/api/providers/f01207045
➝ https://data.storage.market/api/providers/f0406703
➝ https://data.storage.market/api/providers/f01175097
➝ https://data.storage.market/api/providers/f010446
➝ https://data.storage.market/api/providers/f0724219
➝ https://data.storage.market/api/providers/f022142
➝ https://data.storage.market/api/providers/f058369
➝ https://data.storage.market/api/providers/f01201327
➝ https://data.storage.market/api/providers/f0406322
➝ https://data.storage.market/api/providers/f01278
➝ https://data.storage.market/api/providers/f0187709
➝ https://data.storage.market/api/providers/f01045784
➝ https://data.storage.market/api/providers/f0706693
➝ https://data.storage.market/api/providers/f024184
➝ https://data.storage.market/api/providers/f01385207
➝ https://data.storage.market/api/providers/f01652333
➝ https://data.storage.market/api/providers/f01319368
➝ https://data.storage.market/api/providers/f0214334
➝ https://data.storage.market/api/providers/f082635
➝ https://data.storage.market/api/providers/f0836160
➝ https://data.storage.market/api/providers/f0135078
➝ https://data.storage.market/api/providers/f039940
➝ https://data.storage.market/api/providers/f0408717
➝ https://data.storage.market/api/providers/f01345523
➝ https://data.storage.market/api/providers/f01108096
➝ https://data.storage.market/api/providers/f01611097
➝ https://data.storage.market/api/providers/f01208862
➝ https://data.storage.market/api/providers/f01662356
➝ https://data.storage.market/api/providers/f01423116
➝ https://data.storage.market/api/providers/f017665
➝ https://data.storage.market/api/providers/f01666984
➝ https://data.storage.market/api/providers/f0440429
➝ https://data.storage.market/api/providers/f01028552
➝ https://data.storage.market/api/providers/f0707721
➝ https://data.storage.market/api/providers/f099608
➝ https://data.storage.market/api/providers/f0142637
➝ https://data.storage.market/api/providers/f01127678
➝ https://data.storage.market/api/providers/f0501283
➝ https://data.storage.market/api/providers/f01133080
➝ https://data.storage.market/api/providers/f0127896
➝ https://data.storage.market/api/providers/f010617
➝ https://data.storage.market/api/providers/f097777
➝ https://data.storage.market/api/providers/f01367109
➝ https://data.storage.market/api/providers/f01225882
➝ https://data.storage.market/api/providers/f023971
➝ https://data.storage.market/api/providers/f0466405
➝ https://data.storage.market/api/providers/f01310564
➝ https://data.storage.market/api/providers/f02620
➝ https://data.storage.market/api/providers/f02576
➝ https://data.storage.market/api/providers/f01764587
➝ https://data.storage.market/api/providers/f019551
➝ https://data.storage.market/api/providers/f034258
➝ https://data.storage.market/api/providers/f0240185
➝ https://data.storage.market/api/providers/f01619524
➝ https://data.storage.market/api/providers/f01096124
➝ https://data.storage.market/api/providers/f01163272
➝ https://data.storage.market/api/providers/f01337533
➝ https://data.storage.market/api/providers/f01091851
➝ https://data.storage.market/api/providers/f01784458
➝ https://data.storage.market/api/providers/f08403

How do you plan to make deals to your storage providers

Others/custom tool

If you answered "Others/custom tool" in the previous question, enter the details here

Our new tool Delta, which we plan to release to the whole ecosystem.

Learn more at https://delta.store

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!
large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

dkkapur commented 1 year ago

@corinne-antonia can you use a different client address for this application?

cc @simonkim0515 we should backlog putting ^ this in the validation from the bot probs, since we want to make a push towards it

simonkim0515 commented 1 year ago

Hey @corinne-antonia, you can learn more about best practices for the LDN application process for Fil+ in the readme. If you look at the second bullet of the current scope section, it'll provide details about using a different client address for each application you create.

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!
large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

corinne-antonia commented 1 year ago

Ok @simonkim0515 and @dkkapur, just updated the client address!

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!
simonkim0515 commented 1 year ago

Datacap Request Trigger

Total DataCap requested

2PiB

Expected weekly DataCap usage rate

200TiB

Client address

f1z5ykhx7qi2jeukwlve3mu2zbcuzrtewhc3iajmi

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1z5ykhx7qi2jeukwlve3mu2zbcuzrtewhc3iajmi

DataCap allocation requested

100TiB

Id

88fc74f0-4a96-46d6-a11e-9fca69e2f929

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1z5ykhx7qi2jeukwlve3mu2zbcuzrtewhc3iajmi

DataCap allocation requested

100TiB

Id

93a36619-2713-404d-a23f-d7dab5787eb9

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No application info found for this issue on https://filplus.d.interplanetary.one/clients.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

corinne-antonia commented 1 year ago

Hi @simonkim0515 and @dkkapur, what is the expected timeline on this to get approved? Thanks!

TimGuo7 commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacebbdhqpslnmgwqdeshua3ucswjgzb4vquv6owftcxwomlvu2a34du

Address

f1z5ykhx7qi2jeukwlve3mu2zbcuzrtewhc3iajmi

Datacap Allocated

100.00TiB

Signer Address

f1yslbnnqzrjlyuxsmyxfbqcc7xthcavgpripjevi

Id

93a36619-2713-404d-a23f-d7dab5787eb9

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebbdhqpslnmgwqdeshua3ucswjgzb4vquv6owftcxwomlvu2a34du

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

No application info found for this issue on https://filplus.d.interplanetary.one/clients.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

Carohere commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceapmhhew3cl2k33xangryxn4whzokdvpejgfz65xh7skmx72iugqw

Address

f1z5ykhx7qi2jeukwlve3mu2zbcuzrtewhc3iajmi

Datacap Allocated

100.00TiB

Signer Address

f1pynuve3pi2fwvhlkco6dl3ipspkgzmb25syv5ca

Id

93a36619-2713-404d-a23f-d7dab5787eb9

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceapmhhew3cl2k33xangryxn4whzokdvpejgfz65xh7skmx72iugqw

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to re-open if this is relevant, or start a new application for DataCap anytime. Thank you!

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f1z5ykhx7qi2jeukwlve3mu2zbcuzrtewhc3iajmi

DataCap allocation requested

200TiB

Id

7117976d-1634-4c7c-93d8-af39392f0757

large-datacap-requests[bot] commented 1 year ago

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f1z5ykhx7qi2jeukwlve3mu2zbcuzrtewhc3iajmi

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

200TiB

Total DataCap granted for client so far

100TiB

Datacap to be granted to reach the total amount requested by the client (2PiB)

1.90PiB
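
The figures in this comment follow from simple arithmetic. Here is a minimal sketch, assuming binary units (1 PiB = 1024 TiB); the helper names are hypothetical and the bot's actual implementation is not shown in this thread:

```python
TIB_PER_PIB = 1024  # assumption: binary units


def remaining_pib(total_requested_pib: float, granted_tib: float) -> float:
    """DataCap still to be granted, in PiB (hypothetical helper)."""
    remaining_tib = total_requested_pib * TIB_PER_PIB - granted_tib
    return remaining_tib / TIB_PER_PIB


def next_allocation_tib(weekly_tib: float, percent_of_weekly: float) -> float:
    """Allocation size under a 'N% of weekly amount' rule (hypothetical helper)."""
    return weekly_tib * percent_of_weekly / 100


# Values taken from this comment: 2PiB requested, 100TiB granted so far,
# 200TiB weekly rate, rule "100% of weekly dc amount requested".
print(f"{remaining_pib(2, 100):.2f}PiB")   # 1948 / 1024 -> 1.90PiB
print(f"{next_allocation_tib(200, 100):.0f}TiB")  # 200TiB
```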

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 6015 | 21 | 100TiB | 67.46 | 6.95GiB |

kernelogic commented 1 year ago

DD performed in slack DM. It is acceptable for me.

(two images attached)

kernelogic commented 1 year ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceaujhv3vzk3x55wu4bl5wuitlqczljjnx4r46hf4ghgb6222xonb6

Address

f1z5ykhx7qi2jeukwlve3mu2zbcuzrtewhc3iajmi

Datacap Allocated

200.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

7117976d-1634-4c7c-93d8-af39392f0757

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceaujhv3vzk3x55wu4bl5wuitlqczljjnx4r46hf4ghgb6222xonb6

spaceT9 commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 50% of total datacap - f02099116: 67.36%

⚠️ 1 storage providers sealed too much duplicate data - f0717969: 66.67%

Deal Data Replication

⚠️ 99.99% of deals are for data replicated across less than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.
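
The "more than 50% of total datacap" warning above can be illustrated with a per-provider share calculation. This is only a sketch: the function names, provider IDs, and deal sizes are hypothetical, and the checker's real logic is not shown in this thread.

```python
from collections import defaultdict


def provider_shares(deals):
    """Map each provider to its percentage of total sealed bytes.

    deals: iterable of (provider_id, size_bytes) pairs (hypothetical shape).
    """
    totals = defaultdict(int)
    for provider, size in deals:
        totals[provider] += size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {p: 100 * s / grand_total for p, s in totals.items()}


def over_threshold(shares, threshold_percent=50.0):
    """Providers whose share exceeds the threshold, sorted by ID."""
    return sorted(p for p, pct in shares.items() if pct > threshold_percent)


# Hypothetical deals, not data from this application.
sample = [("f0aaa", 70), ("f0bbb", 20), ("f0ccc", 10)]
print(over_threshold(provider_shares(sample)))  # ['f0aaa']
```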

cryptowhizzard commented 1 year ago

Retrieval rate is improving, and this is a known Engineering team at Protocol Labs. I received in-depth client tracking data covering all institutions involved. Willing to support both applications.

cryptowhizzard commented 1 year ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecuftc4ut4cad3tamopz5jl4xewblhvnksvpvrs6frkjgmxq22pq2

Address

f1z5ykhx7qi2jeukwlve3mu2zbcuzrtewhc3iajmi

Datacap Allocated

200.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

7117976d-1634-4c7c-93d8-af39392f0757

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecuftc4ut4cad3tamopz5jl4xewblhvnksvpvrs6frkjgmxq22pq2

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

aggregation-and-compliance-bot[bot] commented 11 months ago

Client f02096010 does not follow the datacap usage rules. More info here. This application has been failing the requirements for 7 days. Please take appropriate action to fix the following DataCap usage problems.

| Criteria | Threshold | Reason |
| --- | --- | --- |
| Cid Checker score | > 25% | The client has a CID checker score of 0%. This should be greater than 25%. To find out more about CID checker score please look at this issue: https://github.com/filecoin-project/notary-governance/issues/986 |
| Shared data percent | < 20% | 32.54% of the client's data is shared with other clients. This should be less than 20%. |
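
The two criteria above amount to simple threshold checks. Here is a minimal sketch, assuming only the thresholds quoted in this comment; the function and parameter names are hypothetical and this is not the compliance bot's actual code:

```python
def failing_criteria(cid_checker_score, shared_data_percent,
                     min_score=25.0, max_shared=20.0):
    """Return the names of the criteria that are not met.

    Thresholds come from the comment above: score must be > 25%,
    shared data must be < 20%.
    """
    failures = []
    if not cid_checker_score > min_score:
        failures.append("Cid Checker score")
    if not shared_data_percent < max_shared:
        failures.append("Shared data percent")
    return failures


# Values reported for this client: score 0%, shared data 32.54%.
print(failing_criteria(0.0, 32.54))
# ['Cid Checker score', 'Shared data percent']
```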