fidlabs / Enterprise-Data-Pathway


[DataCap Application] <Starling Lab> - <USC Shoah Foundation's Dimensions in Testimony> #62

Open anstco opened 3 months ago

anstco commented 3 months ago

Data Owner Name

The Starling Lab

Data Owner Country/Region

United States

Data Owner Industry

Education & Training

Website

https://www.starlinglab.org/

Social Media Handle

https://www.linkedin.com/company/starlinglab/

Social Media Type

Other

What is your role related to the dataset

Dataset Owner

Total amount of DataCap being requested

5.6 PiB

Expected size of single dataset (one copy)

2.8 PiB

Number of replicas to store

2

Weekly allocation of DataCap requested

500 TiB

On-chain address for first allocation

f1p2p3e6gv6vygtouazdcb4757vh5leylcxggzkbq

Data Type of Application

Private Non-Profit / Social impact

Custom multisig

Identifier

No response

Share a brief history of your project and organization

The Stanford / USC Starling Lab for Data Integrity is the first academic lab focused on applied research on web3 and human rights.

Is this project associated with other projects/ecosystem stakeholders?

Yes

If answered yes, what are the other projects/ecosystem stakeholders

DARMA Capital will assist with the Initial Pledge

Describe the data being stored onto Filecoin

Starling Lab will host curated collections in collaboration with USC Libraries. Initial collections include the USC Shoah Foundation's Dimensions in Testimony, which include XR and volumetric video footage.

Where was the data currently stored in this dataset sourced from

My Own Storage Infra

If you answered "Other" in the previous question, enter the details here

No response

If you are a data preparer, what is your location (Country/Region)

None

If you are a data preparer, how will the data be prepared? Please include tooling used and technical details?

No response

If you are not preparing the data, who will prepare the data? (Provide name and business)

We're working with ecosystem partners like PiKNiK to prepare the data into CAR files etc.

Has this dataset been stored on the Filecoin network before? If so, please explain and make the case why you would like to store this dataset again to the network. Provide details on preparation and/or SP distribution.

N/A

Please share a sample of the data

https://sfi.usc.edu/dit

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

This footage is available for researchers who are approved by USC.

What is the expected retrieval frequency for this data

Sporadic

For how long do you plan to keep this dataset stored on Filecoin

More than 3 years

In which geographies do you plan on making storage deals

North America

How will you be distributing your data to storage providers

Others

How did you find your storage providers

Partners

If you answered "Others" in the previous question, what is the tool or platform you used

No response

Please list the provider IDs and location of the storage providers you will be working with.

1. Krates AI - North America

How do you plan to make deals to your storage providers

Boost client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

datacap-bot[bot] commented 3 months ago

Application is waiting for allocator review

kevzak commented 3 months ago

Hi @anstco thank you for applying.

A few questions:

So you have a 2.8 PiB dataset. Are the four copies supposed to total 11.2 PiB? You listed 5.6 PiB, so I just want to confirm the total ask.

Can you confirm if these previous LDN applications are related to this dataset just to understand what has been stored already? https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/53 https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/2085

Can you confirm SPs involved with this project? We ask for minerID, entity name, location. Two copies and two entities are required.

anstco commented 3 months ago

Hi @kevzak great to hear from you again.

  1. 5.6 total. I could not choose a value lower than 4.
  2. Confirmed, these are net-new DataCap applications. This one is specific to the USC Libraries and contains some Shoah Foundation data as well.
  3. f03046733 - Krates AI - North America

kevzak commented 3 months ago

OK, great @anstco. Is there a second entity/SP involved in storing the second copy? We need at least two.

anstco commented 3 months ago

Hi @kevzak

Here's the updated list with two copies:

  1. f03046733 - USC - North America
  2. f03112580 - Krates AI - North America

kevzak commented 3 months ago

Thanks @anstco. As a last step, I ask for a KYB (Know Your Business check) of your client. Can you please complete this form for our records?

After that, as a trusted client, you are eligible for 5% of the total request as a first allocation (300 TiB). As each allocation reaches 75% usage, the deals will be reviewed and the DataCap topped up in larger allocations.
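For readers checking the numbers, here is a minimal Python sketch of the arithmetic in the comment above. The function names are hypothetical and this is not the allocator's actual tooling. It assumes binary units (1 PiB = 1024 TiB), under which 5% of 5.6 PiB is about 286.72 TiB; the 300 TiB quoted in the thread appears to be a rounded figure, and the rounding rule is not stated here.

```python
TIB_PER_PIB = 1024  # assumption: binary units, 1 PiB = 1024 TiB


def first_tranche_tib(total_pib: float, fraction: float = 0.05) -> float:
    """Return the given fraction of the total request, expressed in TiB."""
    return total_pib * TIB_PER_PIB * fraction


def needs_review(used_tib: float, allocated_tib: float, threshold: float = 0.75) -> bool:
    """Return True once usage of the current allocation reaches the threshold."""
    return used_tib >= allocated_tib * threshold


if __name__ == "__main__":
    exact = first_tranche_tib(5.6)
    print(f"5% of 5.6 PiB = {exact:.2f} TiB (thread quotes 300 TiB)")
    # With a 300 TiB allocation, the 75% mark falls at 225 TiB used.
    print("Review due at 225 TiB used:", needs_review(225, 300))
```

Under these assumptions, the next review would be triggered once about 225 TiB of the 300 TiB tranche has been used in deals.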

datacap-bot[bot] commented 3 months ago

Datacap Request Trigger

Total DataCap requested

5.6 PiB

Expected weekly DataCap usage rate

500 TiB

DataCap Amount - First Tranche

300TiB

Client address

f1p2p3e6gv6vygtouazdcb4757vh5leylcxggzkbq

datacap-bot[bot] commented 3 months ago

DataCap Allocation requested

Multisig Notary address

Client address

f1p2p3e6gv6vygtouazdcb4757vh5leylcxggzkbq

DataCap allocation requested

300TiB

Id

62e198f1-ae18-4ee2-b802-cd3b404298e6

datacap-bot[bot] commented 3 months ago

Application is ready to sign

anstco commented 3 months ago

Thanks @kevzak. KYB Application submitted.

kevzak commented 3 months ago

Confirmed

datacap-bot[bot] commented 3 months ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedg27krs46pdahyhg2fbx2ycrldfigc42ryoopzzp6mjkiqcaseks

Address

f1p2p3e6gv6vygtouazdcb4757vh5leylcxggzkbq

Datacap Allocated

300TiB

Signer Address

f1v24knjbqv5p6qrmfjj5xmlaoddzqnon2oxkzkyq

Id

62e198f1-ae18-4ee2-b802-cd3b404298e6

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedg27krs46pdahyhg2fbx2ycrldfigc42ryoopzzp6mjkiqcaseks

datacap-bot[bot] commented 3 months ago

Application is Granted

martplo commented 2 days ago

checker:manualTrigger

datacap-bot[bot] commented 2 days ago

DataCap and CID Checker Report Summary[^1]

Storage Provider Distribution

⚠️ 1 storage providers sealed more than 90% of total datacap - f03046733: 100.00%

⚠️ 1 storage providers have unknown IP location - f03046733

⚠️ All storage providers are located in the same region.

⚠️ 100.00% of Storage Providers have retrieval success rate equal to zero.

⚠️ 100.00% of Storage Providers have retrieval success rate less than 75%.

⚠️ The average retrieval success rate is 0.00%
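The warnings above can be read as simple threshold checks. The following Python sketch illustrates that reading; it is not the checker bot's actual implementation. The function names and input dictionaries are hypothetical, and the sealed size used in the example is made up, since the report only gives percentages. The 90% and 75% thresholds are taken from the warning text.

```python
from typing import Dict, List


def distribution_warnings(sealed_tib: Dict[str, float], max_share: float = 0.90) -> List[str]:
    """Flag any provider whose share of all sealed data exceeds max_share."""
    total = sum(sealed_tib.values())
    warnings: List[str] = []
    if total <= 0:
        return warnings
    for provider, size in sealed_tib.items():
        share = size / total
        if share > max_share:
            warnings.append(f"{provider}: {share:.2%} of total datacap")
    return warnings


def retrieval_warnings(success_rate: Dict[str, float], min_rate: float = 0.75) -> List[str]:
    """Summarise providers with zero or below-threshold retrieval success."""
    if not success_rate:
        return []
    count = len(success_rate)
    zero = sum(1 for rate in success_rate.values() if rate == 0)
    low = sum(1 for rate in success_rate.values() if rate < min_rate)
    warnings: List[str] = []
    if zero:
        warnings.append(f"{zero / count:.2%} of providers have zero retrieval success")
    if low:
        warnings.append(f"{low / count:.2%} of providers are below {min_rate:.0%}")
    return warnings


if __name__ == "__main__":
    # Hypothetical inputs: one provider holding all sealed data, never retrievable.
    print(distribution_warnings({"f03046733": 300.0}))
    print(retrieval_warnings({"f03046733": 0.0}))
```

With a single provider holding everything and a zero retrieval rate, both functions raise warnings, which matches the shape of the report for f03046733.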

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients[^3]

✔️ No CID sharing has been observed.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report.