filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Momentum Lab at University of Alberta - Security and Privacy Research Open Dataset #1151

Closed hitum-dev closed 1 year ago

hitum-dev commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Deep Learning (DL) has achieved tremendous success in many cutting-edge applications. However, state-of-the-art DL systems still suffer from many quality issues. Our project conducts quantitative analysis of security and privacy vulnerabilities in DL systems. This project comes from the Momentum Lab, University of Alberta. The mission of the Momentum Lab is to propose novel quality assurance and engineering support for building trustworthy AI systems. The Momentum Lab currently has more than 30 members located in Canada, Japan, Singapore and Germany.

What is the primary source of funding for this project?

From ourselves.

What other projects/ecosystem stakeholders is this project associated with?

Only related to ourselves.

Use-case details

Describe the data being stored onto Filecoin

The text, image and audio data investigated by our project.

Where was the data in this dataset sourced from?

Text, image and audio data: part of the data comes from our own database; the rest is generated using our algorithms.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

Yes, you can find our sample datasets at the links listed below:

- text data: https://drive.google.com/drive/folders/1PLxDyosue-IYWUwKEcpbe4GO4DsElMdz?usp=sharing
- audio data: https://drive.google.com/drive/folders/1eUFuSUu75B_VSuwjkfg4OG0cNYvkOZC5?usp=sharing
- scenario data: https://drive.google.com/drive/folders/1k8WxZ3CKZfdBW74MlJ_1K3TtsdKdO_hz?usp=sharing
- video data: https://drive.google.com/drive/folders/1d1vfeYfESNG49QZucmNmV-d2p29sxcpC?usp=sharing

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, there are no restrictions. Anyone who wants to learn accounting knowledge can search it.

What is the expected retrieval frequency for this data?

Almost every day.

For how long do you plan to keep this dataset stored on Filecoin?

We hope it will last forever.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

We want to store the data in Canada, Japan, Singapore, Germany and China.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

The data is currently stored in our enterprise private cloud. Because the volume of data is very large, we hope to transfer it offline.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We are researchers and would like to share our data with other researchers. We hope that the dataset we provide can benefit researchers who lack valuable data to conduct impactful research. First, we guarantee that the data will be distributed across as many locations as possible. It will be stored in at least 10 storage serve

How will you be distributing deals across storage providers?

We will choose offline transfer as far as possible.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have enough money. However, we still do not know how to split the data into a uniform format as required by the rules, and we will keep learning.

large-datacap-requests[bot] commented 2 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

simonkim0515 commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

100TiB

Client address

f1lp56banlro2wjldagufiawkokdbd4lcxunixoyi

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1lp56banlro2wjldagufiawkokdbd4lcxunixoyi

DataCap allocation requested

50TiB

Id

b2cca758-79c6-4952-a524-722c34464a95
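
The figures in the two comments above can be related with some simple arithmetic. The sketch below is illustrative only: it assumes binary units (1 PiB = 1024 TiB) and assumes the first tranche is the lesser of 5% of the total request and 50% of the weekly usage rate. That rule is not stated anywhere in this thread; it is an assumption that happens to reproduce the 50TiB shown here. The function name `first_allocation_tib` is hypothetical.

```python
# Figures taken from the request trigger in this thread.
TIB_PER_PIB = 1024

total_tib = 5 * TIB_PER_PIB   # 5PiB total DataCap requested
weekly_tib = 100              # 100TiB expected weekly usage


def first_allocation_tib(total_tib: float, weekly_tib: float) -> float:
    """Assumed first-tranche rule (not confirmed by this thread):
    the lesser of 5% of the total and 50% of the weekly rate."""
    return min(0.05 * total_tib, 0.5 * weekly_tib)


# How long the full request would last at the stated weekly rate.
weeks_to_use_all = total_tib / weekly_tib

print(f"Weeks to consume the full request: {weeks_to_use_all:.1f}")
print(f"Assumed first allocation: {first_allocation_tib(total_tib, weekly_tib):.0f}TiB")
```

Under these assumptions, the full 5PiB would take roughly a year to consume at 100TiB per week, and the weekly rate rather than the total is what limits the first tranche.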

cryptowhizzard commented 1 year ago

Dear applicant,

Thank you for applying for DataCap. As a Filecoin Fil+ notary, I am screening your application and conducting due diligence.

Looking at your application, I have some questions. As you are brand new on GitHub and have no history of past applications, applying for 5PB of DataCap seems like a lot to me. One needs comprehensive knowledge of Filecoin, data packing, data distribution, and all the requirements that come with them. Are you brand new to the Filecoin space, or have you applied for DataCap in the past under different GitHub account names?

Can you show us visible proof of the size of your data and the storage systems you have there?

As a last question, I would like you to fill out this form to provide us with the information we need to make an educated decision on whether to support your LDN request.

Thanks!

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!