filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Momentum Lab at University of Alberta - Security and Privacy Research Open Dataset #1150

Closed hitum-dev closed 2 years ago

hitum-dev commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Deep Learning (DL) has achieved tremendous success in many cutting-edge applications. However, state-of-the-art DL systems still suffer from many quality issues. Our project conducts quantitative analysis of security and privacy vulnerabilities in DL systems. This project comes from the Momentum Lab, University of Alberta. The mission of the Momentum Lab is to propose novel quality assurance and engineering support for building trustworthy AI systems. The Momentum Lab currently has more than 30 members located in Canada, Japan, Singapore and Germany.

What is the primary source of funding for this project?

From ourselves.

What other projects/ecosystem stakeholders is this project associated with?

Only related to ourselves.

Use-case details

Describe the data being stored onto Filecoin

The text, image and audio data investigated by our project.

Where was the data in this dataset sourced from?

Text, image and audio data: part of the data is from our own database; the other part is generated using our algorithms.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

Yes, you can find our sample dataset at the links listed below:

- text data: https://drive.google.com/drive/folders/1PLxDyosue-IYWUwKEcpbe4GO4DsElMdz?usp=sharing
- audio data: https://drive.google.com/drive/folders/1eUFuSUu75B_VSuwjkfg4OG0cNYvkOZC5?usp=sharing
- scenario data: https://drive.google.com/drive/folders/1k8WxZ3CKZfdBW74MlJ_1K3TtsdKdO_hz?usp=sharing
- video data: https://drive.google.com/drive/folders/1d1vfeYfESNG49QZucmNmV-d2p29sxcpC?usp=sharing

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, there are no restrictions. Anyone who wants to learn accounting knowledge can search for it.

What is the expected retrieval frequency for this data?

Almost every day.

For how long do you plan to keep this dataset stored on Filecoin?

We hope it will last forever.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

We want to store the data in Canada, Japan, Singapore, Germany and China.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

These data are currently stored in our enterprise private cloud. Because the volume of data is very large, we hope to transfer it offline.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We are researchers and would like to share our data with other researchers. We hope that our dataset can benefit researchers who lack valuable data to conduct impactful research. First, we guarantee that the data will be distributed across as many locations as possible. It will be stored in at least 10 storage serve

How will you be distributing deals across storage providers?

We will choose offline transfer as far as possible.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have enough money. However, we do not yet know how to split the files into a unified format according to the regulations, and we will continue to learn.

large-datacap-requests[bot] commented 2 years ago

Thanks for your request! :exclamation: We have found some problems in the information provided.