filecoin-project / notary-governance

Proposal: The New Ads - 1 PiB Dataset / 5 PiB Datacap #573

Closed Gonzalo1987 closed 1 year ago

Gonzalo1987 commented 2 years ago

Objective

The objective of this issue is to propose onboarding a total of 5 PiB of data to the Filecoin network. This will represent 5 replicas of a 1 PiB dataset.

Project Description

The 1 PiB dataset is the first portion of our 3+ PiB dataset. We are asking for datacap for the 1 PiB dataset as a proof of concept first. If we find the first tranche of onboarding data to Filecoin to be successful, we will submit another issue to onboard the rest of the dataset.

The data is encrypted because every organization (SPs in this case) that stores data from European institutions or individuals needs to comply with GDPR. Encryption is called out in Article 32 as an “appropriate technical and organizational measure” of security. For this reason, we will keep the keys for the encrypted data. This means the SPs do not need to be too concerned with GDPR compliance, as we will retain control of the data. Here is a link for more info: https://gdpr.eu/faq/.

Data Set

This project is a collection of different forms of data (mainly video) from our client list, which extends to more than 1,400 clients. They range from the Spanish football team Barcelona F.C., Deutsche Bank, Volkswagen, and Opel to the Parliament of “La Rioja”, which is a public institution in Spain.

Here are 5 samples of the data:

http://gofile.me/3XZGR/lZ79X3MRp
http://gofile.me/3XZGR/Qft0AGwzU
http://gofile.me/3XZGR/qLrjekXYk
http://gofile.me/3XZGR/0gbiaYqRW
http://gofile.me/3XZGR/qYXd5MWX5

Transparency in KYC

The New Ads is a dynamic communication agency. Founded in 2013, we are part of a larger group of companies under the Workanda Group. Workanda is made up of companies from La Rioja, Spain, that together offer a comprehensive service spanning IT services and 360º communication: custom programming and software, web development and training, and audiovisual and design services.

The New Ads provides services such as social media management, web design, YouTube ads, logo design, digital signage, advertising in cinemas, and much more.

We are open to any KYC process needed to verify us as Filecoin+ clients.

Data Storage Plan

We are searching for reputable SPs in the Filecoin community and welcome any SPs to take on this project as well. The feature we find most attractive in a storage provider is fast retrieval (in case we need to revise files). We have not decided on the size of the allocation for each SP. That will depend on features such as location, data center quality, and sealing speed.

Notaries that support the project

We are currently contacting notaries in the Filecoin community to gain support for this project. We expect notaries that support this project to express support via remote or comment.

Online presence

Website: https://thenewads.com/proyectos/

The New Ads association with Workanda:

Website: https://workanda.es/the-new-ads/

Original Application:

https://github.com/filecoin-project/notary-governance/issues/489

kernelogic commented 2 years ago

Will you allow supporting notaries to have full access to all data stored under this dataset (with an NDA)?

Gonzalo1987 commented 2 years ago

Hi @kernelogic. We cannot provide access but here is the explanation:

The GDPR (General Data Protection Regulation) is a regulation created to protect the privacy of companies and individuals residing in the EU. The New Ads has more than 1400 clients, so we would have to obtain the approval of each one of them to be able to share the encryption keys. As you can understand, this process is not viable for us.

At The New Ads, we are willing to present our company through a video call or to send the necessary documentation to any notary who can support our application.

Thank you.

Gonzalo1987 commented 2 years ago

Hello @Fenbushi-Filecoin! Since you have supported our application (https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/464#issuecomment-1175382692), can you confirm it in this new issue? We had to open a new one in this repository.

Gonzalo1987 commented 2 years ago

Hello @Kakkouii! What are your thoughts on our application? We answered your question in the last issue (https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/464#issuecomment-1175878064). We had to open a new one in this repository, so maybe we can continue with the process here.

kernelogic commented 2 years ago

I don't know how I can support this without being able to access or verify any of the data stored. Sorry. Maybe someone in Europe can support you.

Gonzalo1987 commented 2 years ago

We could convince a few clients to give notaries access to a portion of the whole dataset. How would this process work?

Also, we are committed to having face-to-face meetings with notaries and giving you technical details of the data transfers.

@kernelogic I have seen a couple of datacap applications for encrypted datasets, such as https://github.com/filecoin-project/notary-governance/issues/564#issue-1283745530. They got your support, so what are they providing that we are not?

xinaxu commented 2 years ago

I have performed KYC with the underlying client (The New Ads) and have received a signed confirmation email from the leadership. Also talked with the applicant about the use case and SP distribution. Everything looks good and I give my full support.

kernelogic commented 2 years ago

@Gonzalo1987 To answer your question: what they (Project Beacon, Victor Chang Cardiac Research) provide and you do not is access to the encrypted sealed data under an NDA, as the supporting notary.

Since you mentioned that it is not possible to get an NDA from your 1400 clients, I want to see what other notaries say about this.

bmcnabb25 commented 2 years ago

@Gonzalo1987 I would be willing to support this app. However, can you elaborate on how you plan to distribute your data (i.e., which service providers from which countries)?

Destore2023 commented 2 years ago

Thanks for your request. Please provide more evidence to show that the real data owners are positive about this proposal.

cryptowhizzard commented 2 years ago

I have done KYC here with the client and underlying client (The New Ads). Everything looks good, you have my support here.

NiwanDao commented 2 years ago

I have done KYC offline with the client. They proved they had shipped the data to the intended storage provider, and they have a distribution plan going forward. I will support it.

Gonzalo1987 commented 2 years ago

Hello everyone! We have found 5 notaries that are supporting our application:

Additionally, we confirm that our main goal is to have the dataset distributed across multiple regions, and accordingly, the SPs selected are the following:

We have already agreed on terms with the SPs, such as sealing speed, time horizon (540 days), regulatory (GDPR) concerns, retrieval speed, and data distribution.

Regarding this last topic, we already shipped the hard drives from Spain to PiKNiK (US), and they are going to distribute the data using a 100 GB connection.

Also, we confirm that each SP is going to store 1 copy and receive the same amount of datacap.

Thanks and have a good day!

Carohere commented 2 years ago

Hi @Gonzalo1987, I just noticed that the original application for this is #489, but the dataset descriptions are very different. Could you explain the relationship between these two apps and the relation between your org and Seal Tec.?

Besides, Project antarctic, which is 5PiB*10 in total, was already approved a long time ago. But somehow most of the ten applications have not been allocated since May. There's plenty of DC left there and I don't get why you are reapplying.

It would be great if you could help me understand these two points. Thanks!

jamerduhgamer commented 2 years ago

Hi @Carohere, this proposal is not for #489; it should be linked to https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/464. They are a separate entity from Seal!

Carohere commented 2 years ago

Hey @jamerduhgamer, I saw that your company is working as an SP in this project, and that's probably why you know more information. Thanks for the active engagement. But the original application was filled out by Gonzalo himself, and I think it would be better to hear from him directly.

Gonzalo1987 commented 2 years ago

Hi @carohere! The message sent by @jamerduhgamer is correct. This is just a typo from copying and pasting Seal’s proposal as a template, and our original application is filecoin-project/filecoin-plus-large-datasets#464. We have no relationship with Seal Storage, and we are separate legal entities: The New Ads is an advertising company located in Spain.