filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] yesfile #406

Closed paar13kr closed 1 year ago

paar13kr commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Our company is the largest content exchange in Korea, combining network technology with customer service built on trust.
We provide content such as dramas, music, videos, and books to customers, so we hold a large number of videos, images, and files.
We have led the Internet webhard business and have continually developed our technology.
We have grown into a representative company in Korea, starting with a file sharing service.
We will introduce new peer-to-peer technologies and products adapted to the future e-business environment.

What is the primary source of funding for this project?

Company account

What other projects/ecosystem stakeholders is this project associated with?

None

Use-case details

Describe the data being stored onto Filecoin

We will store a large number of pictures, videos, and general files.
We will store in full only materials that are not subject to copyright; for copyrighted materials, only samples will be stored.

Where was the data in this dataset sourced from?

Various broadcasting company data, customer uploaded data, company's own data.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

http://image.yesfile.com/data/2016/06/23/1466613063_jOYb.jpg
https://image.yesfile.com/data/2021/07/29/fddc7be081d1365573d8a1d205ed8819.jpg
https://image.yesfile.com/data/2021/07/29/ws_72200030_300_1_40.jpg

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, we confirm this.

What is the expected retrieval frequency for this data?

We expect to retrieve our data every 5 months.

For how long do you plan to keep this dataset stored on Filecoin?

540 Days

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

Mostly in Asia. Korea will store 90% and other countries 10%.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

We will use offline data transfer services or ship data disks, depending on the requirements of the SP providing the storage.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We will choose stable miners through due diligence, favoring facilities that are large and well equipped.

How will you be distributing deals across storage providers?

We will select 2-3 miners for each region in Korea and store a copy of the data with them.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we can start making deals immediately once we receive the DataCap.

large-datacap-requests[bot] commented 2 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Kakkouii commented 2 years ago

How can you make sure that data uploaded by users is not copyrighted, given that you would like to back up data from your users? Do you have a license from your users that allows you to upload their data to the Filecoin network? ARE YOU A MINER? https://github.com/filecoin-project/lotus/discussions/8489 https://github.com/filecoin-project/lotus/discussions/8507

GaryGJG commented 2 years ago

How many data do you have now? what storage are you using to store the data ? what about your data increase rate ?

paar13kr commented 2 years ago

@EGGRICE02
How can you make sure that data uploaded by users are not copyright-related since you'd like to backup data from your users? -> We filter copyrighted data using copyright-filtering solutions and will upload only non-copyrighted materials that pass the filter.

Do you have a license from user that allow you to upload their data to Filecoin Network? -> We are currently choosing and testing miners for uploading the data; in the future we plan to organize the data and upload it under a license.

ARE YOU A MINER? filecoin-project/lotus#8489 filecoin-project/lotus#8507 -> Yes, I am a miner. I was interested in IPFS and started mining after learning about Filecoin. While running a Filecoin miner, I found that storing our company's data could reduce our company's storage and CDN costs.

paar13kr commented 2 years ago

@GaryGJG How many data do you have now? -> We have 3.9 PiB of data. However, not all of it can be stored; non-copyrighted data will be selected and stored. We expect this to be more than 1 PiB.

what storage are you using to store the data ? -> We commissioned a storage company to store the data. The company uses on-premise storage.

what about your data increase rate ? -> Data increases by about 12 TiB per week.

Sunnyiscoming commented 2 years ago

@paar13kr Can you provide more information about the M&T organization? Such as registered address, time of establishment, proof of existence and so on. I can't find anything very specific on the website.

paar13kr commented 2 years ago

@Sunnyiscoming yesfile.com is a website for services only. A description of the company can be found on the company's website.
Company website: http://mntcompany.co.kr/
Address: 703 HS TOWER 10F, Seolleung-ro, Gangnam-gu, Seoul
Time of establishment: 2010.03

UnionLabs2020 commented 2 years ago

At this time, the large dataset process is not intended for storage providers to self-deal. In addition, could you provide more evidence that you have been authorized by the real owner of the dataset?

Sunnyiscoming commented 2 years ago

@paar13kr Any update here?

paar13kr commented 2 years ago

@UnionLabs2020 What evidence should we provide, for example?

raghavrmadya commented 2 years ago

Hi @paar13kr, can you share more about your SP distribution and how exactly you are filtering copyrighted data? What is the copyright solution?

paar13kr commented 2 years ago

@raghavrmadya We filter through murake, a Korean filtering company (http://www.wisewall.co.kr/business/business_01.html). We send the DNA (file hash, file size, etc.) of our video data to the filtering company to check whether each file is a copyrighted video. To prevent illegal uploads and downloads, we will first share the non-copyrighted portion of the data we have. My SP was set up to understand and test IPFS. I am not sealing at the moment; I am studying Filecoin by following its updates.

Sunnyiscoming commented 2 years ago

Can you talk about your SP distribution?

paar13kr commented 2 years ago

@Sunnyiscoming We will visit each SP to verify the condition and stability of its facilities, and we will select 2 to 3 SPs for each region in Korea. We will distribute our data by region and organize it like a CDN so that users can access it more quickly.

raghavrmadya commented 2 years ago

Datacap Request Trigger

Total DataCap requested

5 PiB

Expected weekly DataCap usage rate

100 TiB

Client address

f16aqr37qeoymlc6m7bc7gfjjgawqorbz7l7avlrq
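
For readers checking the figures in this trigger, here is a minimal arithmetic sketch of how long the requested total would last at the stated weekly rate. It assumes binary units (1 PiB = 1024 TiB) and a constant usage rate; the function name `weeks_to_use` is hypothetical and not part of any Fil+ tooling.

```python
# Assumption: binary units, 1 PiB = 1024 TiB.
TIB_PER_PIB = 1024


def weeks_to_use(total_pib: float, weekly_tib: float) -> float:
    """Weeks needed to consume total_pib at a constant weekly_tib rate."""
    if weekly_tib <= 0:
        raise ValueError("weekly_tib must be positive")
    return total_pib * TIB_PER_PIB / weekly_tib


# Figures from the request above: 5 PiB total, 100 TiB per week.
weeks = weeks_to_use(5, 100)
print(f"{weeks:.1f} weeks")
```

At the requested rate, the total corresponds to roughly one year of onboarding.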

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Multisig Notary address

f01858410

Client address

f16aqr37qeoymlc6m7bc7gfjjgawqorbz7l7avlrq

DataCap allocation requested

50TiB

paar13kr commented 2 years ago

@raghavrmadya Thanks. We are planning to visit SPs in two regions this week and inspect the facilities.

psh0691 commented 2 years ago

@paar13kr I'm Sunghwan Park, a notary in Korea. Let me ask you one more question.

Could you disclose the Filecoin node you have?

paar13kr commented 2 years ago

@psh0691 Yes, our node is f0214334.

psh0691 commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacedwkys2xrdlsyz23agv3j4nw4rxswgeznoej2672j6va7ml5c6msm

Address

f16aqr37qeoymlc6m7bc7gfjjgawqorbz7l7avlrq

Datacap Allocated

50.00TiB

Signer Address

f1qdko4jg25vo35qmyvcrw4ak4fmuu3f5rif2kc7i

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedwkys2xrdlsyz23agv3j4nw4rxswgeznoej2672j6va7ml5c6msm

paar13kr commented 2 years ago

@psh0691 Thanks. I will try to save the data without any problems.

psh0691 commented 2 years ago

@paar13kr To assign a full data limit, one more notary must approve it. Please don't let it be an example of DataCap abuse. Also, please note that self-dealing that occurs when SP and client are the same can be controversial.

paar13kr commented 2 years ago

@psh0691 Yes, I'll keep that in mind. Is it okay to seal on my node with Textile, Estuary, and Evergreen in order to study DataCap in the future? This is to test storage deals and retrieval deals.

psh0691 commented 2 years ago

@paar13kr https://github.com/filecoin-project/filecoin-plus-client-onboarding

According to the instructions in the link above, DataCap should be distributed across at least four SPs, but there is no clear rule on whether a client that is also an SP may store on its own node. According to the Filecoin Plus diagram, the notary relates only to the client and not to the SP. Therefore, you can decide for yourself whether to store data on your own node.

However, I believe that misunderstandings should be minimized in order to be assigned additional DataCap.

MetaWaveInfo commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacealh7235xr3kmdlbbyekb5r3advcwtyhqonuhwui4aqqstlkn7dcw

Address

f16aqr37qeoymlc6m7bc7gfjjgawqorbz7l7avlrq

Datacap Allocated

50.00TiB

Signer Address

f1ktlkcxnmzxcdaoqfsunrg3vocfbmgv4n3mrn74a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacealh7235xr3kmdlbbyekb5r3advcwtyhqonuhwui4aqqstlkn7dcw

large-datacap-requests[bot] commented 2 years ago

DataCap Allocation requested

Request number 2

Multisig Notary address

f01858410

Client address

f16aqr37qeoymlc6m7bc7gfjjgawqorbz7l7avlrq

DataCap allocation requested

100TiB

large-datacap-requests[bot] commented 2 years ago

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f16aqr37qeoymlc6m7bc7gfjjgawqorbz7l7avlrq

Last two approvers

MetaWaveInfo & psh0691

Rule to calculate the allocation request amount

100% of weekly dc amount requested

DataCap allocation requested

100TiB

Total DataCap granted for client so far

32GiB

Datacap to be granted to reach the total amount requested by the client (5Pib)

4.99PiB
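
The tranche sizes reported in this thread can be reproduced with a short sketch. It assumes each tranche is a fraction of the expected weekly usage rate (100 TiB). The 100% figure for request number 2 is stated by the bot above; the 50% figure for the first request is only inferred from the 50TiB allocation and is not stated anywhere in this thread. The helper `tranche_tib` is hypothetical.

```python
def tranche_tib(weekly_tib: float, fraction: float) -> float:
    """Size of one allocation tranche as a fraction of the weekly rate."""
    if fraction <= 0:
        raise ValueError("fraction must be positive")
    return weekly_tib * fraction


WEEKLY_TIB = 100  # expected weekly DataCap usage rate from the request

# Assumption: the first tranche was 50% of the weekly rate (inferred from 50TiB).
first = tranche_tib(WEEKLY_TIB, 0.5)
# Stated by the bot: the second tranche is 100% of the weekly amount requested.
second = tranche_tib(WEEKLY_TIB, 1.0)

print(first, second, first + second)
```

Under these assumptions the two tranches come to 50 TiB and 100 TiB, matching the allocations requested in this issue.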

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 0 | 0 | 50TiB | 0 | 32GiB |

BDE-io commented 2 years ago

@paar13kr Hi! Great to see you have gotten approval for DataCap. If you are looking for more storage providers to store these data or have any questions, please visit #bigdata-exchange on Filecoin Slack or reply here.

We have strong demand from a diverse group of SPs, who are actively looking to onboard more data.

UnionLabs2020 commented 2 years ago

Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceduxzdewdiahwskhrdxb35r2nhvccadlylw7vrjsgf24j574bbf5w

Address

f16aqr37qeoymlc6m7bc7gfjjgawqorbz7l7avlrq

Datacap Allocated

100.00TiB

Signer Address

f17xdri3wunqgld7dm23e4f3eqsntjakwc47xjo6i

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceduxzdewdiahwskhrdxb35r2nhvccadlylw7vrjsgf24j574bbf5w

MetaWaveInfo commented 2 years ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebquxbftkacgb3rnw22izj3qz74p2agobdkwnylnzffuihfiel2ck

Address

f16aqr37qeoymlc6m7bc7gfjjgawqorbz7l7avlrq

Datacap Allocated

100.00TiB

Signer Address

f1ktlkcxnmzxcdaoqfsunrg3vocfbmgv4n3mrn74a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebquxbftkacgb3rnw22izj3qz74p2agobdkwnylnzffuihfiel2ck

Sunnyiscoming commented 1 year ago

Are there any problems with using datacap?

paar13kr commented 1 year ago

@Sunnyiscoming It's taking some time to organize the data. I think it will be available from April.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!