filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Shenzhen Liberty City Culture Media Co., Ltd. #1495

Closed 001can closed 1 year ago

001can commented 1 year ago

Data Owner Name

Shenzhen Liberty City Culture Media Co., Ltd.

Data Owner Country/Region

China

Data Owner Industry

Information, Media & Telecommunications

Website

http://www.whballoon.com/

Social Media

http://www.whballoon.com/

Total amount of DataCap being requested

5PiB

Weekly allocation of DataCap requested

500TiB

On-chain address for first allocation

f1mrqexd4z6oxrvzeuthlf4n2jf6m64abonhhchya

Custom multisig

Identifier

No response

Share a brief history of your project and organization

Liberty City Culture Media Co., Ltd. is a cultural product provider with market insight as the starting point, value communication as the core, and unique creativity as the main competitiveness.
Focusing on video shooting and production, it covers brand commercials, product commercials, corporate promotional videos and other types.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

Various types of original materials such as brand commercials, product commercials, corporate promotional videos, etc.

Where was the data currently stored in this dataset sourced from

My Own Storage Infra

If you answered "Other" in the previous question, enter the details here

No response

How do you plan to prepare the dataset

IPFS, lotus

If you answered "other/custom tool" in the previous question, enter the details here

No response

Please share a sample of the data

https://www.aliyundrive.com/s/SnnuBDbrnYA
https://www.aliyundrive.com/s/9UD9jQN56c9
https://www.aliyundrive.com/s/T2xdLn9NsK5

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Yearly

For how long do you plan to keep this dataset stored on Filecoin

More than 3 years

In which geographies do you plan on making storage deals

Greater China, Asia other than Greater China

How will you be distributing your data to storage providers

HTTP or FTP server, Shipping hard drives

How do you plan to choose storage providers

Slack

If you answered "Others" in the previous question, what is the tool or platform you plan to use

No response

If you already have a list of storage providers to work with, fill out their names and provider IDs below

No response

How do you plan to make deals to your storage providers

Lotus client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

large-datacap-requests[bot] commented 1 year ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

large-datacap-requests[bot] commented 1 year ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Sunnyiscoming commented 1 year ago

How much data do you have? How many copies will you store? What's the relationship between you and the organization?

001can commented 1 year ago

Hi @Sunnyiscoming, Happy New Year! We just got back from the New Year break, and here are our replies.

How much data do you have?

At present, we have about 800T of data, consisting of raw material for corporate promotional videos, product promotional videos, and other videos. As time goes on, we will have more material to store.

How many copies will you store?

We want to store more than 5 copies.

What's the relationship between you and the organization?

I am responsible for the technical architecture of the company.
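As a rough sanity check, the figures in this reply can be set against the amounts requested in the application. The sketch below assumes that "800T" means 800 TiB and that exactly 5 copies are stored (the reply says "more than 5", so this is a lower bound); the function name is hypothetical.

```python
TIB_PER_PIB = 1024


def datacap_estimate(dataset_tib, copies, requested_pib, weekly_tib):
    """Compare dataset size times copies against the requested DataCap."""
    stored_tib = dataset_tib * copies               # DataCap consumed by all copies
    requested_tib = requested_pib * TIB_PER_PIB     # total request converted to TiB
    return {
        "stored_tib": stored_tib,
        "requested_tib": requested_tib,
        "headroom_tib": requested_tib - stored_tib,  # left over for future material
        "weeks_at_weekly_rate": requested_tib / weekly_tib,
    }


# Figures from this application: 800T of data, 5 copies, 5PiB total, 500TiB per week.
estimate = datacap_estimate(dataset_tib=800, copies=5, requested_pib=5, weekly_tib=500)
print(estimate)
```

Under these assumptions, five copies of the current data would use 4000 TiB of the 5120 TiB requested, which leaves room for the additional material mentioned above.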

herrehesse commented 1 year ago

Dear Applicant,

Due to the recent increase in erroneous Filecoin+ data, we feel compelled, on behalf of the entire community, to look deeper into DataCap requests. We do this to ensure that the overall value of the Filecoin network and the Filecoin+ program increases and that the program is not abused.

Please answer the questions below as comprehensively as possible.

Customer data

We expect that onboarding a customer at the scale of an LDN would have been preceded by at least multiple emails and perhaps several chat conversations. A single email with an agreement does not qualify here.

Should this be solely for acquiring DataCap, it is of course out of the question. The customer must have a legitimate reason for wanting to use the Filecoin+ program, which is intended for storing useful and public datasets on the network.

(As an intermediate solution Filecoin offers the FIL-E program or the glif.io website for business datasets that do not meet the requirements for a Filecoin+ dataset)

Files and Processing

Hopefully you understand the caution the overall community has for onboarding the wrong data. We understand the increased need for Filecoin+, however, we must not allow the program to be misused. Everything depends on a valuable and useful network, let's do our best to make this happen. Together.

001can commented 1 year ago

  • Could you demonstrate exactly how and to what extent customer contact occurred?
  • Did the customer specify the amount of data involved in this relevant correspondence?

We are the owners of the data. We shoot corporate videos and product promotional videos. We will generate a large amount of unedited raw data and materials. Currently, about 800T of data is stored on our local storage server.

  • Why does the customer in question want to use the Filecoin+ program?

Filecoin+ is very suitable for storing large datasets. Our video footage needs to be stored for a long time, and we hope to protect our data from loss by storing it on the Filecoin network through Filecoin+.

  • Why is the customer data considered Filecoin+ eligible?

Our data complies with the requirements of Filecoin+: it can be made public and open, and the material, which we shot ourselves, does not currently exist on the Internet.

  • Could you please demonstrate to us how you envision processing and transporting the customer data in question to any location for preparation?

Of course. Our data is stored on our own servers, and we will deliver it to the SPs by shipping hard disks. I will upload photos of our storage facility. [Images: Server 3, Server 2]

  • Would you demonstrate to us that the customer, the preparer and the intended storage providers all have adequate bandwidth to process the set with its corresponding size?

We have not found a suitable SP yet; we are going to look for suitable SPs through Slack.

  • Would you tell us how the data set preparer takes into account the prevention of duplicates in order to prevent data cap abuse?

We will periodically retrieve our data (at least once a month) to check that it is stored correctly, and use the filplus-checker bot to check whether the SPs meet the requirements for storing our data. For example: no storage provider should hold more than 25% of the total DataCap; no storage provider should store more than 20% duplicate data; each storage provider should have published its public IP address; and all storage providers should be located in different regions.
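The four rules listed above can be expressed as a simple check over per-provider totals. This is a minimal sketch, not the filplus-checker bot's actual implementation: the `ProviderStats` fields, the `check_distribution` function, the provider IDs, and the sample numbers are all hypothetical, and the thresholds are taken from the reply above.

```python
from dataclasses import dataclass


@dataclass
class ProviderStats:
    """Hypothetical per-provider summary; not the checker bot's real data model."""
    provider_id: str
    region: str
    total_bytes: int       # bytes of this client's data held by the provider
    duplicate_bytes: int   # bytes repeating data the same provider already holds
    has_public_ip: bool


def check_distribution(providers, max_share=0.25, max_duplicate=0.20):
    """Return (provider_id, reason) pairs for each rule a provider breaks."""
    issues = []
    grand_total = sum(p.total_bytes for p in providers)
    seen_regions = set()
    for p in providers:
        share = p.total_bytes / grand_total if grand_total else 0.0
        if share > max_share:
            issues.append((p.provider_id, "share"))
        duplicate_ratio = p.duplicate_bytes / p.total_bytes if p.total_bytes else 0.0
        if duplicate_ratio > max_duplicate:
            issues.append((p.provider_id, "duplicates"))
        if not p.has_public_ip:
            issues.append((p.provider_id, "no_public_ip"))
        if p.region in seen_regions:
            issues.append((p.provider_id, "shared_region"))
        seen_regions.add(p.region)
    return issues


# Made-up example data: each provider breaks exactly one rule.
sample = [
    ProviderStats("sp-a", "Hong Kong", 40, 0, True),    # holds 40% of the data
    ProviderStats("sp-b", "Singapore", 20, 10, True),   # 50% of its data is duplicate
    ProviderStats("sp-c", "Japan", 20, 0, False),       # no published public IP
    ProviderStats("sp-d", "Hong Kong", 20, 0, True),    # same region as sp-a
]
print(check_distribution(sample))
```

An empty result would mean that every provider in the list passes all four checks.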

cryptowhizzard commented 1 year ago

Hello @001can

Are you a direct employee of Shenzhen Liberty City Culture Media Co., Ltd. because you state that you are responsible for the technical architecture of the company?

Does Shenzhen Liberty City Culture Media Co., Ltd. have enough resources to pack all the data into distributable files to make storage deals with and what software tools is Shenzhen Liberty City Culture Media Co., Ltd. going to use for this? In what timeframe do you want to distribute all the data and how?

Does Shenzhen Liberty City Culture Media Co., Ltd. have its own Filecoin nodes / hardware operational, and if so, can you tell us what it consists of?

001can commented 1 year ago

Are you a direct employee of Shenzhen Liberty City Culture Media Co., Ltd. because you state that you are responsible for the technical architecture of the company?

Yes, I am the person responsible for the technical architecture of the company

Does Shenzhen Liberty City Culture Media Co., Ltd. have enough resources to pack all the data into distributable files to make storage deals with and what software tools is Shenzhen Liberty City Culture Media Co., Ltd. going to use for this? In what timeframe do you want to distribute all the data and how?

We will use this tool to package the data into CAR files: https://github.com/tech-greedy/singularity. I hope to find suitable SPs within 1-2 months and then store the data with them. Because the amount of data is huge, we will transfer it by shipping hard disks.

Does Shenzhen Liberty City Culture Media Co., Ltd. have its own Filecoin nodes / hardware operational, and if so, can you tell us what it consists of?

Sorry, currently we do not operate storage nodes ourselves

cryptowhizzard commented 1 year ago

Thanks.

As the website is looking very nice, I tried to find a company registration number, but I could not find anything. Google could locate only 2 results for this company over a period of years, which is rather strange.

https://www.google.com/search?q=%22Liberty+City+Culture+Media%22&sxsrf=ALiCzsbrXXGOD2ohMWEKmkB92XWwxLRHMg%3A1672926656590&ei=wNW2Y7vZI8KcsAfqpoqYBQ&ved=0ahUKEwi7uee4ybD8AhVCDuwKHWqTAlMQ4dUDCA8&uact=5&oq=%22Liberty+City+Culture+Media%22&gs_lcp=Cgxnd3Mtd2l6LXNlcnAQAzIFCCEQoAEyBQghEKABMgUIIRCgAToECCMQJ0oECEEYAEoECEYYAFAAWMAQYNwRaABwAXgAgAFxiAHXAZIBAzIuMZgBAKABAcABAQ&sclient=gws-wiz-serp

[Screenshot: 2023-01-05 at 14:52:19]

Can you provide us with the company registration number of this Limited company ( LTD ) Shenzhen Liberty City Culture Media Co. Ltd. ?

cryptowhizzard commented 1 year ago

Looking at the Internet Archive, I found your website www.whballoon.com 3 times. Between 2018 and 2021 it looked like this:

[Screenshots: 2023-01-05 at 14:58:48 and 2023-01-05 at 14:59:17]

herrehesse commented 1 year ago

@001can You describe the content as:

"Various types of original materials such as brand commercials, product commercials, corporate promotional videos, etc."

These types of requests should be made through FIL-E. Filecoin+ LDN is not the right place.

001can commented 1 year ago

Thanks.

As the website is looking very nice, I tried to find a company registration number, but I could not find anything. Google could locate only 2 results for this company over a period of years, which is rather strange.

https://www.google.com/search?q=%22Liberty+City+Culture+Media%22&sxsrf=ALiCzsbrXXGOD2ohMWEKmkB92XWwxLRHMg%3A1672926656590&ei=wNW2Y7vZI8KcsAfqpoqYBQ&ved=0ahUKEwi7uee4ybD8AhVCDuwKHWqTAlMQ4dUDCA8&uact=5&oq=%22Liberty+City+Culture+Media%22&gs_lcp=Cgxnd3Mtd2l6LXNlcnAQAzIFCCEQoAEyBQghEKABMgUIIRCgAToECCMQJ0oECEEYAEoECEYYAFAAWMAQYNwRaABwAXgAgAFxiAHXAZIBAzIuMZgBAKABAcABAQ&sclient=gws-wiz-serp

[Screenshot: 2023-01-05 at 14:52:19]

Can you provide us with the company registration number of this Limited company ( LTD ) Shenzhen Liberty City Culture Media Co. Ltd. ?

Hi, friend, we are a company located in China, so you would not be able to find our information through Google's English-language search. This is our registration number: 91440300MA5HFXJ55W. You can check it at https://www.tianyancha.com/

001can commented 1 year ago

Looking at the Internet Archive, I found your website www.whballoon.com 3 times. Between 2018 and 2021 it looked like this:

[Screenshots: 2023-01-05 at 14:58:48 and 2023-01-05 at 14:59:17]

Whatever website was previously hosted on this domain name has nothing to do with us. We created our website in the first half of 2022.

001can commented 1 year ago

@001can You describe the content as:

"Various types of original materials such as brand commercials, product commercials, corporate promotional videos, etc."

These types of request should be done through FIL-E. Filecoin+ LDN is not the right place.

No, my friend, we don't need to store encrypted data through FIL-E. What we store is our open data, which helps us share and communicate with more peers.

cryptowhizzard commented 1 year ago

Looking at the Internet Archive, I found your website www.whballoon.com 3 times. Between 2018 and 2021 it looked like this: [Screenshots: 2023-01-05 at 14:58:48 and 2023-01-05 at 14:59:17]

Whatever website was previously hosted on this domain name has nothing to do with us. We created our website in the first half of 2022.

This is not how it works. When you become the new owner of a domain, the WHOIS record in the domain registry is updated to reflect the new owner. This record has not been updated since 2015.

https://who.is/whois/whballoon.com

LTD companies are from the United Kingdom. LCCs are from China. I highly doubt this request is legitimate.

cryptowhizzard commented 1 year ago

@001can You describe the content as: "Various types of original materials such as brand commercials, product commercials, corporate promotional videos, etc." These types of request should be done through FIL-E. Filecoin+ LDN is not the right place.

No, my friend, we don't need to store the data encrypted by fil-e, what we store is our open data, which helps to share and communicate with more peers.

Supposing this request is legitimate, commercial data should be stored through Fil-E (enterprise), which is meant for businesses storing large datasets.

001can commented 1 year ago

Sorry, this is a translation error. In order to facilitate understanding, we translated the company name into English through Google. If you open the official website, you will see that the correct company name is "深圳市自由都市文化传媒有限公司"

001can commented 1 year ago

@001can You describe the content as: "Various types of original materials such as brand commercials, product commercials, corporate promotional videos, etc." These types of request should be done through FIL-E. Filecoin+ LDN is not the right place.

No, my friend, we don't need to store the data encrypted by fil-e, what we store is our open data, which helps to share and communicate with more peers.

Supposing this request is legitimate, commercial data should be stored through Fil-E (enterprise), which is meant for businesses storing large datasets.

The edited video will be delivered to customers as commercial data, and we currently hope to store the original data before editing.

lvschouwen commented 1 year ago

@001can None of the 3 links provided as sample data can be opened without an account, so the data cannot be verified. Can you provide us with links that can be opened publicly?

hyunmoon commented 1 year ago

Nope. This kind of dataset should be stored as unverified deals.

herrehesse commented 1 year ago

Agreed with @hyunmoon here.

001can commented 1 year ago

What are unverified deals?

herrehesse commented 1 year ago

@001can Regular deals on the Filecoin network, most storage providers take them for about 1/100th of the price of AWS right now.

001can commented 1 year ago

Is there any documentation that can help me understand?

herrehesse commented 1 year ago

Of course, friend, here you go: https://filecoin.io/blog/posts/how-storage-and-retrieval-deals-work-on-filecoin/ Come back to me when you have follow-up questions.

hyunmoon commented 1 year ago

https://spec.filecoin.io/#section-systems.filecoin_blockchain.storage_power_consensus.on-power

001can commented 1 year ago

This doesn't seem to clear up my doubts. I mean, what's the difference between this and Filecoin+?

hyunmoon commented 1 year ago

https://filplus.storage/faq

herrehesse commented 1 year ago

Filecoin+ is meant for datasets that are public, open, and mission aligned with Filecoin's "Storing humanity's most important information".

Filecoin itself can be used to store all types of data, but you won't get datacap for these files.

001can commented 1 year ago

@001can Regular deals on the Filecoin network, most storage providers take them for about 1/100th of the price of AWS right now.

We don't want to sell our data, it's wrong

001can commented 1 year ago

Filecoin+ is meant for datasets that are public, open, and mission aligned with Filecoin's "Storing humanities most important information".

Filecoin itself can be used to store all types of data, but you won't get datacap for these files.

This seems to bring us back to the previous question. If I don't go through Filecoin+, how can I ensure that SPs will store my data correctly without the supervision of the community and the bots?

hyunmoon commented 1 year ago

Simply put, what we're saying is that you should pay the storage providers directly to get your data stored, just as you would with any traditional cloud storage provider. Because verified deals provide 10x quality adjusted power, storage providers accept them for free (or at a premium). The 10x quality adjusted power is given as a reward for datasets that will increase the value of the Filecoin network. Your dataset is not considered one of those, which is why it should be stored as unverified (or regular) deals.
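The effect of the 10x multiplier described above can be illustrated with simple arithmetic. This is a minimal sketch that assumes a flat multiplier of 10 for verified deals and 1 for regular deals; it ignores deal duration and sector details that the real protocol accounts for, and the function name is hypothetical.

```python
TIB = 1024 ** 4  # bytes in one tebibyte


def quality_adjusted_power(raw_bytes, verified, verified_multiplier=10):
    """Simplified model: verified data counts 10x, regular data counts 1x."""
    return raw_bytes * (verified_multiplier if verified else 1)


# The same 1 TiB of stored data, as a regular deal and as a verified deal.
regular = quality_adjusted_power(1 * TIB, verified=False)
verified = quality_adjusted_power(1 * TIB, verified=True)
print(regular // TIB, verified // TIB)
```

In this simplified model a storage provider earns ten times the power for the same raw capacity when the deal is verified, which is the incentive for accepting such deals at little or no charge.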

001can commented 1 year ago

Simply put, what we're saying is that you should pay the storage providers directly to get your data stored, just as you would with any traditional cloud storage provider. Because verified deals provide 10x quality adjusted power, storage providers accept them for free (or at a premium). The 10x quality adjusted power is given as a reward for datasets that will increase the value of the Filecoin network. Your dataset is not considered one of those, which is why it should be stored as unverified (or regular) deals.

I wonder who judged that our data is not one of them? There is an "Information, Media & Telecommunications" option in the LDN application, and I believe I meet the requirements. Then if you understand it this way, it seems that 99% of LDNs do not meet the requirements?

herrehesse commented 1 year ago

@001can This is indeed our opinion too.

hyunmoon commented 1 year ago

Well, lucky for them and unlucky for you. It is up to the elected notaries to judge whether your dataset qualifies for FIL+. Just because some people got away with breaking the rule doesn't mean everyone should be exempt from it. https://github.com/filecoin-project/notary-governance/tree/main/notaries

herrehesse commented 1 year ago

@001can It is a three-part dilemma.

On the one hand, a lot of data needs to be put on Filecoin to show the world how useful we are. The more the better.

On the other hand, there is an enormous amount of data that has absolutely no value and can easily be uploaded or faked. (I have seen many spoofed / empty sectors, referring to 15P raw from Slingshot.)

And lastly, many users want to abuse the Filecoin+ program because of the data cap and the rewards involved.

001can commented 1 year ago

I read:

https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0003.md
https://github.com/filecoin-project/notary-governance/tree/main/notaries
https://github.com/filecoin-project/filecoin-plus-large-datasets

The vision for Filecoin+ is described as "the dataset should be public, open, and mission aligned with Filecoin and Filecoin Plus. This also means that the data should be accessible to anyone in the network, without requiring any special permissions or access requirement; stored data should be readily retrievable on the network and this should be regularly verified (through the use of manual or automated verification that includes retrieving data from various miners over the course of the DataCap allocation timeframe)";

and also "Notary Sets Use Cases: Support (1) or more of the following, which can be linked to in a separate doc:

  • Professional Services (Hosting Reseller, Long-term Backups, Data Warehousing)
  • Developer Tools (Package Managers, Automatic Notaries, Web2 to Web3 integrations)
  • Decentralized Applications
  • User Content (Personal Storage)
  • Public or Open Data (Scientific datasets, research data, government or historical
  • Media & Entertainment (Videos, photos, NFTs)";

and "The ultimate goal of Filecoin Plus is to accelerate the proliferation of products and compliant use cases on Filecoin with both geographic and use case diversity. Filecoin Plus provides the framework and incentive for business development to happen on Filecoin. After all, adoption is more than just technology. Client needs and legitimate use cases will shape the goods and services produced on Filecoin.”

What's wrong with our data?

lvschouwen commented 1 year ago

@001can You seem to be more interested in fighting @herrehesse than in fixing your application and providing the community with sample data that can be checked. I therefore advise all notaries not to sign this application as is until this is fixed and this childish mud-throwing stops.

cryptowhizzard commented 1 year ago

I read https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0003.md https://github.com/filecoin-project/notary-governance/tree/main/notaries https://github.com/filecoin-project/filecoin-plus-large-datasets, the vision for filecoin+ is described as "the dataset should be public, open, and mission aligned with Filecoin and Filecoin Plus . This also means that the data should be accessible to anyone in the network, without requiring any special permissions or access requirement stored data should be readily retrievable on the network and this should be regularly verified (though the use of manual or automated verification that includes retrieving data from various miners over the course of the DataCap allocation timeframe";

and such "Notary Sets Use Cases Support (1) or more the following, which can be linked to in a separate doc: Professional Services (Hosting Reseller, Long-term Backups, Data Warehousing) Developer Tools (Package Managers, Automatic Notaries, Web2 to Web3 integrations) Decentralized Applications User Content (Personal Storage) Public or Open Data (Scientific datasets, research data, government or historical Media & Entertainment (Videos, photos, NFTs)"; and "The ultimate goal of Filecoin Plus is to accelerate the proliferation of products and compliant use cases on Filecoin with both geographic and use case diversity. Filecoin Plus provides the framework and incentive for business development to happen on Filecoin. After all, adoption is more than just technology. Client needs and legitimate use cases will shape the goods and services produced on Filecoin.”

What's wrong with our data?

You are confusing notary use cases (a use case the notary specialises in, as given in their specific application to become a notary) with the Fil+ program. Notaries also help with Fil-E; this is where these fields of expertise come in handy for a customer selecting a notary for their business case.

001can commented 1 year ago

@001can You seem to be more interested in fighting @herrehesse than in fixing your application and providing the community with sample data that can be checked. I therefore advise all notaries not to sign this application as is until this is fixed and this childish mud-throwing stops.

I'm uploading data to Google. I'm frustrated with your irresponsible accusations. Why can't I question someone else's points? Even in a situation that I feel is very irresponsible, I have replied to all questions.

001can commented 1 year ago

Notary Sets Use Cases

Let's exclude "Notary Sets Use Cases"; the other descriptions still seem to match.

001can commented 1 year ago

@001can You seem to be more interested in fighting @herrehesse than in fixing your application and providing the community with sample data that can be checked. I therefore advise all notaries not to sign this application as is until this is fixed and this childish mud-throwing stops.

We have uploaded the samples to Google Drive:
https://drive.google.com/drive/folders/1yuyTeAYMxkFT8NUWrlC8OGIvcq50UcXg?usp=share_link
https://drive.google.com/drive/folders/1CTMiudYzkMmOPfclboJX9neBenN52Gx9?usp=share_link
https://drive.google.com/drive/folders/1ondfIIPsWpbKH_2WJL2paNUEahI2CEt5?usp=share_link

dkkapur commented 1 year ago

Pasted over from Slack:

The LDN process is to support open/public datasets. The E-Fil+ process is very similar to the LDN process, but is designed to better support cases of private/encrypted data and cases where the data owner is not directly involved. Your responses in GitHub here do state that the data will not be encrypted and will be open/accessible to all. The risk here, then, is that the SPs you end up working with need to actually support that. If you can find notaries that support your application on this basis, I think LDN is a fair path for you to go down.

Notaries above @hyunmoon and @cryptowhizzard have found reasons to highlight that your application is risky and do not want to support it. Addressing their concerns, as you have been, is a good way to go. You can also request for additional notaries to get more eyes on this in Slack channel #fil-plus-application-review, or come to the next governance call.

001can commented 1 year ago

Pasted over from Slack:

The LDN process is to support open/public datasets. The E-Fil+ process is very similar to the LDN process, but is designed to better support cases of private/encrypted data and cases where the data owner is not directly involved. Your responses in GitHub here do state that the data will not be encrypted and will be open/accessible to all. The risk here, then, is that the SPs you end up working with need to actually support that. If you can find notaries that support your application on this basis, I think LDN is a fair path for you to go down.

Notaries above @hyunmoon and @cryptowhizzard have found reasons to highlight that your application is risky and do not want to support it. Addressing their concerns, as you have been, is a good way to go. You can also request for additional notaries to get more eyes on this in Slack channel #fil-plus-application-review, or come to the next governance call.

Hi @dkkapur, I have answered all their doubts about my authenticity in detail and provided verification, and so far they have not raised any objections on this point. Do you mean that I can find other notaries to review my application again?

simonkim0515 commented 1 year ago

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

500TiB

Client address

f1mrqexd4z6oxrvzeuthlf4n2jf6m64abonhhchya

large-datacap-requests[bot] commented 1 year ago

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f1mrqexd4z6oxrvzeuthlf4n2jf6m64abonhhchya

DataCap allocation requested

250TiB

Id

503b4627-102e-4591-a204-4c44a0ed4c5a
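The 250TiB figure is consistent with a first-tranche rule of "the lesser of 5% of the total request or 50% of the weekly rate". The thread itself does not state this rule, so the sketch below treats it as an assumption; the function name is hypothetical, and integer arithmetic in TiB is used to avoid rounding.

```python
TIB_PER_PIB = 1024


def first_allocation_tib(total_tib, weekly_tib):
    """Assumed rule: lesser of 5% of the total request or 50% of the weekly rate."""
    five_percent_of_total = total_tib * 5 // 100
    half_of_weekly = weekly_tib * 50 // 100
    return min(five_percent_of_total, half_of_weekly)


# Figures from this application: 5PiB total, 500TiB weekly.
first_tranche = first_allocation_tib(5 * TIB_PER_PIB, 500)
print(first_tranche)
```

With these inputs, 5% of the total is 256 TiB and half of the weekly rate is 250 TiB, so the smaller value matches the allocation requested by the bot.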

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report[^1]

There is no previous allocation for this issue.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

zcfil commented 1 year ago

After reading history, I have some questions:

  1. Can you show more data?
  2. Is your data publicly available?

001can commented 1 year ago

After reading history, I have some questions:

  1. Can you show more data?
  2. Is your data publicly available?

Thank you for your reply.

  1. Of course you can, you can check this link
  2. Our data is publicly available.

zcfil commented 1 year ago

@001can I clicked on the link and found that there was only a small amount of video data. Can you provide more data samples?