filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

Please Delete #17

Closed yuntaozhu closed 3 years ago

yuntaozhu commented 3 years ago

Large Dataset Notary Application

To apply for a DataCap allocation for your dataset, please fill out the following information.

Core Information

Please respond to the questions below in paragraph form, replacing the text saying "Please answer here". Include as much detail as you can in your answer!

Project details

Share a brief history of your project and organization.

Our Organization:
Oxyak is a technology-oriented service company, founded in 2016 and currently located in Shanghai, China.

Our main business is to provide professional software development for enterprise clients to help them complete their digital transformation. 

Our services fall into three main categories.

1. Enterprise digital capacity building
2. Construction of asset-light software services
3. Building intelligent enterprise software services

We have a solid understanding of data and of both the dilemmas and the opportunities in business operations. Through continuous practice, we find IT solutions that help businesses grow sustainably. With business at the core and customer experience at the root, we give life to enterprise and intelligence to data.

To date, we have provided services to more than 500 local and multinational companies.

The following are representative of the clients we serve, and you can find them on our official website.


In the digital era, digital transformation of enterprises is an inevitable result as well as an important weapon for enterprises to establish differentiated competitiveness.

It is not an option to hesitate over, but a requirement for taking part in an emerging digital world.

Our Project:
The Filecoin DataCap application was submitted by our blockchain technology department and our big data business department.

In brief, we want to use the Filecoin network as backup storage for files that we have already analyzed, thereby reducing costs on the enterprise side.

We initiated this project because some of our clients are e-commerce companies that broadcast live on major domestic media platforms such as Taobao, Douyin, One Smile, XiGua, Xiaohongshu, and other well-known platforms.

In addition to providing them with sales management software design and development services, we will also analyze the live video and sales in a multi-dimensional way by means of big data, so as to provide guidance for the optimization of the client's e-commerce business.

Once the analysis of the video files is complete, we typically retain them for 30 days for our clients and then delete them from our cloud servers. Many times we have had clients who wanted to retrieve files that were months old, but unfortunately we had to delete them due to our cost considerations.

To carry out our corporate mission of giving life to enterprise and intelligence to data, our blockchain technology department is collaborating with our big data business department to find solutions.

We have studied other blockchain storage networks (SWARM, CRUST, Arweave) and ultimately concluded that Filecoin is a reasonable choice in terms of long-term stability and implementability.

What is the primary source of funding for this project?

The main funding for this project comes from the company's R&D funds. Every year we invest heavily in technology development, which is necessary to maintain a continuous technological lead.

What other projects/ecosystem stakeholders is this project associated with?


Use-case details

Describe the data being stored onto Filecoin

As described in the project details above, the data being stored is live e-commerce video from our customers. After our system has analyzed it, we make a backup on the Filecoin network.

Where was the data in this dataset sourced from?

This looks like a bit of a duplicate of the previous question.

The data being stored is live e-commerce video from our customers. After our system has analyzed it, we make a backup on the Filecoin network.

Can you share a sample of what is in the dataset? A link to a file, an image, a table, etc., are good examples of this.

We have packaged and uploaded a small portion of the video to Github, which can be viewed at the following link


Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

On this point, we have confirmed with our client that the data is primarily advertising content for the sale of goods and can be made public, but the ultimate ownership of the data remains unchanged.

What is the expected retrieval frequency for this data?

This is difficult to answer, and I can't give you an exact number at the moment. The data is currently stored on our servers; customers may retrieve it 2 to 3 times within 30 days, and they may still request it from us after a few months.

But once this data is made public in the Filecoin network, there may be other users who will be able to access it.

For how long do you plan to keep this dataset stored on Filecoin? Will this be a permanent archival or a one-time storage deal?

This will depend on the agreement we have with each client. Both scenarios can happen.

DataCap allocation plan

In which geographies do you plan on making storage deals?

The first choice is still Greater China, both in terms of trust and data transfer costs.

Of course, given the unique situation in China, we may also consider regions other than the mainland, but the feasibility and cost of transferring data across countries through the Filecoin network is not quite clear at this point.

In the short term, we will still choose Greater China.

What is your expected data onboarding rate? How many deals can you make in a day, in a week? How much DataCap do you plan on using per day, per week?

When we acquire DataCap, we will establish a partnership with miners to develop a data storage system. This may take a little time.

Once the project is on track, we may transfer about 4.5T-5T of data from our cloud servers to the miners' servers every day, which is also reasonable in terms of server cost.

So it is:
4.5-5T per day
About 25T per week (working days)
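
As a rough check of these figures, the weekly range can be worked out from the daily range. This is only a sketch: it assumes "working days" means a five-day week, which the application does not state explicitly, and the name `weekly_range` is purely illustrative.

```python
# Back-of-the-envelope check of the onboarding figures quoted above.
# Assumption (not stated in the application): a working week has 5 days.
WORKING_DAYS_PER_WEEK = 5


def weekly_range(daily_low, daily_high, days=WORKING_DAYS_PER_WEEK):
    """Return the (low, high) weekly volume for a given daily range."""
    return daily_low * days, daily_high * days


low, high = weekly_range(4.5, 5.0)
print(f"{low}T to {high}T per week")
```

Under that assumption, 4.5T to 5T per day works out to 22.5T to 25T per week, so the "about 25T" figure corresponds to the upper end of the daily range.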

How will you be distributing your data to miners? Is there an offline data transfer process?

We will tell the miners truthfully what the data contains; there is no need to hide this.

Early on we might try to use offline transactions, but in the long term we would like these steps to be done online through the system.

How do you plan on choosing the miners with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

This is really a critical point. We will examine each miner's overall situation: the area where it is located, the grade of its server room, the performance of its hardware, its technical strength, and other dimensions.

How will you be distributing data and DataCap across miners storing data?

When we get DataCap, early on we would allocate DataCap and data at a ratio of 1:1. When the business matures, we would prefer to allocate at a 1:2 ratio; after all, DataCap is limited.

When this project is able to generate commercial value, we will probably become less dependent on DataCap: miners will get direct rewards from customers, and DataCap will be more like a side gift.

large-datacap-requests[bot] commented 3 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

dannyob commented 3 years ago

Filecoin Foundation approves this.

yuntaozhu commented 3 years ago

That's great, what should we do next?

yuntaozhu commented 3 years ago

Sorry, we have to quit.

Our attorneys believe there are legal risks.
