filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] LDN Blue Storm Information Technology #323

Closed huangqian2021 closed 1 year ago

huangqian2021 commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Hunan Bluestorm Information Technology Co., Ltd. is a supplier focused on the "Internet + public security" industry, providing comprehensive service solutions. Building on big data and AI technology, the company has created intelligent, innovative police application platforms and mobile applications.
The company is the joint development and practice base for intelligent police mobile applications of Hunan Police College. It has maintained close cooperation with public institutions and government agencies in nearly 50% of the cities in Hunan Province, covering 7 cities with a land area of 100,000 km² and a population of 30 million, to develop a smart public security platform. The platform is a public security command system based on a comprehensive police platform, an information research and judgment platform, a geographic information platform and a big data center, supplemented by a video surveillance system, a GPS satellite positioning system and an image transmission system, and supported by mobile alarm positioning, police service management and modern communication technology.
The company researches and develops multiple Internet products, such as smart community and smart joint defense, based on "mobile Internet + industry", and is building a security service product system with "Internet +" at its core to provide information services that safeguard people's safety.
We now have a new Smart Community Program, a tool to assist government agencies in social governance, which is being piloted in one city. With 6,000 WeChat working groups, each with up to 500 members, the data grows by about 50 TiB every day. Community administrators listen to the voices, suggestions, feedback and even complaints of the people in the groups, and report the information to the relevant higher-level units for quick response and processing, so as to improve the work efficiency of government departments and maintain social security and stability.
With the rapid growth of data, the efficiency of manual information processing will drop sharply, so the information has to be analyzed and identified by technical means. Our current technology is limited: it can only analyze and locate information by keyword. We need to save as much raw data as possible for subsequent learning, and to improve our ability to analyze and process pictures, video and audio.
This data will support modeling in the future. Once the project is stable, it will be rolled out nationwide, and the data volume will be immeasurable.

What is the primary source of funding for this project?

Company funds.

What other projects/ecosystem stakeholders is this project associated with?

None.

Use-case details

Describe the data being stored onto Filecoin

The data is mainly videos, pictures and working records.
There are currently 2 PB of data stored in Ali Cloud, and the rest is stored on our own platform. The data will grow by nearly 250 TiB every day. At present, storage costs thousands per month, especially for downloads. The storage period is also short: the data has to be replaced every two weeks. We would like to keep as much of the data as possible for as long as possible.

Where was the data in this dataset sourced from?

The data comes from video surveillance and WeChat working groups. The data from the monitoring system is mainly geographic information. The WeChat data is captured by automatic capture software, regardless of file type, and then sent to our own platform.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

[103.119.3.149:80/Data_sample1.mp4](url)
[103.119.3.149:80/Data_sample2.jpg](url)
[103.119.3.149:80/Data_sample3.jpg](url)
[103.119.3.149:80/Data_sample4.mp4](url)
[103.119.3.149:80/Data_sample5.jpg](url)

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, the dataset is public.

What is the expected retrieval frequency for this data?

The data can be accessed by anyone. Data related to policing and privacy will not be stored on Filecoin at this stage.

For how long do you plan to keep this dataset stored on Filecoin?

At least 3 years.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

Asia and Greater China.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

We support both online and offline transfer. Generally, we choose offline transfer.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We will choose based on each storage provider's capacity, reputation and performance. We need long-term cooperation with storage providers that have strong overall capabilities. Storage providers that are located in China and have extensive experience with verified data will be given priority.

How will you be distributing deals across storage providers?

We will follow the rules for large datasets: no single SP will receive more than 20% of all deals.
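As an illustration of the 20% cap described above, here is a minimal sketch of how an allocation could be checked. The function names, provider IDs and sizes are hypothetical and are not part of the application or of any Filecoin tooling. The sketch also assumes the share is measured by data size, which the application does not specify.

```python
MAX_SHARE = 0.20  # cap stated in the application: no SP above 20% of all deals


def sp_shares(deal_sizes_by_sp):
    """Return each provider's fraction of the total allocated size."""
    total = sum(deal_sizes_by_sp.values())
    if total == 0:
        return {sp: 0.0 for sp in deal_sizes_by_sp}
    return {sp: size / total for sp, size in deal_sizes_by_sp.items()}


def over_cap(deal_sizes_by_sp, max_share=MAX_SHARE):
    """List the providers whose share exceeds the cap, sorted by ID."""
    shares = sp_shares(deal_sizes_by_sp)
    return sorted(sp for sp, share in shares.items() if share > max_share)


# Hypothetical allocation in TiB across six made-up provider IDs.
example = {
    "f0001": 100,
    "f0002": 100,
    "f0003": 100,
    "f0004": 100,
    "f0005": 100,
    "f0006": 150,
}
print(over_cap(example))  # ['f0006'] since 150 / 650 is about 23%
```

A 20% cap means the data has to be spread over at least five storage providers.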

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

We are ready to start making deals.

aggregation-and-compliance-bot[bot] commented 11 months ago

Client f01921336 does not follow the datacap usage rules. More info here. This application has been failing the requirements for 7 days. Please take appropriate action to fix the following DataCap usage problems.

| Criteria | Threshold | Reason |
| --- | --- | --- |
| CID checker score | > 25% | The client has a CID checker score of 0%. This should be greater than 25%. To find out more about CID checker score please look at this issue: https://github.com/filecoin-project/notary-governance/issues/986 |
| Shared data percent | < 20% | 25.01% of the client's data is shared with other clients. This should be less than 20%. |
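
The two criteria in the bot report can be expressed as a simple threshold check. This is only a sketch of the rule as stated in the report, not the bot's actual implementation. The function name and message wording are hypothetical.

```python
CID_CHECKER_MIN = 25.0   # score must be greater than 25%
SHARED_DATA_MAX = 20.0   # shared data must be less than 20%


def check_datacap_usage(cid_checker_score, shared_data_percent):
    """Return a list of failed criteria; an empty list means both pass."""
    failures = []
    if not cid_checker_score > CID_CHECKER_MIN:
        failures.append(
            f"CID checker score is {cid_checker_score}%, should be > {CID_CHECKER_MIN}%"
        )
    if not shared_data_percent < SHARED_DATA_MAX:
        failures.append(
            f"Shared data is {shared_data_percent}%, should be < {SHARED_DATA_MAX}%"
        )
    return failures


# Values reported for client f01921336 in the comment above.
for problem in check_datacap_usage(0.0, 25.01):
    print(problem)
```

With the reported values (0% and 25.01%), both criteria fail.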