livepeer / grants

⚠️ DEPRECATED ⚠️ Please visit the new homepage at https://grants.livepeer.org

Near Duplicate Video Detection / Video Based Fraud Detector #92

Closed ndujar closed 11 months ago

ndujar commented 1 year ago

Give a 3 sentence description for your proposal.

Near-Duplicate Video Detection (NDVD) and Near-Duplicate Video Retrieval (NDVR) have become active research topics in recent years due to the exponential growth of online video creation. They are considered essential in a variety of applications, including copy detection, copyright protection, video retrieval, indexing and management, and video recommendation and search.

The aim of this proposal is to research the specific application of copyright protection and to explore the potential integration within the Livepeer ecosystem.

Describe the problem you are solving.

Recent developments in the Livepeer Protocol include the possibility of accessing an MPEG-7 perceptual hashed version of the transcoded videos.

This opens the door to several applications that are worth exploring.

These include [SHEN et al. 2020]:

Describe the solution you are proposing and how it will have a positive impact on the Livepeer developer ecosystem.

Although there is a large body of research on NDVR / NDVD, this research would initially focus only on applying the recent developments included in Livepeer's protocol (see MPEG-7 perceptual hashing).

Essentially, this makes it possible to access a perceptually hashed version of any video broadcast through the Livepeer Network.

More specifically, we would be dealing with the particular application of Near Duplicate Video Detection.
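To make the detection step concrete, here is a minimal sketch of how two sequences of per-frame binary fingerprints could be compared. It is a generic Hamming-distance comparison, not the matching procedure defined by the MPEG-7 Video Signature standard; the fingerprint layout, the function names, and the 10% bit-error threshold are all assumptions chosen for illustration.

```python
from typing import Sequence


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def is_near_duplicate(
    query: Sequence[bytes],
    reference: Sequence[bytes],
    max_bit_error_rate: float = 0.1,  # hypothetical threshold, needs tuning
) -> bool:
    """Compare two videos frame by frame over their common prefix.

    Returns True when the fraction of differing bits stays at or
    below max_bit_error_rate.
    """
    differing_bits = 0
    total_bits = 0
    for q, r in zip(query, reference):  # zip stops at the shorter video
        differing_bits += hamming_distance(q, r)
        total_bits += 8 * len(q)
    if total_bits == 0:
        return False  # nothing to compare
    return differing_bits / total_bits <= max_bit_error_rate


# Toy 16-bit fingerprints: a re-encode flips one bit per frame,
# while an unrelated video differs in every bit.
original = [bytes([0b10101010, 0b11110000])] * 4
reencoded = [bytes([0b10101011, 0b11110000])] * 4
unrelated = [bytes([0b01010101, 0b00001111])] * 4

print(is_near_duplicate(original, reencoded))  # True  (1/16 bits differ)
print(is_near_duplicate(original, unrelated))  # False (16/16 bits differ)
```

A real detector would also need to handle temporal offsets and partial copies, which this frame-aligned comparison does not attempt.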

According to the literature, the extraction speed of the Video Signature descriptor from uncompressed video is ~900 frames/sec on a standard PC (single-core implementation on an Intel Xeon X5460 running at 3.16 GHz with 8 GB of RAM). With this performance, researchers also claim that duplicate detection by means of the MPEG-7 Video Signature achieved an average success rate of 95.49% with a false alarm rate of no more than 5 ppm, i.e. with a precision ≈ 1 (S. Paschalakis et al., 2012).
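As a rough sanity check on these figures, the sketch below converts the reported extraction speed into processing time and shows how precision follows from the success rate and the false alarm rate. The 30 fps frame rate and the pair counts are hypothetical inputs, not values from the paper; precision depends on how many true duplicates the evaluated pairs contain, which is an assumption here.

```python
def extraction_seconds(duration_s: float, fps: float,
                       extraction_rate: float = 900.0) -> float:
    """Time to extract signatures at `extraction_rate` frames per second."""
    return duration_s * fps / extraction_rate


def precision(detection_rate: float, false_alarm_rate: float,
              n_duplicate_pairs: int, n_distinct_pairs: int) -> float:
    """Precision = TP / (TP + FP) for a given mix of evaluated pairs."""
    true_positives = detection_rate * n_duplicate_pairs
    false_positives = false_alarm_rate * n_distinct_pairs
    return true_positives / (true_positives + false_positives)


# One hour of 30 fps video (hypothetical input) at ~900 frames/sec.
print(extraction_seconds(3600, 30))  # 120.0 seconds

# Reported rates: 95.49% success, 5 ppm false alarms.
balanced = precision(0.9549, 5e-6, 1_000, 1_000)
skewed = precision(0.9549, 5e-6, 1_000, 1_000_000)
print(f"{balanced:.6f}")  # very close to 1
print(f"{skewed:.4f}")    # about 0.9948
```

The second case illustrates that precision drops as non-duplicate pairs come to dominate, even with a false alarm rate of 5 ppm.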

When it comes to integration into the Livepeer ecosystem, it is important to note the following considerations:

The approach that we will be evaluating for the Livepeer use case necessarily involves using an open dataset. This makes the fingerprinting algorithm mentioned above public, which implies that once the dataset and algorithm are known it is much easier to construct media variants that circumvent scanning. For this reason, historically, perceptual hash algorithm developers have been more secretive about how things actually work under the hood (see PhotoDNA, Cloudflare CSAM scanning, Apple NeuralHash).

Moreover, we need to account for assets taken from other platforms whose perceptual hashes may not be accessible, which would render them "invisible" to the detector.

Describe why you are the right team with the capability to build this.

I can apply my past expertise in the field of Machine Learning and Computer Vision applied to video streaming, once this problem is well understood. In addition, I have some previous experience with Livepeer as one of the contributors to the [transcoding verifier](https://github.com/livepeer/verification-classifier).

Describe the scope of the project including a 3 month timeline and milestones.

Milestone 1: MPEG-7 Video Signature descriptor evaluation (expected completion: Nov 22)

Milestone 2: MPEG-7 Video Signature vulnerability analysis (expected completion: Mid Nov 22)

Milestone 3: MPEG-7 Video Signature descriptor alternatives (expected completion: Dec 22)

Milestone 4: Livepeer's Video Copy detector integration (expected completion: Jan 23)

NOTE: Milestones are approximate; each is to be reviewed and adjusted, with updated time, costs and scope, at the end of the previous one. Computational costs for creating the test / train dataset need to be considered separately.

Please estimate hours spent on project based on the above and how much funding you will need.

Assuming researchers work part time (4 hours per day), the research and the creation of outreach content should be completed within 9 weeks (180h in total). Implementation and integration are expected to take 3 weeks (60h).

Funding needed: USD 13,200 + cloud costs
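For reference, the hour totals above can be reproduced as follows. The 5-day working week is an assumption (it is the value consistent with 180h over 9 weeks), and the hourly rate is derived from the totals rather than stated in the proposal.

```python
HOURS_PER_DAY = 4            # part time, as stated in the proposal
WORKING_DAYS_PER_WEEK = 5    # assumption: consistent with 180h in 9 weeks
FUNDING_USD = 13_200         # excludes cloud costs


def hours(weeks: int) -> int:
    """Total working hours for a given number of weeks."""
    return weeks * WORKING_DAYS_PER_WEEK * HOURS_PER_DAY


research_hours = hours(9)        # research and outreach content
implementation_hours = hours(3)  # implementation and integration
total_hours = research_hours + implementation_hours
implied_rate = FUNDING_USD / total_hours  # derived, not stated

print(research_hours, implementation_hours, total_hours)  # 180 60 240
print(implied_rate)  # 55.0 USD per hour
```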

nelsorya commented 1 year ago

Hey @ndujar, thanks for putting this together. This is a very well-put-together proposal and we can see the value this area of work could bring to the Livepeer Network. After going over this proposal we would love to fund this.

Could you provide a rough estimate for what the cloud costs would look like?

ndujar commented 1 year ago

Hi @nelsorya,

This is great news. I am very happy to know about your decision :100:

> Could you provide a rough estimate for what the cloud costs would look like?

In the most expensive case we can expect $200-$300/mo, but we might be closer to $40-$70/mo, depending on how much we want to speed up getting quantitative results.

Nevertheless, in principle, most of the work should be possible to develop locally, so the cloud cost goes to $0 :)

nelsorya commented 1 year ago

Ok great @ndujar, thanks for this. That sounds reasonable. Could you reach out to me on Discord so we can create a group chat to coordinate next steps? I'm Nelson#6080 on Discord.

github-actions[bot] commented 11 months ago

This issue has been marked as stale with no activity. It will close in 7 days.

github-actions[bot] commented 11 months ago

This issue has been automatically closed.