
[Video Disruptors Grant]: AI-extended Livepeer SDK (player and Broadcast component): live closed captions, subtitles, AI moderation, and AI-generated virtual backgrounds #193

Open scapula07 opened 8 months ago

scapula07 commented 8 months ago

Please describe your project. Start with the need or problem you are trying to solve with this project. Describe why your solution is going to adequately solve this problem.

Challenge:

In today's dynamic media landscape, live streaming platforms are in fierce competition for viewer engagement. Real-time global access has revolutionized content creation, yet viewer and creator expectations keep evolving. The absence of advanced AI features, including live closed captions, subtitles, AI moderation, and AI-generated virtual backgrounds, poses challenges that impact viewer experience, accessibility, content quality, and the protection of children from inappropriate content. These challenges hinder Livepeer's competitiveness and inclusivity, affecting diverse audiences such as the hearing impaired, multilingual viewers, and content creators seeking unique effects.

Solution:

To address this challenge, we propose the development and seamless integration of real-time AI-generated closed captions, subtitles, and content moderation into the Livepeer platform, ensuring accessibility for diverse audiences and a safer streaming environment. Advanced speech recognition and AI-driven language processing enhance the user experience, particularly benefiting hearing-impaired and multilingual viewers. Automatic content moderation, powered by machine learning, swiftly identifies and removes inappropriate material, upholding a respectful streaming environment.

Additionally, we'll add AI-generated virtual backgrounds, built with computer vision and machine learning, to the stream-from-browser Broadcast component. This feature empowers content creators to immerse viewers in captivating virtual environments, fostering greater engagement and interaction.

Link to public GitHub repo (if applicable)

https://github.com/scapula07/Live-caption-Livepeer

Link to demo website (if applicable)

No response

Please describe in more detail why this proposal is valuable for the Livepeer ecosystem

Introduction:

Livepeer's entrance into the streaming realm has marked a monumental milestone for the web3 ecosystem. Its strategic design, deeply rooted in optimizing efficiency, fortifying reliability, and reducing operational costs for content creators, viewers, and developers, has significantly elevated its standing. As excitement continues to swell, drawing a diverse array of streamers and projects eager to embrace its capabilities, some paramount needs take center stage: the imperative to sustain and elevate viewer engagement, protect viewers from harmful content, and improve streaming tooling. Overlooking these crucial aspects has the potential to trigger an exodus of streamers, viewers, and developers back to the web2 streaming landscape.

Challenges

The first challenge revolves around implementing AI-generated live captions and subtitles. This is all about making content more accessible and engaging. By providing real-time text support, Livepeer can be more inclusive, reaching a broader audience. It also helps with search engine visibility. However, not addressing this challenge could mean compliance issues, reduced viewer retention, and higher costs for content creators. To stay competitive and inclusive, Livepeer must efficiently implement AI-generated live captions and subtitles.

The second challenge centers on the implementation of robust AI moderation features. In simpler terms, this means ensuring that the platform remains safe, compliant with regulations, and user-friendly. Imagine the task of keeping inappropriate content at bay, adhering to rules and guidelines, and making sure users have a positive experience. It's about scaling up to handle more users efficiently, maintaining content quality, protecting the platform's reputation, and curbing online harassment. Addressing these challenges is critical to providing a secure and competitive environment for both content creators and viewers.

The third challenge pertains to AI-generated virtual backgrounds. Think of this as providing a toolkit for content creators who stream directly from their browsers. These creators need tools to maintain professionalism, privacy, and audience engagement, all while expressing their unique branding and creativity. Failing to meet this challenge might lead to increased costs, stifled innovation, reduced competitiveness, and could even affect the quality of the content and its appeal to viewers. To remain an appealing platform for content creators, Livepeer must tackle this challenge.

The Solution:

To address this multifaceted challenge, we propose the implementation of integrated AI models. These models will serve as an extension of the existing Livepeer player and SDK (Broadcast component), empowering it to deliver real-time closed captions, subtitles, and dynamic language translations; a running bot for content moderation with embedded filters; and generative virtual backgrounds. This transformative enhancement will not only amplify accessibility, creativity, and content safety but also augment the platform's overall appeal. It emphasizes Livepeer's dedication to inclusivity and content moderation, while recognizing the diversity of its user base and the crucial importance of offering an engaging, accessible, safe, and creative streaming experience (through embedded virtual backgrounds). This strategic endeavor solidifies Livepeer's position at the vanguard of web3 streaming and underlines its unwavering dedication to pioneering advancements in the streaming sphere.

Our student community

Expanding into a micro community, specifically a student club for Africans and individuals globally interested in contributing to the AI features within the Livepeer ecosystem, represents a strategic initiative with broad-reaching implications. This community not only serves as a platform for learning and collaboration but also holds the potential to significantly enhance Livepeer's exposure, particularly within Africa.

Please describe in details what your final deliverable for this project will be.

  1. Extended Livepeer SDK (player and Broadcast component) with real-time closed captions, subtitles, language translation, an AI moderation bot, and AI-generated virtual backgrounds (stream from browser):

    Technical Implementation: The extended Livepeer SDK player will feature a robust AI-driven framework that seamlessly integrates real-time closed captioning, subtitles, language translation, a moderation bot, and generative backgrounds. It will employ advanced Natural Language Processing (NLP), speech recognition algorithms, computer vision, and machine learning models, ensuring high accuracy and minimal latency. (A minimal code sketch of the caption flow is included after this list.)

    User-Friendly Interface: The player's user interface will offer viewers the flexibility to enable or disable closed captions and subtitles, choose from a variety of supported languages for translation, and customize the display style to suit individual preferences. Content creators will be able to select from generated backgrounds while streaming from the browser.

    Reporting: A real-time moderation bot integrated into the player SDK will provide configurable reporting channels, e.g. dashboards, email, or Telegram.

  2. Developer SDK for Integration in ReactJS, React Native, and Flutter:

    Modular and Versatile SDK: The Developer SDK will be a set of modular libraries and components tailored for ReactJS and React Native. It will facilitate straightforward integration into web and mobile applications.

Comprehensive Documentation: Extensive technical documentation will accompany the SDK, providing developers with clear instructions, usage examples, and best practices for seamless integration into their projects.

Support for Platform-Specific Features: The SDK will harness the native capabilities of each platform, ensuring an optimized user experience. This includes considerations for responsive design, touch gestures, and platform-specific UI guidelines.

Developer Community Engagement: The project will actively engage with developer communities, collecting feedback and addressing integration challenges promptly. Continuous updates and improvements will be part of the SDK's roadmap.

  3. Articles and Development Guides for Developers and Streamers:

Comprehensive Tutorials: A series of in-depth articles and development guides will be created. These resources will walk developers and streamers through the setup, configuration, and utilization of the extended Livepeer platform, and the Developer SDK.
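
To make deliverable 1 a bit more concrete, below is a minimal TypeScript sketch of how live caption cues could be pushed onto the player using the browser's native TextTrack API. The `livepeer-player` element id and the `CaptionSegment` shape are illustrative assumptions for this proposal, not existing parts of the Livepeer SDK:

```ts
// Minimal sketch: pushing live caption cues onto an HTML5 video element.
// The element id and the shape of `CaptionSegment` are illustrative assumptions.
interface CaptionSegment {
  startTime: number; // seconds, relative to the stream
  endTime: number;
  text: string;
}

const video = document.getElementById('livepeer-player') as HTMLVideoElement;

// Create a captions track that viewers can toggle from the player UI.
const track = video.addTextTrack('captions', 'Live captions', 'en');
track.mode = 'showing';

// Called whenever the speech-recognition pipeline emits a new caption segment.
function onCaptionSegment(segment: CaptionSegment): void {
  track.addCue(new VTTCue(segment.startTime, segment.endTime, segment.text));
}
```

The same hook would feed translated cues into additional per-language tracks.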

Please break up your development work into a clear set of milestones

Milestones

  1. Proof of concept (POC) to demonstrate implementation ($4,500)

     - Proof of concept: live captions, subtitles, and language translation
     - Proof of concept: AI moderation bot
     - Three developers (backend, AI, and frontend), $1,500 USD each

  2. Proof of concept for AI-generated backgrounds ($1,500)

     - Two developers (AI and frontend), $750 each

  3. Full extended Livepeer SDK (player) for ReactJS and React Native ($3,000), with the features below:

     - Real-time closed captions and subtitles
     - AI moderation and reporting
     - AI-generated virtual backgrounds
     - Three developers (backend, AI, and frontend), $500 USD each
     - Server hosting, deployment, and testing: $1,500

  4. Developer content and articles for streamers ($2,000)

     - Developer guide for the SDK in ReactJS and React Native
     - Two technical writers, $1,000 each

  5. Community building: A.I. Livepeer University of Port Harcourt ($2,000)

     - A student community, African or global, for individuals willing to contribute to the growth of AI in the Livepeer ecosystem
     - Two community moderators/recruiters, $1,000 each

Sum up the total requested budget across all milestones, and include that figure here. Also, please include a budget breakdown to specify how you are planning to spend these funds.

Total Budget Requested: USD $13,000

Breakdown:

- Hiring of engineers: $7,500
- Development & testing: $500
- Deployment resources (server hosting): $1,000
- Content development: $2,000
- Community building of A.I. Livepeer University of Port Harcourt, Nigeria: $2,000

Specify your team's long-term plans to maintain this software and upgrade it over time

The project will be open-source and maintained by me and the team. The core team will be responsible for fixing bugs, adding new features (AI tooling), and upgrading the application over time.
Over time we plan to expand the team with more developers and also create a micro community for bug bounties.

The long-term plan is to integrate as many AI solutions as possible into the player. We are going to build a micro (student) community, A.I. Livepeer University of Port Harcourt, Nigeria; this community will welcome individuals with a similar goal of improving AI solutions and tooling in the Livepeer ecosystem.

Please describe (in words) your team's relevant experience, and why you think you are the right team to build this project. You can cite your team's prior experience in similar domains, doing similar dev work, individual team members' backgrounds, etc.

I am a blockchain and AI engineer and full stack web developer with over 4 years of experience, with a keen interest in dapp development, AI, and the media tech stack. I worked on Freetyl (Joystick Labs), a grantee of the Livepeer video distributor grant, where I designed the current architecture for in-game streaming and streaming from the browser to Livepeer.

I have been a multi-prize winner in different hackathons hosted by the Encode community, including the Next Video Build hackathon 2022/2023. I am involved in several blockchain communities such as Reach, Bundlr, Chainlink, etc., in some of which I have participated as an open-source contributor.

I also created a small fun project called Puddle Network to simulate the Livepeer ecosystem. This project was a prize winner in the last Encode Future blockchain university hackathon 2023.

Recently I have been more involved in communities interested in bridging AI and blockchain. My team members are proficient in their stacks, have been involved in several enterprise-level projects in the past, and are very keen to use their experience to make innovative improvements to the Livepeer SDK.

Mail: bartholomewonogwu@yahoo.com
GitHub: https://github.com/scapula07/
Discord: ene#5408
Location: Port Harcourt, Nigeria

Team members

  1. Kotai Soen: Frontend developer, web and mobile developer
  2. Precious Amaechi: Backend developer
  3. Oyale Peter: AI engineer, full stack web and mobile developer

Who is your target user group? How do you plan on getting your users to use this?

The target user groups are developers, streamers, and viewers. The goal is to onboard more streamers and viewers onto the Livepeer ecosystem.

How did you learn about the Livepeer Grants Program?

The Livepeer Discord community.

Was this project started at a hackathon or another web3 event? Which one?

No, but I got the idea from participating in the Encode Next Video Build hackathon 2022-2023.

Please include any additional information that you think would be useful in helping us to evaluate your proposal.

No response

Do you consent to using "Powered by Livepeer" watermark on your application?

IxaBrjnko commented 8 months ago

Which advanced Natural Language Processing (NLP) and speech recognition algorithms? and/or in what ways does this solution support and maintain the option to encrypt transcoded files? Also curious how this can be integrated with verifiable video signatures.

scapula07 commented 8 months ago

> Which advanced Natural Language Processing (NLP) and speech recognition algorithms? and/or in what ways does this solution support and maintain the option to encrypt transcoded files? Also curious how this can be integrated with verifiable video signatures.

For the speech recognition algorithm, the team discussed several options we could try, leveraging deep learning in particular.

Google Speech-to-Text and Whisper (OpenAI) are AI APIs we are going to try for this feature: https://pypi.org/project/SpeechRecognition/
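
As a rough illustration of the hosted-API option, here is a simplified TypeScript sketch that records short, self-contained audio segments from the broadcast MediaStream and sends them to OpenAI's Whisper transcription endpoint. The segment length, lack of overlap handling, and the way the API key is passed in are deliberate simplifications, not our final design:

```ts
// Rough sketch: send short, self-contained audio segments from the broadcast
// stream to OpenAI's Whisper transcription endpoint.
const SEGMENT_MS = 5_000;

async function transcribeSegment(blob: Blob, apiKey: string): Promise<string> {
  const form = new FormData();
  form.append('file', blob, 'segment.webm');
  form.append('model', 'whisper-1');

  const res = await fetch('https://api.openai.com/v1/audio/transcriptions', {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  const { text } = await res.json();
  return text;
}

function startCaptioning(
  stream: MediaStream,
  apiKey: string,
  onText: (text: string) => void,
): void {
  // Record audio only, so each blob is a complete, decodable webm segment.
  const audioStream = new MediaStream(stream.getAudioTracks());

  const recordSegment = () => {
    const recorder = new MediaRecorder(audioStream, { mimeType: 'audio/webm' });
    recorder.ondataavailable = async (e) => {
      if (e.data.size > 0) onText(await transcribeSegment(e.data, apiKey));
    };
    recorder.start();
    setTimeout(() => recorder.stop(), SEGMENT_MS);
  };

  recordSegment();
  setInterval(recordSegment, SEGMENT_MS); // small gaps at segment boundaries are ignored here
}
```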

scapula07 commented 8 months ago

> Which advanced Natural Language Processing (NLP) and speech recognition algorithms? and/or in what ways does this solution support and maintain the option to encrypt transcoded files? Also curious how this can be integrated with verifiable video signatures.

The solution has provision for encrypting the text tracks. We are using the same encryption approach used by Azure Media Services: AES clear key encryption.

As for video encryption, we are weighing two implementation architectures. One involves bundling (embedding) the caption text and video stream in a proxy server, where Common Encryption (CENC) or media encryption with Media Source Extensions (MSE) will be employed. The second involves extracting the audio client-side and running it through a speech recognition algorithm to extract the caption text. We are comparing both implementations for lag and latency.

The extended player will support encrypted media.
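
As a minimal sketch of the text-track side, this uses AES-GCM via the Web Crypto API purely for illustration; the production path would follow the player's clear-key scheme, and the key-delivery service is assumed rather than specified here:

```ts
// Minimal sketch: encrypting a WebVTT caption segment with the Web Crypto API.
// AES-GCM is used for brevity; the real scheme would align with the clear-key
// envelope encryption used for the media segments.
async function generateClearKey(): Promise<CryptoKey> {
  return crypto.subtle.generateKey({ name: 'AES-GCM', length: 128 }, true, [
    'encrypt',
    'decrypt',
  ]);
}

async function encryptCaptionTrack(
  vtt: string,
  key: CryptoKey,
): Promise<{ iv: Uint8Array; data: ArrayBuffer }> {
  const iv = crypto.getRandomValues(new Uint8Array(12));
  const data = await crypto.subtle.encrypt(
    { name: 'AES-GCM', iv },
    key,
    new TextEncoder().encode(vtt),
  );
  return { iv, data }; // the key itself is delivered to authorized viewers by a key service
}
```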

dob commented 8 months ago

Hello. I was curious where you envision the models living, and where you envision the compute for inference running? Can it all be packaged into the frontend player/SDK distribution, or are you proposing implementing these into the Livepeer nodes that perform the video transcoding?

scapula07 commented 8 months ago

> Hello. I was curious where you envision the models living, and where you envision the compute for inference running? Can it all be packaged into the frontend player/SDK distribution, or are you proposing implementing these into the Livepeer nodes that perform the video transcoding?

Hi dob. Our team discussed a couple of approaches for this. Our foremost thought is running these models locally on the user/viewer machine (browser or mobile phone), i.e. on-device machine learning. We plan to integrate these models into the Livepeer SDK and have them run locally on the user's device. This provides advantages such as low latency, privacy, etc.

We plan to use model compression and quantization to manage resources on the user's end.
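
As a rough sketch of what on-device inference could look like in the browser build, assuming the moderation model is exported to the TensorFlow.js graph-model format (the model URL, 224x224 input size, and normalization are placeholders, not existing artifacts):

```ts
import * as tf from '@tensorflow/tfjs';

// Placeholder location for a quantized moderation model exported to TF.js.
const MODEL_URL = 'https://example.com/models/moderation/model.json';

let model: tf.GraphModel;

export async function loadModel(): Promise<void> {
  model = await tf.loadGraphModel(MODEL_URL);
}

// Score a single frame from the player element.
export async function scoreFrame(video: HTMLVideoElement): Promise<Float32Array> {
  const scores = tf.tidy(() => {
    const pixels = tf.browser.fromPixels(video);      // grab the current frame
    const input = tf.image
      .resizeBilinear(pixels, [224, 224])             // match the assumed input size
      .expandDims(0)
      .toFloat()
      .div(255);                                      // simple 0-1 normalization
    return model.predict(input) as tf.Tensor;
  });
  const data = (await scores.data()) as Float32Array;
  scores.dispose();
  return data;
}
```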

We are still exploring other ideas, such as including the models in the nodes.

ene#5408 is my Discord username; can you send me a friend request? I can explain more about our approach to you.

scapula07 commented 8 months ago

> Hello. I was curious where you envision the models living, and where you envision the compute for inference running? Can it all be packaged into the frontend player/SDK distribution, or are you proposing implementing these into the Livepeer nodes that perform the video transcoding?

Running inference on-device is the most reasonable approach for low latency, cost, and scale.

I would really like to know what you are thinking, approach-wise.

dob commented 8 months ago

My high level view is that I think it really depends on the job type and compute requirement. For some types of jobs, performing them once during transcode on high-performance GPUs means that the results can be used over and over again across all viewers without having to redo the compute. Others might be so light in the frontend SDK, or require some client-specific input, that it makes sense to do them in the front end.

One thing is for sure though, that you will have a much more complex path of getting jobs integrated end to end across the network than you will just handling them in the frontend alone. However, adding the capability to the network to support many job types, including these types of AI video compute, is definitely a high-priority roadmap item.

scapula07 commented 8 months ago

> My high level view is that I think it really depends on the job type and compute requirement. For some types of jobs, performing them once during transcode on high-performance GPUs means that the results can be used over and over again across all viewers without having to redo the compute. Others might be so light in the frontend SDK, or require some client-specific input, that it makes sense to do them in the front end.
>
> One thing is for sure though, that you will have a much more complex path of getting jobs integrated end to end across the network than you will just handling them in the frontend alone. However, adding the capability to the network to support many job types, including these types of AI video compute, is definitely a high-priority roadmap item.

Yes dob, the team agrees with your take. We will consider both approaches, see which works best, and go with the most inclusive and scalable option while remaining cost- and resource-effective.

TensorFlow Lite will be used to build and deploy the models to edge devices, so we may be able to circumvent compute limitations to an extent.

After these features, we plan to integrate more AI solutions into the network for as many job types as possible. The micro community (A.I. Livepeer University of Port Harcourt) we are going to build will help contribute optimized approaches to these future job types in the network.

scapula07 commented 8 months ago

@hansy, hi, great day! My team was curious about your feedback on our application.

hansy commented 8 months ago

Thanks for the great discussion @scapula07! Quick question, you mentioned you're working on another grant with us? This one I believe. This grant is still in progress; we typically do not award multiple simultaneous grants to the same recipient.

Our team (and the community) is certainly interested in this proposal, but we'd love if you were to finish the one we've already awarded first. Then I'm happy to revisit this.

scapula07 commented 8 months ago

@hansy, hi. To clarify, I am not the recipient of the above-mentioned grant. I introduced Livepeer to the Joystick Labs team, since I participate a lot in hackathons, including Encode Next Video Build by Livepeer. I helped the team come up with a systematic approach to their first milestone.

My team, A.I. Livepeer University of Port Harcourt, is in no way the same as the team from the above grant. I mentioned the above grant to show that I have a good low-level knowledge of the Livepeer network and architecture, and my team's capability to handle this project.

My team and I are excited about introducing AI into the Livepeer network, but we can go with your suggestion to wait for the completion of the above grant. I just wanted to clarify my team's involvement.

I am happy to have this discussion with you in a DM. ene#5408 is my Discord username; can you send me a friend request?

Thank you sir

github-actions[bot] commented 7 months ago

This issue has been marked as stale with no activity. It will close in 7 days.

scapula07 commented 6 months ago

https://pypi.org/project/SpeechRecognition/

github-actions[bot] commented 5 months ago

This issue has been marked as stale with no activity. It will close in 7 days.

scapula07 commented 5 months ago

https://webrtcventures.github.io/background-removal-insertable-streams/
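
For context, the linked demo is built on Chrome's insertable streams API. A rough TypeScript sketch of that pipeline, with the person-segmentation step left as a placeholder (a real build would run a segmentation model such as MediaPipe Selfie Segmentation per frame), looks like this:

```ts
// Rough sketch of the insertable-streams pipeline (Chrome-only APIs;
// MediaStreamTrackProcessor/Generator may need ambient type declarations).
async function withVirtualBackground(
  cameraTrack: MediaStreamTrack,
  background: ImageBitmap,
): Promise<MediaStreamTrack> {
  const processor = new MediaStreamTrackProcessor({ track: cameraTrack });
  const generator = new MediaStreamTrackGenerator({ kind: 'video' });

  const canvas = new OffscreenCanvas(1280, 720);
  const ctx = canvas.getContext('2d')!;

  const transformer = new TransformStream<VideoFrame, VideoFrame>({
    async transform(frame, controller) {
      // Draw the chosen virtual background, then the camera frame on top.
      // A real implementation would apply a per-frame segmentation mask so
      // that only the person from the camera frame is composited.
      ctx.drawImage(background, 0, 0, canvas.width, canvas.height);
      ctx.drawImage(frame, 0, 0, canvas.width, canvas.height);

      controller.enqueue(new VideoFrame(canvas, { timestamp: frame.timestamp }));
      frame.close();
    },
  });

  processor.readable.pipeThrough(transformer).pipeTo(generator.writable);
  return generator; // a MediaStreamTrack the Broadcast component can publish
}
```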