google-marketing-solutions / ariel

Apache License 2.0

gTech Ads Ariel for AI Video Ad Dubbing

Ariel is an open-source Python library that facilitates efficient and cost-effective dubbing of video ads into multiple languages.

This is not an official Google product.

Overview • Features • Benefits • Language Compatibility • Before You Begin • Getting Started • Building Blocks • References

Overview

Ariel is a cutting-edge solution designed to enhance the global reach of digital advertising. It enables advertisers to automate the translation and dubbing of their video ads into a wide range of languages.

Features

Benefits

Language Compatibility

You can dub video ads from and to the following languages:

Before You Begin

Getting Started

To start using Ariel, click the Open in Colab button.

Building Blocks

Ariel leverages a powerful combination of state-of-the-art AI and audio processing techniques to deliver accurate and efficient dubbing results:

  1. Video Processing: Extracts the audio track from the input video file.
  2. Audio Processing:
    • DEMUCS: Employed for advanced audio source separation.
    • pyannote: Performs speaker diarization to identify and separate individual speakers.
  3. Speech-To-Text (STT):
    • faster-whisper: A high-performance speech-to-text model.
    • Gemini 1.5 Flash: A powerful multimodal language model that contributes to enhanced transcription.
  4. Translation:
    • Gemini 1.5 Flash: Leverages its language understanding for accurate and contextually relevant translation.
  5. Text-to-Speech (TTS):
    • GCP's Text-To-Speech: Generates natural-sounding speech in the target language.
    • [OPTIONAL] ElevenLabs: An alternative API for generating speech, recommended for the best results. WARNING: ElevenLabs is a paid service and will incur extra costs. See the pricing here.
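
The order of the building blocks above can be sketched as a small stage pipeline. This is only an illustration of the data flow, not Ariel's actual API: the `Utterance` structure, the `fake_*` stage functions, and `run_pipeline` are hypothetical names, and the stages return placeholder strings where real code would call faster-whisper, Gemini, or a TTS service. The only real tool referenced is the `ffmpeg` command line, whose `-vn` flag drops the video stream; the command is built here but not executed.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Utterance:
    """One diarized speech segment carried through the pipeline (hypothetical)."""
    speaker: str
    start: float  # seconds from the beginning of the ad
    end: float    # seconds from the beginning of the ad
    text: str = ""
    translated_text: str = ""
    dubbed_audio_path: str = ""


Stage = Callable[[List[Utterance]], List[Utterance]]


def build_audio_extraction_command(video_path: str, audio_path: str) -> List[str]:
    """Step 1: build an ffmpeg command that keeps only the audio track."""
    return ["ffmpeg", "-y", "-i", video_path, "-vn", audio_path]


def fake_transcribe(utterances: List[Utterance]) -> List[Utterance]:
    """Step 3 placeholder: a real stage would run a speech-to-text model."""
    return [replace(u, text=f"<transcript {i}>") for i, u in enumerate(utterances)]


def fake_translate(utterances: List[Utterance]) -> List[Utterance]:
    """Step 4 placeholder: a real stage would ask a language model to translate."""
    return [replace(u, translated_text=f"[translated] {u.text}") for u in utterances]


def fake_synthesize(utterances: List[Utterance]) -> List[Utterance]:
    """Step 5 placeholder: a real stage would call a text-to-speech API."""
    return [
        replace(u, dubbed_audio_path=f"dubbed_{i}.mp3")
        for i, u in enumerate(utterances)
    ]


def run_pipeline(
    utterances: List[Utterance], stages: List[Tuple[str, Stage]]
) -> List[Utterance]:
    """Apply each named stage in order, passing its output to the next stage."""
    for _name, stage in stages:
        utterances = stage(utterances)
    return utterances


STAGES: List[Tuple[str, Stage]] = [
    ("speech_to_text", fake_transcribe),
    ("translation", fake_translate),
    ("text_to_speech", fake_synthesize),
]

# Step 2 (source separation and diarization) is assumed to have produced
# these speaker-labelled time spans already.
diarized = [
    Utterance("SPEAKER_00", 0.0, 2.5),
    Utterance("SPEAKER_01", 2.5, 5.0),
]
dubbed = run_pipeline(diarized, STAGES)
```

Keeping each stage as a function over the same list of utterances means a backend can be swapped (for example, one TTS provider for another) without touching the other steps.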

References