Open lobsam opened 11 months ago
What age groups are we going to focus on here? Why do we attempt to reproduce the accents from different regions? Wouldn't it be better to focus on a single accent as the output at this stage?
Ask when we are likely to get different dialect data. We need around 40 hours for each dialect.
RFW0122: Text-to-Speech (TTS) with Diverse Accents and Gender
Summary
The goal of this RFW is to expand our existing Text-to-Speech (TTS) system to encompass a wider range of accents and genders.
Key Concepts
Text-to-Speech (TTS)
: The process of converting written text into spoken words.
Accents
: Variations in pronunciation and intonation characteristic of different regions or linguistic backgrounds.
Text corpus
: A large unstructured collection of texts.
Context
Our current TTS system lacks diversity in accents and gender representation, which limits its applicability. This RFW aims to address these limitations by including a broader range of accents and ensuring representation of various genders.
Inputs
Outputs
Timeline
Specify the expected delivery date for the project.
References
Include any relevant links or resources for additional context or information.