lobsam commented 11 months ago

RFW0122: Text-to-Speech (TTS) with Diverse Accents and Gender

Summary

The goal of this RFW is to expand our existing Text-to-Speech (TTS) system to cover a wider range of accents and genders.

Key Concepts

Text-to-Speech (TTS): The process of converting written text into spoken words.
Accents: Variations in pronunciation and intonation characteristic of different regions or linguistic backgrounds.
Text corpus: A large unstructured collection of texts.

Context

Our current TTS lacks diversity in accents and gender representation, which limits its applicability. This RFW aims to address these limitations by including a broader range of accents and ensuring representation of various genders.

Inputs

Text corpus (a diverse set of texts representing different linguistic styles).
Voice samples (high-quality recordings of native speakers of various accents and genders).

Outputs

TTS models that accurately reproduce accents from different regions.
Voices representing each gender (male, female).

Timeline

Specify the expected delivery date for the project.

References

Include any relevant links or resources for additional context or information.

gangagyatso4364 commented 11 months ago

What age groups are we going to focus on here? Why are we attempting to reproduce accents from different regions? Wouldn't it be better to focus on a single accent as the output at this stage?

spsither commented 11 months ago

Ask when we are likely to get different dialect data. We need around 40 hours for each dialect.
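
As a rough illustration of how the 40-hour figure could be tracked once recordings start arriving, here is a minimal sketch. It assumes a hypothetical manifest of (dialect, gender, duration in seconds) rows. The dialect names, row layout, and function names are placeholders for illustration, not part of any existing tooling.

```python
from collections import defaultdict

TARGET_HOURS = 40.0  # per-dialect target mentioned in this thread


def hours_by_dialect(clips):
    """Sum clip durations (given in seconds) per dialect and return hours."""
    totals = defaultdict(float)
    for dialect, _gender, seconds in clips:
        totals[dialect] += seconds
    return {dialect: secs / 3600.0 for dialect, secs in totals.items()}


def remaining_hours(clips, target=TARGET_HOURS):
    """Hours still needed per dialect to reach the target (never negative)."""
    return {
        dialect: max(0.0, target - hours)
        for dialect, hours in hours_by_dialect(clips).items()
    }


# Hypothetical manifest rows: (dialect, gender, duration in seconds)
clips = [
    ("dialect_a", "female", 36000),
    ("dialect_a", "male", 18000),
    ("dialect_b", "female", 7200),
]

print(remaining_hours(clips))
```

The same tally could be keyed on (dialect, gender) instead if a per-gender target is agreed on later.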
