TTS data preparation from News data

OpenPecha / news-with-audio-data

MIT License


TTS data preparation from News data #1

Open kaldan007 opened 1 week ago

kaldan007 commented 1 week ago

Description: We currently have full-length news audio and the corresponding news transcripts. We would like to split the news text and audio into segments to train our STT and TTS models.

Implementation:

Subtask:

[x] Update VOA metadata with the news reader's name
[x] Delete the extra audio file
[x] Prepare the catalog as per Ganga's suggestion
[x] Split audio into segments
[x] Run inference on the audio
[x] Transfer the news transcript
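
For the "Split audio into segments" subtask, here is a minimal sketch of one possible approach: cut wherever there is a long enough pause. The issue does not say how segment boundaries are chosen, so the function name `find_speech_segments` and all thresholds (`frame_ms`, `energy_threshold`, `min_silence_ms`) are assumptions for illustration, not the pipeline actually used in this repo.

```python
from typing import List, Sequence, Tuple


def find_speech_segments(
    samples: Sequence[int],
    sample_rate: int,
    frame_ms: int = 20,               # assumed analysis frame size
    energy_threshold: float = 500.0,  # assumed loudness cutoff for 16-bit PCM
    min_silence_ms: int = 300,        # assumed pause length that ends a segment
) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) spans of non-silent audio."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    min_silence_frames = max(1, min_silence_ms // frame_ms)

    # A frame counts as voiced when its mean absolute amplitude exceeds the threshold.
    voiced = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        voiced.append(sum(abs(s) for s in frame) / len(frame) > energy_threshold)

    segments: List[Tuple[float, float]] = []
    seg_start = None   # index of the first frame of the open segment, if any
    silence_run = 0    # consecutive silent frames seen inside the open segment
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            if seg_start is None:
                seg_start = i
            silence_run = 0
        elif seg_start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                # Close the segment at the first frame of the silent run.
                seg_end = i - silence_run + 1
                segments.append(
                    (seg_start * frame_len / sample_rate,
                     seg_end * frame_len / sample_rate)
                )
                seg_start = None
                silence_run = 0

    if seg_start is not None:
        # Audio ended mid-segment: drop any trailing silent frames.
        seg_end = len(voiced) - silence_run
        end_sample = min(seg_end * frame_len, len(samples))
        segments.append((seg_start * frame_len / sample_rate, end_sample / sample_rate))
    return segments
```

The returned time spans could then be used to cut the audio and to line each piece up with its part of the transcript. Pauses shorter than `min_silence_ms` stay inside a segment, so short breaths do not split a sentence.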

kaldan007 commented 1 week ago

Please run on the two point catalog.

tenzinchoedon commented 1 week ago

| Sr. no | ID | Audio link | Audio text link | Audio Duration (hh:mm:ss) | Speaker name | Speaker Gender | News channel | Publishing Year |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
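
In case it helps when filling the catalog programmatically, here is a rough sketch that writes rows with the columns above to CSV. The `CatalogRow` class, the `format_duration` and `write_catalog` helpers, and the idea of storing the duration in seconds internally are all assumptions for illustration; only the column names come from this thread.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable, TextIO

# Column names copied from the catalog header in this thread.
COLUMNS = [
    "Sr. no", "ID", "Audio link", "Audio text link",
    "Audio Duration (hh:mm:ss)", "Speaker name", "Speaker Gender",
    "News channel", "Publishing Year",
]


def format_duration(total_seconds: float) -> str:
    """Format a duration in seconds as hh:mm:ss."""
    total = int(round(total_seconds))
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


@dataclass
class CatalogRow:
    """One catalog entry (hypothetical structure, field names are assumptions)."""
    sr_no: int
    id: str
    audio_link: str
    audio_text_link: str
    duration_seconds: float
    speaker_name: str
    speaker_gender: str
    news_channel: str
    publishing_year: int

    def to_record(self) -> dict:
        # Map the fields onto the catalog columns, in column order.
        values = [
            self.sr_no, self.id, self.audio_link, self.audio_text_link,
            format_duration(self.duration_seconds), self.speaker_name,
            self.speaker_gender, self.news_channel, self.publishing_year,
        ]
        return dict(zip(COLUMNS, values))


def write_catalog(rows: Iterable[CatalogRow], out: TextIO) -> None:
    """Write the header and all rows as CSV to an open text file."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row.to_record())


# Example with placeholder values (not real catalog data).
buffer = io.StringIO()
write_catalog(
    [CatalogRow(1, "VOA_0001", "https://example.org/a.wav",
                "https://example.org/a.txt", 330.0, "Example Reader",
                "female", "VOA", 2023)],
    buffer,
)
```

The resulting CSV can be imported into a Google Sheet, which keeps the hh:mm:ss formatting consistent across rows.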

tenzinchoedon commented 4 days ago

Link to the Google Sheet: https://docs.google.com/spreadsheets/d/1732g2pCbeuTUGjtep_V-8vaKAGzTK41662ehGepGdSQ/edit?usp=sharing

kaldan007 commented 3 days ago

@tenzinchoedon can you explore any existing library to differentiate between male and female audio?
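
While looking at libraries, a crude baseline to compare against could be a pitch check. The sketch below is not an existing library: it estimates the fundamental frequency by autocorrelation and applies a single cutoff. The function names, the 60 to 300 Hz search range, and the 165 Hz cutoff are assumptions for illustration. Male and female pitch ranges overlap, so this will misclassify some speakers and should only be treated as a sanity check.

```python
import math
from typing import List, Sequence


def estimate_pitch_hz(
    samples: Sequence[float],
    sample_rate: int,
    min_f0: float = 60.0,   # assumed lower bound of the pitch search
    max_f0: float = 300.0,  # assumed upper bound of the pitch search
) -> float:
    """Estimate the fundamental frequency of a short voiced clip."""
    min_lag = int(sample_rate / max_f0)
    max_lag = int(sample_rate / min_f0)
    if len(samples) <= max_lag:
        raise ValueError("clip is too short for the requested pitch range")

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag: peaks when the lag matches the pitch period.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def guess_gender(pitch_hz: float, cutoff_hz: float = 165.0) -> str:
    """Heuristic label from pitch alone; the cutoff is an assumption."""
    return "male" if pitch_hz < cutoff_hz else "female"


def synth_tone(freq_hz: float, sample_rate: int, n_samples: int) -> List[float]:
    """Pure sine tone, only for trying out the functions above."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(n_samples)]
```

For real recordings it would make sense to run this on a few voiced frames per speaker and take the median, since a single frame can be noisy or unvoiced.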