Get the decoded text for segmenting the labeled data

facebookresearch / voxpopuli

A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation

510 stars, 51 forks

Get the decoded text for segmenting the labeled data #45

Open AvivNavon opened 1 year ago

AvivNavon commented 1 year ago

I would like to segment the labeled data. Is it possible to get the decoded text corresponding to each audio segment? For reference, see the "Customizing force alignment for transcribed ASR data" section: https://github.com/facebookresearch/voxpopuli/blob/main/extension.md#customizing-force-alignment-for-transcribed-asr-data