Open Wuyazixu opened 2 years ago
Do we have any example files for these two? /fairseq/examples/speech_to_text/prep_librispeech_data.py creates a .tsv file with one audio file per line. Is this the right format?
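Hi, you could use this script to generate the label files.
Suppose you have one audio file at /path/to/root/name.wav
with the transcript "how are you".
test.tsv then looks like this (the root directory on the first line, then one line per file with the relative path, a tab, and the number of samples):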
/path/to/root/
name.wav\t<num_of_sample>
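
In case it helps, here is a rough sketch of writing both files yourself. It is not the fairseq script; the function names `to_ltr` and `write_manifest` are made up for illustration. It assumes plain PCM .wav files (read with the standard-library `wave` module), and it assumes the .ltr file holds one line per audio file, in the same order as the .tsv, with the transcript spelled out as space-separated letters and `|` marking word boundaries. Letter casing is left untouched here, so it has to match whatever your letter dictionary uses.

```python
import os
import wave


def to_ltr(transcript):
    """Spell a transcript as space-separated letters, with '|' after each word."""
    words = transcript.strip().split()
    return " ".join(list("|".join(words))) + " |"


def write_manifest(root, transcripts, out_dir, split="test"):
    """Write <split>.tsv and <split>.ltr into out_dir.

    `transcripts` maps a wav path relative to `root` to its transcript text.
    Both files list the utterances in the same (sorted) order.
    """
    os.makedirs(out_dir, exist_ok=True)
    tsv_path = os.path.join(out_dir, split + ".tsv")
    ltr_path = os.path.join(out_dir, split + ".ltr")
    with open(tsv_path, "w") as tsv, open(ltr_path, "w") as ltr:
        # First line of the .tsv is the root directory.
        tsv.write(os.path.abspath(root) + "\n")
        for rel_path in sorted(transcripts):
            # Number of samples per channel, read from the wav header.
            with wave.open(os.path.join(root, rel_path), "rb") as wav:
                num_samples = wav.getnframes()
            tsv.write(f"{rel_path}\t{num_samples}\n")
            ltr.write(to_ltr(transcripts[rel_path]) + "\n")
    return tsv_path, ltr_path
```

For the example above you would call `write_manifest("/path/to/root", {"name.wav": "how are you"}, "/path/to/data")`, and under these assumptions the line written to test.ltr is `h o w | a r e | y o u |`.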
The guide in examples/hubert says:
Decode a HuBERT model: Suppose the test.tsv and test.ltr are the waveform list and transcripts of the split to be decoded, saved at /path/to/data
But there is no guide on how to generate the .tsv and .ltr files. Does anyone know?