sanchit-gandhi/seq2seq-speech
Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.
[CTC Tokenizer] Gigaspeech, SWB and AMI #47
Closed
sanchit-gandhi closed this 2 years ago
sanchit-gandhi commented 2 years ago
Convert GS spelt-out punctuation (<comma>) to symbolic form (,)
Remove SWB disfluencies
Chunk AMI dataset
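The first item could be sketched roughly as below. This is a minimal illustration, not the repository's actual implementation: the function name `convert_spelt_out_punctuation` and the `PUNCTUATION_MAP` table are hypothetical, and only the `<comma>` tag appears in this issue; the other tags are assumed to follow the same pattern.

```python
import re

# Hypothetical mapping. Only "<comma>" is mentioned in the issue;
# the remaining entries are assumptions about the other GS tags.
PUNCTUATION_MAP = {
    "<comma>": ",",
    "<period>": ".",
    "<questionmark>": "?",
    "<exclamationpoint>": "!",
}

# Match a tag together with any whitespace before it, so the symbol
# attaches directly to the preceding word.
_TAG = re.compile(r"\s*(<[a-z]+>)", flags=re.IGNORECASE)


def convert_spelt_out_punctuation(text: str) -> str:
    """Replace spelt-out punctuation tags with their symbolic form."""

    def _replace(match: re.Match) -> str:
        tag = match.group(1).lower()
        # Tags not in the mapping are left exactly as they were.
        return PUNCTUATION_MAP.get(tag, match.group(0))

    return _TAG.sub(_replace, text).strip()
```

Matching case-insensitively is a design choice so the same function handles both upper- and lower-case transcripts; tags outside the mapping pass through unchanged so they can be handled by a separate step.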