Closed: daemon closed this issue 4 years ago
Data preprocessing is currently split into multiple steps, i.e.,
1. Run run.preprocess_dataset.
2. Create *.lab files using run.export_mfa.
3. Run the aligner (mfa_align) over the speech corpus.
4. Convert the result to jsonl format (run.attach_mfa_alignment).

We should make this process easier and document it somewhere.
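
One possible way to make this easier is a thin wrapper that runs the steps above in order. The following is only a rough sketch, not the project's actual tooling: invoking the run.* steps with `python -m`, calling `mfa_align` as a bare executable, and the names `STEPS`, `build_commands`, and `run_pipeline` are all assumptions. The arguments each step really needs are not stated here, so the caller passes them in.

```python
import subprocess
import sys

# Step order taken from the list above. Invoking the run.* steps with
# "python -m" and mfa_align as a bare executable are assumptions.
STEPS = [
    ("preprocess", [sys.executable, "-m", "run.preprocess_dataset"]),
    ("export_mfa", [sys.executable, "-m", "run.export_mfa"]),
    ("mfa_align", ["mfa_align"]),
    ("attach_alignment", [sys.executable, "-m", "run.attach_mfa_alignment"]),
]


def build_commands(extra_args=None):
    """Return (step name, full command line) pairs in pipeline order."""
    extra_args = extra_args or {}
    unknown = set(extra_args) - {name for name, _ in STEPS}
    if unknown:
        raise KeyError(f"unknown step(s): {sorted(unknown)}")
    return [(name, base + list(extra_args.get(name, []))) for name, base in STEPS]


def run_pipeline(extra_args=None, dry_run=True):
    """Print every command; execute them in order unless dry_run is set."""
    commands = build_commands(extra_args)
    for name, cmd in commands:
        print(f"[{name}] {' '.join(cmd)}")
        if not dry_run:
            # check=True raises CalledProcessError, stopping at the first failure.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_pipeline({"mfa_align": ["corpus", "out"]})` prints the four commands without executing anything; `corpus` and `out` are placeholder arguments, not the tool's documented ones. Passing `dry_run=False` would actually run each step.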