MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License
1.29k stars, 242 forks

Original word case (upper/lower) is not retained #675

Open SAGNIKMJR opened 1 year ago

SAGNIKMJR commented 1 year ago

Is your feature request related to a problem? Please describe.
mfa align does not retain the original case (upper/lower) of the words in the source TextGrid file.

Describe the solution you'd like
Add an option to retain the original case.

Additional context
Many modern language tokenizers are case-sensitive. If the output of mfa align is used for further preprocessing, such as language tokenization, it would be useful to have the words in their original case.
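Until such an option exists, the case could be restored in a post-processing step. The following is only a sketch of that idea and is not part of MFA. It assumes the word labels have already been read from the source and output TextGrids into plain Python lists, and that the output differs from the source mainly in case and in extra empty (silence) labels. The function name `restore_case` and the sample words are hypothetical.

```python
from difflib import SequenceMatcher


def restore_case(original_words, aligned_words):
    """Return a copy of aligned_words with original casing restored where possible.

    Both arguments are lists of word labels (assumed to be extracted from the
    source and output TextGrids beforehand). Labels that cannot be matched
    case-insensitively to an original word are left unchanged.
    """
    # Compare the two sequences case-insensitively.
    original_folded = [w.casefold() for w in original_words]
    aligned_folded = [w.casefold() for w in aligned_words]

    restored = list(aligned_words)
    matcher = SequenceMatcher(a=original_folded, b=aligned_folded, autojunk=False)

    # Each matching block is a run of labels that agree once case is ignored;
    # copy the original spelling over the aligned label for every such position.
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            restored[block.b + offset] = original_words[block.a + offset]
    return restored


# Hypothetical example: "" stands for a silence interval in the aligned tier.
original = ["Hello", "NASA", "engineers", "in", "Montreal"]
aligned = ["hello", "nasa", "engineers", "", "in", "montreal"]
result = restore_case(original, aligned)
print(result)
```

Because the matching is done on whole sequences rather than position by position, extra silence intervals or a few labels that were normalized differently do not shift the rest of the mapping. The interval timings are untouched, since only the label text is replaced.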