Open DanielSWolf opened 1 year ago
Yeah that sounds reasonable. The kaldi feedstock on conda-forge should have shared objects/dll/dylibs that can be linked when you're building (including the OpenFst .16.so libraries) and they should have all the symbols exported that you need (at least they export the ones that the binaries use).
In terms of alignment, there is some documentation with source code here: https://montreal-forced-aligner.readthedocs.io/en/latest/reference/alignment/generated/montreal_forced_aligner.alignment.mixins.AlignMixin.html#montreal_forced_aligner.alignment.mixins.AlignMixin.align_utterances. In general, though, the alignment functions just follow https://github.com/kaldi-asr/kaldi/tree/master/egs/wsj/s5/steps/align_si.sh and https://github.com/kaldi-asr/kaldi/tree/master/egs/wsj/s5/steps/align_fmllr.sh.
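For orientation, here is a rough sketch of the core of what align_si.sh does: it pipes `compile-train-graphs` into `gmm-align-compiled`. The helper name `build_align_pipeline` and the example paths are hypothetical, and the sketch assumes the usual Kaldi layout (`tree` and `final.mdl` in the model directory, `L.fst` in the lang directory) with transcripts already mapped to integer word IDs. It leaves out the feature pipeline, silence boosting, and several options the real script passes, so treat it as a map of which binaries are involved rather than a drop-in replacement.

```python
import shlex


def build_align_pipeline(model_dir, lang_dir, feats_rspecifier,
                         transcripts_rspecifier, alignments_wspecifier,
                         beam=10, retry_beam=40):
    """Build a shell pipeline string for the two central alignment binaries.

    Hypothetical helper: it only assembles the command line and runs nothing.
    """
    # Scale options in the style of the recipe's defaults.
    scale_opts = [
        "--transition-scale=1.0",
        "--acoustic-scale=0.1",
        "--self-loop-scale=0.1",
    ]
    # Step 1: compile one decoding graph per utterance from its transcript.
    compile_graphs = [
        "compile-train-graphs",
        f"{model_dir}/tree",
        f"{model_dir}/final.mdl",
        f"{lang_dir}/L.fst",
        transcripts_rspecifier,
        "ark:-",  # write the graphs to stdout
    ]
    # Step 2: align the features against those graphs.
    align = [
        "gmm-align-compiled",
        *scale_opts,
        f"--beam={beam}",
        f"--retry-beam={retry_beam}",
        f"{model_dir}/final.mdl",
        "ark:-",  # read the graphs from stdin
        feats_rspecifier,
        alignments_wspecifier,
    ]
    return " | ".join(shlex.join(cmd) for cmd in (compile_graphs, align))


if __name__ == "__main__":
    print(build_align_pipeline("exp/mono", "data/lang", "scp:feats.scp",
                               "ark:text.int", "ark:ali.ark"))
```

If you link against the Kaldi libraries instead of shelling out, the same two steps (graph compilation, then alignment against the compiled graphs) are what you would need to reproduce in-process.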
Any progress? I call the Python source code directly instead of using the command line to do forced alignment, but alignment is still slow because of the Kaldi-style features.
@mmcauliffe Thank you for the Kaldi recipes and for the tip with the pre-built binaries!
@BarryKCL This is for a personal project of mine, and I don't have much free time at the moment. I'll certainly update this issue once I've made progress, but that may take some time.
I found an implementation at: https://github.com/open-speech/speech-aligner
I'm writing a native application (Rust, C/C++) that needs to perform forced alignment. Not training, just the alignment part. So I'm wondering how best to integrate MFA into an application.
My understanding is that most of MFA's code is for training. Once the models exist, the `align` command seems to primarily call a number of Kaldi binaries, which in turn are just thin wrappers around the Kaldi library. So my idea is to compile Kaldi as part of my application, then call the appropriate Kaldi functionality directly. This way, the entire forced alignment functionality could reside in my application's executable. So I'm wondering: