rbracco closed this issue 1 year ago
The issue is that MFA is basically a wrapper around Kaldi/OpenFST, so those dependencies would have to get installed somehow (see https://montreal-forced-aligner.readthedocs.io/en/latest/installation.html#installing-from-source). I am working on a side project to get proper Python bindings for Kaldi so that MFA doesn't have to rely on temporary files and calls to Kaldi binaries, and can use Python calls instead. With that said, I think conda is pretty necessary for handling all of the non-Python dependencies that MFA has.
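For context, here is a minimal sketch of what calling MFA from a server process could look like while it still depends on the Kaldi binaries: shell out to the `mfa align` command line, optionally through `conda run` so the environment does not need to be activated first. The helper names `build_align_command` and `run_align`, the environment name, and all paths are hypothetical and only there for illustration. It assumes `mfa` (and `conda`, if used) is already installed and on the PATH.

```python
import subprocess
from typing import List, Optional


def build_align_command(
    corpus_dir: str,
    dictionary: str,
    acoustic_model: str,
    output_dir: str,
    conda_env: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `mfa align` (hypothetical helper)."""
    cmd = ["mfa", "align", str(corpus_dir), str(dictionary),
           str(acoustic_model), str(output_dir)]
    if conda_env is not None:
        # Assumption: MFA lives in a conda environment with this name.
        # `conda run -n ENV` executes the command inside that environment.
        cmd = ["conda", "run", "-n", conda_env] + cmd
    return cmd


def run_align(
    corpus_dir: str,
    dictionary: str,
    acoustic_model: str,
    output_dir: str,
    conda_env: Optional[str] = None,
) -> subprocess.CompletedProcess:
    """Run the alignment as a subprocess; raises CalledProcessError on failure."""
    cmd = build_align_command(corpus_dir, dictionary, acoustic_model,
                              output_dir, conda_env)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    # Only prints the command; all paths and the env name are placeholders.
    print(build_align_command("corpus", "dict.txt", "model.zip", "aligned",
                              conda_env="aligner"))
```

This keeps the conda dependency but isolates it behind one function, so the calling code would not need to change if the subprocess call were later replaced by direct Python bindings.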
Thank you for the clarification, this is exactly what I needed to know. Is there any way to stay current on the progress of the python bindings project?
I have a few related follow-up questions (and then plan to close the issue). I was wondering...
Thank you.
Yeah, that functionality would be part of the bindings. At the moment in MFA, all acoustic models are loaded by various Kaldi binaries, and only G2P models are loaded in Python.
The bindings that I've been working on are here: https://github.com/mmcauliffe/kalpy. They're definitely not fully featured yet, but I have low-level bindings for most of the Kaldi codebase working, and I'm working on getting more Pythonic interface code set up. Currently I just have code for generating MFCCs (https://github.com/mmcauliffe/kalpy/blob/main/tests/test_mfcc.py) and compiling utterance graphs (see https://github.com/mmcauliffe/kalpy/blob/main/tests/test_decoder.py).
I'm not sure exactly when I'll have it stable enough to release to pip/conda-forge, but once I finish up the last of the low-level bindings and do some performance benchmarking of the MFCC and training graph compilation to make sure I'm not doing anything horrendously wrong for memory or speed, I'll feel comfortable getting a release out.
Thank you for taking the time to respond, that is very helpful. Starred and following kalpy!
Is your feature request related to a problem? Please describe.
I would like to try using a trained model to generate alignments on the fly on a server without adding the complications of Conda to my deployment workflow. Is there any way to export a G2P or alignment model to be called from Python without all the dependencies? Is there a simpler way to set up MFA without Conda?
Related issues: #40 #186 #530