Describe the issue
Using a g2p model with the align_one command makes mfa crash.
For Reproducing your issue
Please fill out the following:
Corpus structure
What language is the corpus in?
English
How many files/speakers?
1
Are you using lab files or TextGrid files for input?
.lab
Dictionary
Are you using a dictionary from MFA? If so, which one?
english_mfa (3.0.0)
Acoustic model
If you're using an acoustic model, is it one downloaded through MFA? If so, which one?
english_mfa (3.0.0)
G2P model
If you're using a G2P model, is it one downloaded through MFA? If so, which one?
english_mfa (3.0.0)
Log file
(mfaPR) [sloppine@zained ~]$ mfa align_one --clean --g2p_model_path ~/Documents/MFA/pretrained_models/g2p/english_us_mfa.zip ~/Documents/Alignement-force/data/Homemade/test_terraria.wav ~/Documents/Alignement-force/data/Homemade/test_terraria.lab english_mfa english_mfa ~/Documents/Alignement-force/mfa/dataPR/
Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x7cfbe4108b10>>
Traceback (most recent call last):
File "/home/sloppine/.conda/envs/mfaPR/lib/python3.11/site-packages/montreal_forced_aligner/command_line/mfa.py", line 107, in history_save_handler
raise self.exception
File "/home/sloppine/.conda/envs/mfaPR/bin/mfa", line 10, in <module>
sys.exit(mfa_cli())
^^^^^^^^^
File "/home/sloppine/.conda/envs/mfaPR/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sloppine/.conda/envs/mfaPR/lib/python3.11/site-packages/rich_click/rich_command.py", line 126, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/home/sloppine/.conda/envs/mfaPR/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sloppine/.conda/envs/mfaPR/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sloppine/.conda/envs/mfaPR/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sloppine/.conda/envs/mfaPR/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sloppine/.conda/envs/mfaPR/lib/python3.11/site-packages/montreal_forced_aligner/command_line/align_one.py", line 184, in align_one_cli
ctm = align_utterance_online(
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sloppine/.conda/envs/mfaPR/lib/python3.11/site-packages/montreal_forced_aligner/online/alignment.py", line 57, in align_utterance_online
lexicon_compiler.add_pronunciation(KalpyPronunciation(w, pron[0]))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: __init__() missing 5 required positional arguments: 'probability', 'silence_after_probability', 'silence_before_correction', 'non_silence_before_correction', and 'disambiguation'
(mfaPR) [sloppine@zained ~]$ mfa version
3.0.1
Desktop:
OS: Linux
Version: Arch Linux - 6.7.6-arch1-1
Additional context
Without the --g2p_model_path option, or when the transcript contains no out-of-dictionary words that would trigger G2P, align_one works correctly. The test files (.lab and .wav) are available here: https://we.tl/t-90pVdp1MzT
All that being said, I've worked on a fix with @NiziL.
As the traceback in the log file suggests, the error comes from the Pronunciation class in kalpy's lexicon.py:
@dataclassy.dataclass
class Pronunciation:
    """
    Data class for storing information about a particular pronunciation
    """

    orthography: str
    pronunciation: str
    probability: typing.Optional[float]
    silence_after_probability: typing.Optional[float]
    silence_before_correction: typing.Optional[float]
    non_silence_before_correction: typing.Optional[float]
    disambiguation: typing.Optional[int]
Optional[X] only states that a field may hold either an X or None; it does not give the field a default value. Because none of these fields has a default, each one must be passed explicitly (even if only as None) when a Pronunciation is constructed, and the call KalpyPronunciation(w, pron[0]) in alignment.py passes only the first two.
The simplest fix is to default these attributes to None:
@dataclassy.dataclass
class Pronunciation:
    """
    Data class for storing information about a particular pronunciation
    """

    orthography: str
    pronunciation: str
    probability: typing.Optional[float] = None
    silence_after_probability: typing.Optional[float] = None
    silence_before_correction: typing.Optional[float] = None
    non_silence_before_correction: typing.Optional[float] = None
    disambiguation: typing.Optional[int] = None
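The effect of the missing defaults can be reproduced in isolation. The sketch below is only an illustration: it uses the standard-library dataclasses module instead of dataclassy (an assumption made to keep it self-contained), hypothetical class names, a reduced set of fields, and placeholder strings for the word and its pronunciation.

```python
import dataclasses
import typing


@dataclasses.dataclass
class PronunciationNoDefaults:
    """Hypothetical reduced copy: Optional fields without default values."""

    orthography: str
    pronunciation: str
    probability: typing.Optional[float]
    disambiguation: typing.Optional[int]


@dataclasses.dataclass
class PronunciationWithDefaults:
    """Hypothetical reduced copy: Optional fields defaulting to None."""

    orthography: str
    pronunciation: str
    probability: typing.Optional[float] = None
    disambiguation: typing.Optional[int] = None


def try_construct(cls):
    """Call cls with two arguments only, as alignment.py does.

    Returns the new instance, or the TypeError raised by the call.
    """
    try:
        return cls("word", "w er d")  # placeholder orthography/pronunciation
    except TypeError as error:
        return error


failed = try_construct(PronunciationNoDefaults)
worked = try_construct(PronunciationWithDefaults)

print(type(failed).__name__, "-", failed)
print(worked)
```

Under these assumptions, the first class rejects the two-argument call while the second accepts it and leaves the remaining fields as None.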
Once this first bug is fixed, a second one appears.
(The --clean option is deliberately omitted from the command below; the reason is explained further down.)
(mfa) [sloppine@zained mfa]$ mfa align_one --g2p_model_path ~/Documents/MFA/pretrained_models/g2p/english_us_mfa.zip ~/Documents/Alignement-force/data/Homemade/test_terraria.wav ~/Documents/Alignement-force/data/Homemade/test_terraria.lab english_mfa english_mfa ~/Documents/Alignement-force/mfa/dataPR/
Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x7191eb8f3550>>
Traceback (most recent call last):
File "/home/sloppine/.conda/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/command_line/mfa.py", line 107, in history_save_handler
raise self.exception
File "/home/sloppine/.conda/envs/mfa/bin/mfa", line 10, in <module>
sys.exit(mfa_cli())
File "/home/sloppine/.conda/envs/mfa/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/sloppine/.conda/envs/mfa/lib/python3.10/site-packages/rich_click/rich_command.py", line 126, in main
rv = self.invoke(ctx)
File "/home/sloppine/.conda/envs/mfa/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/sloppine/.conda/envs/mfa/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/sloppine/.conda/envs/mfa/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/home/sloppine/.conda/envs/mfa/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
File "/home/sloppine/.conda/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/command_line/align_one.py", line 187, in align_one_cli
ctm = align_utterance_online(
File "/home/sloppine/.conda/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/online/alignment.py", line 58, in align_utterance_online
lexicon_compiler.add_pronunciation(KalpyPronunciation(w, pron[0]))
File "/home/sloppine/.conda/envs/mfa/lib/python3.10/site-packages/kalpy/fstext/lexicon.py", line 640, in add_pronunciation
self._fst.add_arc(
File "extensions/_pywrapfst.pyx", line 2113, in _pywrapfst.MutableFst.add_arc
TypeError: an integer is required
The add_arc call that leads to the exception is:
self._fst.add_arc(
    self.non_silence_state,  # this is None, hence the TypeError
    pywrapfst.Arc(
        arc.ilabel,
        word_symbol,
        pywrapfst.Weight(
            self._fst.weight_type(), pron_cost + non_silence_before_cost
        ),
        arc.nextstate + start_index,
    ),
)
The error comes from the same file (lexicon.py).
When self._fst is loaded through load_l_from_file, silence_state and non_silence_state are left uninitialized, which leads to this error when a new pronunciation generated by the G2P model is added:
def load_l_from_file(
    self,
    l_fst_path: typing.Union[pathlib.Path, str],
) -> None:
    """
    Read g.fst from file

    Parameters
    ----------
    l_fst_path: :class:`~pathlib.Path` or str
        Path to read HCLG.fst
    """
    self._fst = pynini.Fst.read(str(l_fst_path))
This does not happen when self._fst is built from scratch by create_fsts (see below):
def create_fsts(self, phonological_rule_fst: pynini.Fst = None):
    if self._fst is not None and self._align_fst is not None:
        return
    initial_silence_cost = 0
    initial_non_silence_cost = 0
    if self.initial_silence_probability:
        initial_silence_cost = -1 * math.log(self.initial_silence_probability)
        initial_non_silence_cost = -1 * math.log(1.0 - self.initial_silence_probability)
    final_silence_cost = 0
    final_non_silence_cost = 0
    if self.final_silence_correction:
        final_silence_cost = -math.log(self.final_silence_correction)
        final_non_silence_cost = -math.log(self.final_non_silence_correction)
    self.phone_table.find(self.silence_disambiguation_symbol)
    phone_eps_symbol = self.phone_table.find("<eps>")
    self.word_table.find(self.silence_word)
    self._fst = pynini.Fst()
    self._align_fst = pynini.Fst()
    self.start_state = self._fst.add_state()
    self._align_fst.add_state()
    self._fst.set_start(self.start_state)
    self.non_silence_state = self._fst.add_state()  # INITIALIZED HERE
    self._align_fst.add_state()
    self.silence_state = self._fst.add_state()  # INITIALIZED HERE
    self._align_fst.add_state()
Therefore, silence_state and non_silence_state must also be initialized during loading as follows:
def load_l_from_file(
    self,
    l_fst_path: typing.Union[pathlib.Path, str],
) -> None:
    """
    Read g.fst from file

    Parameters
    ----------
    l_fst_path: :class:`~pathlib.Path` or str
        Path to read HCLG.fst
    """
    self._fst = pynini.Fst.read(str(l_fst_path))
    self.non_silence_state = 1  # INITIALIZED NOW
    self.silence_state = 2  # INITIALIZED NOW
We set non_silence_state to 1 and silence_state to 2 to match the IDs assigned by add_state in create_fsts. add_state returns the next unused state ID (the last ID plus one), so the start state gets 0, non_silence_state gets 1, and silence_state gets 2.
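The numbering argument can be checked with a small stand-in. The sketch below uses a hypothetical TinyFst class (not the pywrapfst or pynini API) whose add_state hands out consecutive IDs and whose add_arc rejects a non-integer source state; it only mirrors the behaviour described in this issue under that assumption.

```python
class TinyFst:
    """Hypothetical stand-in for a mutable FST; not the real pywrapfst class."""

    def __init__(self):
        self.arcs = {}  # state ID -> list of arcs leaving that state

    def add_state(self) -> int:
        # The new ID is the number of existing states, i.e. last ID + 1.
        state_id = len(self.arcs)
        self.arcs[state_id] = []
        return state_id

    def add_arc(self, state, arc) -> None:
        # Mimic the failure from the traceback when the state is None.
        if not isinstance(state, int):
            raise TypeError("an integer is required")
        self.arcs[state].append(arc)


# Same order of add_state calls as in create_fsts.
fst = TinyFst()
start_state = fst.add_state()
non_silence_state = fst.add_state()
silence_state = fst.add_state()
print(start_state, non_silence_state, silence_state)

# A lexicon loaded from disk never ran the calls above, so its attribute
# is still None and adding an arc from it fails.
uninitialized_state = None
try:
    fst.add_arc(uninitialized_state, "placeholder arc")
    error_message = None
except TypeError as error:
    error_message = str(error)
print(error_message)

# With the state set to 1, as in the proposed fix, the arc is accepted.
fst.add_arc(1, "placeholder arc")
```

Hard-coding 1 and 2 relies on the loaded L.fst having been written by create_fsts with this same order of add_state calls.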
Back to why we left out --clean: once the first bug is fixed, the command works if --clean is passed. We suspect this is because --clean deletes temporary files, including the cached L.fst, as your documentation describes, so the lexicon FST is rebuilt from scratch rather than loaded from disk. The price is the performance lost by rebuilding it on every run.
Finally, we know that this fix belongs in the kalpy library rather than in Montreal Forced Aligner itself, but we could not find the kalpy repository online. Since you are the creator and maintainer of kalpy on PyPI and conda-forge, we took the liberty of posting the issue and the fix here.
Debugging checklist
[Y] Have you read the troubleshooting page (https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/troubleshooting.html) and searched the documentation to ensure that your issue is not addressed there?
[Y] Have you updated to latest MFA version (check https://montreal-forced-aligner.readthedocs.io/en/latest/changelog/changelog_3.0.html)? What is the output of mfa version?
[Y] Have you tried rerunning the command with the --clean flag?