MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

[BUG] G2P Separates Diacritics From Attached Symbols (Also Attaching "Sil") #846

Open NataliaShmueli opened 2 weeks ago

NataliaShmueli commented 2 weeks ago

Debugging checklist

[x] Have you read the troubleshooting page (https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/troubleshooting.html) and searched the documentation to ensure that your issue is not addressed there?
[x] Have you updated to latest MFA version (check https://montreal-forced-aligner.readthedocs.io/en/latest/changelog/changelog_3.0.html)? What is the output of mfa version?
[x] Have you tried rerunning the command with the --clean flag?

Describe the issue

In 3.2.0, G2P separates accent marks (diacritics) from the consonants and vowels they are attached to, both with and without the --phonetisaurus flag.

For Reproducing your issue Please fill out the following:

  1. Corpus structure
    • What language is the corpus in? N/A
    • How many files/speakers? N/A
    • Are you using lab files or TextGrid files for input? N/A
  2. Dictionary
    • Are you using a dictionary from MFA? If so, which one? Personal
    • If it's a custom dictionary, what is the phoneset? X-SAMPA
  3. Acoustic model
    • If you're using an acoustic model, is it one downloaded through MFA? If so, which one? N/A
    • If it's a model you've trained, what data was it trained on? N/A

Log file Please attach the log file for the run that encountered an error (by default these will be stored in ~/Documents/MFA).

Desktop (please complete the following information):

Additional context Add any other context about the problem here.

Incorrect example: image

Correct example from 3.1.3: image
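
Since the screenshots may not render for everyone, here is a rough Python sketch of the kind of split I mean. It is not MFA code and I have not confirmed this is what 3.2.0 does internally. It assumes the diacritics are Unicode combining characters (as in NFD-normalized text), and the helper names `split_code_points` and `split_keeping_marks` are made up for illustration.

```python
import unicodedata


def split_code_points(text):
    """Naive split: every code point becomes its own symbol."""
    return list(text)


def split_keeping_marks(text):
    """Split into symbols, attaching combining marks to the preceding base character."""
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a nonzero class for combining marks
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# "f\u00fcr" is German "für"; NFD decomposes the umlaut into u + U+0308
word = unicodedata.normalize("NFD", "f\u00fcr")

naive = split_code_points(word)       # diaeresis ends up as a separate symbol
attached = split_keeping_marks(word)  # diaeresis stays on the "u"

print(len(naive), len(attached))
```

The naive split gives one more symbol than the attached split, which matches the behavior I am seeing in 3.2.0 compared with 3.1.3.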

NataliaShmueli commented 2 weeks ago

Ah, another thing: for some reason, you cannot point to a .txt file in 3.2.0. Example:

mfa g2p K:\Dictionaries\German\OOVS\MagicHub\oovs_found_GermanDictionary.txt K:\Dictionaries\German\GermanG2P.zip K:\Dictionaries\German\OOVS\MagicHub\MagicHubOutput.txt