MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

[BUG] <eps> is rendered as spn not sil (silence) #705

Closed: danielanphd closed this issue 10 months ago

danielanphd commented 10 months ago

Debugging checklist

[ ] Have you updated to latest MFA version? No. I had to downgrade to 2.2.17 (see issue #704).
[ ] Have you tried rerunning the command with the --clean flag? Yes.

Describe the issue

The first line of korean_mfa.dict is "<eps> 1.0 0.0 0.0 0.0 sil", so <eps> should be rendered as sil. However, when I run the alignment, the generated TextGrid has spn wherever I put <eps>. This makes the alignment incorrect, because the aligner treats the end of the previous word as spn.
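To make the dictionary entry quoted above easier to read, here is a minimal Python sketch that splits such a line into its parts. It assumes the layout MFA documents for dictionaries with probabilities (word, pronunciation probability, probability of following silence, two correction terms for preceding silence and non-silence, then the phones). The function name `parse_dictionary_line` is hypothetical and is not part of MFA.

```python
def parse_dictionary_line(line):
    """Split one dictionary line into (word, leading numeric fields, phones).

    Assumption: up to four numeric fields may follow the word before the
    phones start; lines without probabilities are handled as well.
    """
    fields = line.split()
    word, rest = fields[0], fields[1:]
    numbers = []
    while rest and len(numbers) < 4:
        try:
            numbers.append(float(rest[0]))
        except ValueError:
            break  # first non-numeric field: the phones start here
        rest = rest[1:]
    return word, numbers, rest


word, numbers, phones = parse_dictionary_line("<eps> 1.0 0.0 0.0 0.0 sil")
print(word, numbers, phones)  # <eps> [1.0, 0.0, 0.0, 0.0] ['sil']
```

Read this way, the entry maps <eps> to the single phone sil, which is why sil is the label expected in the output.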

For reproducing your issue

  1. Corpus structure
    • What language is the corpus in? korean_mfa
    • How many files/speakers? 2
    • Are you using lab files or TextGrid files for input? lab
  2. Dictionary
    • Are you using a dictionary from MFA? If so, which one? korean_mfa
    • If it's a custom dictionary, what is the phoneset?
  3. Acoustic model
    • If you're using an acoustic model, is it one downloaded through MFA? If so, which one? korean_mfa
    • If it's a model you've trained, what data was it trained on?

danielanphd commented 10 months ago

I updated to the latest version and tried again, and the latest version still renders <eps> as spn instead of sil.
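One way to confirm which label ended up in the output is to scan the generated TextGrid for spn intervals. The following is a rough Python sketch, assuming the TextGrid is in Praat's long text format, where each interval lists `xmin`, `xmax`, and `text` in that order. The helper name `find_labels` and the sample timestamps are made up for illustration and do not come from the reported run.

```python
import re

# Matches one interval in a long-format TextGrid: xmin, xmax, then text.
# Tier-level xmin/xmax pairs are skipped because no text line follows them.
INTERVAL = re.compile(r'xmin = ([\d.]+)\s+xmax = ([\d.]+)\s+text = "([^"]*)"')


def find_labels(textgrid_text, label):
    """Return (start, end) for every interval whose text equals label."""
    return [
        (float(start), float(end))
        for start, end, text in INTERVAL.findall(textgrid_text)
        if text == label
    ]


# Hypothetical excerpt, not taken from the actual output files.
sample = '''
        intervals [1]:
            xmin = 0.0
            xmax = 0.25
            text = "spn"
        intervals [2]:
            xmin = 0.25
            xmax = 0.6
            text = "sil"
'''

print(find_labels(sample, "spn"))  # [(0.0, 0.25)]
```

For a real file, the text would be read from disk first (TextGrid files may be UTF-8 or UTF-16, so the encoding may need to be set explicitly).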