MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

MFA align output changes the original text #805

Open ponymhc opened 4 months ago

ponymhc commented 4 months ago

Debugging checklist

[x] Have you read the troubleshooting page (https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/troubleshooting.html) and searched the documentation to ensure that your issue is not addressed there?
[x] Have you updated to latest MFA version (check https://montreal-forced-aligner.readthedocs.io/en/latest/changelog/changelog_3.0.html)? What is the output of mfa version? 3.0.7
[x] Have you tried rerunning the command with the --clean flag? yes

Describe the issue

When I checked the MFA alignment output, I found that some words differ from the original text: the characters were converted from Simplified Chinese to Traditional Chinese. Here is the content of the .lab file:

霞浦县牙城镇乌岐,瓦窑村水位猛涨。

This is the output of the MFA alignment:

File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 4.150354
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 4.150354
        intervals: size = 8
        intervals [1]:
            xmin = 0.0
            xmax = 0.64
            text = "霞浦縣"
        intervals [2]:
            xmin = 0.64
            xmax = 0.99
            text = "牙城鎮"
        intervals [3]:
            xmin = 0.99
            xmax = 2.08
            text = "烏岐"
        intervals [4]:
            xmin = 2.08
            xmax = 2.42
            text = ""
        intervals [5]:
            xmin = 2.42
            xmax = 3.09
            text = "瓦窯村"
        intervals [6]:
            xmin = 3.09
            xmax = 3.53
            text = "水位"
        intervals [7]:
            xmin = 3.53
            xmax = 4.12
            text = "猛漲"
        intervals [8]:
            xmin = 4.12
            xmax = 4.150354
            text = ""
    item [2]:
        class = "IntervalTier"
        name = "phones"
        xmin = 0
        xmax = 4.150354
        intervals: size = 12
        intervals [1]:
            xmin = 0.0
            xmax = 0.64
            text = "spn"
        intervals [2]:
            xmin = 0.64
            xmax = 0.99
            text = "spn"
        intervals [3]:
            xmin = 0.99
            xmax = 2.08
            text = "spn"
        intervals [4]:
            xmin = 2.08
            xmax = 2.42
            text = ""
        intervals [5]:
            xmin = 2.42
            xmax = 3.09
            text = "spn"
        intervals [6]:
            xmin = 3.09
            xmax = 3.22
            text = "ʂ"
        intervals [7]:
            xmin = 3.22
            xmax = 3.26
            text = "w"
        intervals [8]:
            xmin = 3.26
            xmax = 3.33
            text = "ej˨˩˦"
        intervals [9]:
            xmin = 3.33
            xmax = 3.41
            text = "w"
        intervals [10]:
            xmin = 3.41
            xmax = 3.53
            text = "ej˥˩"
        intervals [11]:
            xmin = 3.53
            xmax = 4.12
            text = "spn"
        intervals [12]:
            xmin = 4.12
            xmax = 4.150354
            text = ""
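To make the mismatch easier to see, here is a minimal Python sketch that compares the .lab transcript with the labels on the "words" tier, character by character. It does not call MFA; the transcript and word labels are copied by hand from the report above, and the helper names (`strip_punctuation`, `changed_characters`) are hypothetical. It assumes the transcript contains only Han characters and punctuation, and that the only difference is a one-to-one character substitution.

```python
import re

# Transcript from the .lab file and the labels of the "words" tier,
# both copied by hand from the report (not read from disk).
lab_text = "霞浦县牙城镇乌岐,瓦窑村水位猛涨。"
textgrid_words = ["霞浦縣", "牙城鎮", "烏岐", "", "瓦窯村", "水位", "猛漲", ""]


def strip_punctuation(text):
    """Keep only characters in the basic CJK Unified Ideographs block.

    Assumption: the transcript holds only Han characters and punctuation.
    """
    return "".join(re.findall(r"[\u4e00-\u9fff]", text))


def changed_characters(original, aligned_words):
    """Return (index, original_char, output_char) for every substituted character."""
    source = strip_punctuation(original)
    output = "".join(aligned_words)
    if len(source) != len(output):
        # Assumption: substitutions are one-to-one, so lengths must match.
        raise ValueError("transcript and aligned words differ in length")
    return [
        (index, src, out)
        for index, (src, out) in enumerate(zip(source, output))
        if src != out
    ]


diffs = changed_characters(lab_text, textgrid_words)
for index, src, out in diffs:
    print(f"position {index}: {src!r} in .lab became {out!r} in the TextGrid")
```

For this file the sketch reports five substituted characters (县, 镇, 乌, 窑, 涨), one in each of the words that changed script.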

For Reproducing your issue Please fill out the following:

  1. Corpus structure
    • What language is the corpus in? Mandarin
    • How many files/speakers? only 1
    • Are you using lab files or TextGrid files for input? .lab
  2. Dictionary
    • Are you using a dictionary from MFA? If so, which one? I tried both mandarin_mfa and mandarin_china_mfa, but encountered the same issue.
    • If it's a custom dictionary, what is the phoneset?
  3. Acoustic model
    • If you're using an acoustic model, is it one downloaded through MFA? If so, which one? mandarin_mfa
    • If it's a model you've trained, what data was it trained on?

Log file Please attach the log file for the run that encountered an error (by default these will be stored in ~/Documents/MFA).

Desktop (please complete the following information):

Additional context Add any other context about the problem here.

Logible commented 2 months ago

I have the same problem.