MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

[BUG] Transcript TextGrids' tier order is reversed or shuffled in output alignment TGs #797

Open mfaytak opened 2 months ago

mfaytak commented 2 months ago

Debugging checklist

[x] Have you read the troubleshooting page (https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/troubleshooting.html) and searched the documentation to ensure that your issue is not addressed there?
[ ] Have you updated to latest MFA version (check https://montreal-forced-aligner.readthedocs.io/en/latest/changelog/changelog_3.0.html)? What is the output of mfa version? 3.0.0a5, but I have also seen this with other versions, I think.
[x] Have you tried rerunning the command with the --clean flag?

Describe the issue

A small issue, and not a new one: it has come up sporadically, and I am only now thinking to file a bug report on it. When a TextGrid containing two tiers (corresponding to two different speakers speaking in the same session) is given as the transcript for alignment, both mfa train and mfa align produce an output alignment TextGrid whose tier order differs from that of the transcription input (and from the actual order of speakers in the audio file's channels). It is not clear whether this extends to cases with three or more tiers, since I do not have transcripts in that format.

Transcription input, with two-channel audio:

Screenshot 2024-04-21 at 6 06 44 PM

Alignment output, again with two-channel audio; alignment TG tier order is reversed relative to the transcription input:

Screenshot 2024-04-21 at 6 08 50 PM
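
For anyone who wants to check their own files for the same reordering without opening Praat, here is a minimal sketch that compares speaker order between an input and an output TextGrid. It assumes both files are in Praat's long text format (each tier has a `name = "..."` line) and that the aligned output labels its tiers as `<speaker> - words` and `<speaker> - phones`. That suffix convention is an assumption on my part, and the helper names (`tier_names`, `speaker_order`, `same_speaker_order`) are made up for illustration; none of this is part of MFA.

```python
import re

# Matches the `name = "..."` line that each tier carries in a
# long-format Praat TextGrid (assumption: files are in long format).
TIER_NAME = re.compile(r'^[ \t]*name = "(.*)"[ \t]*$', re.MULTILINE)


def tier_names(textgrid_text):
    """Return tier names in file order from long-format TextGrid text."""
    return TIER_NAME.findall(textgrid_text)


def speaker_order(names, suffixes=(" - words", " - phones")):
    """Collapse tier names to speakers, keeping first-seen order.

    The default suffixes are an assumption about how the aligned output
    labels its tiers; adjust them if your output differs.
    """
    order = []
    for name in names:
        for suffix in suffixes:
            if name.endswith(suffix):
                name = name[: -len(suffix)]
                break
        if name not in order:
            order.append(name)
    return order


def same_speaker_order(input_text, output_text):
    """True if speakers appear in the same order in both TextGrids."""
    return speaker_order(tier_names(input_text)) == speaker_order(
        tier_names(output_text)
    )
```

To use it, read the transcript TextGrid and the aligned TextGrid as text (with whatever encoding the files were saved in) and pass both strings to `same_speaker_order`. A `False` result means the speakers come out in a different order than they went in, which is the behaviour shown in the screenshots above.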

For Reproducing your issue

Please fill out the following:

  1. Corpus structure
    • What language is the corpus in? Kom
    • How many files/speakers? 17 speakers, 27 files
    • Are you using lab files or TextGrid files for input? TextGrids
  2. Dictionary
    • Are you using a dictionary from MFA? If so, which one? no, custom
    • If it's a custom dictionary, what is the phoneset? i i0 e eh a a0 o u ɨ ɨ0 ae oe ue ay ey oy uy ɨy zɨ vɨ b nb m nm f nf w nw t nt d nd s ns n nn l nl ch nch j nj y nny k nk g ng ŋ gh '
  3. Acoustic model
    • If you're using an acoustic model, is it one downloaded through MFA? If so, which one? no, self-trained
    • If it's a model you've trained, what data was it trained on? same corpus that it's run on

Log file

Please attach the log file for the run that encountered an error (by default these will be stored in ~/Documents/MFA).

No errors were thrown, but I am attaching all log files I could find. log.zip
