Open wwdok opened 2 years ago
I just swapped in another customized corpus, and this time `mfa align` worked successfully, so it seems the pretrained model is strict about the corpus. So what is the problem with my [myaishell3](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/files/7813023/myaishell3.zip) corpus, and what are the requirements for a customized corpus? Meanwhile I tried to train and adapt the acoustic model to a new corpus using the example Mandarin corpus, but it still has an issue.
You might want to try increasing the beam, either `mfa align ... --beam=40` or `mfa align ... --retry_beam=100`. The default beam is pretty strict: https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/configuration/global.html#global-options
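For anyone scripting this, here is a minimal sketch of how the suggested flags could be assembled into a command line from Python. The helper name `build_align_command` and all paths are hypothetical placeholders. Only `--beam`, `--retry_beam`, and `--clean` come from this thread, and the positional order (corpus, dictionary, acoustic model, output) is assumed from the usual `mfa align` usage.

```python
import shlex


def build_align_command(corpus_dir, dictionary_path, acoustic_model, output_dir,
                        beam=None, retry_beam=None, clean=False):
    """Build an `mfa align` argument list (hypothetical helper, not part of MFA)."""
    cmd = ["mfa", "align", corpus_dir, dictionary_path, acoustic_model, output_dir]
    # Only append the options that were explicitly requested.
    if beam is not None:
        cmd.append(f"--beam={beam}")
    if retry_beam is not None:
        cmd.append(f"--retry_beam={retry_beam}")
    if clean:
        cmd.append("--clean")
    return cmd


if __name__ == "__main__":
    # Placeholder paths; substitute your own corpus, lexicon, and model.
    cmd = build_align_command("my_corpus", "lexicon.txt", "mandarin_model",
                              "aligned_out", beam=40, retry_beam=100)
    # Print the command for review; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Printing the command before running it makes it easy to confirm which beam values are actually being passed.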
Thanks for your tips! But even when I set `mfa align --beam=1000 --retry_beam=1000`, it still does not work:
Meanwhile I tried to train and adapt the acoustic model to a new corpus. To make sure the corpus was not the problem, I used the example Mandarin corpus from here, but it still has an issue. I don't know whether the lexicon is the cause; the lexicon I used is the same as above:
Debugging checklist
[ ] Have you updated to latest MFA version? Yes
[ ] Have you tried rerunning the command with the `--clean` flag? Yes

Describe the issue
I am trying to validate and align my own dataset in the aishell3 format, but the validate step reports the error in the title, which then causes the alignment to fail.
For Reproducing your issue
This is the corpus and lexicon I used: myaishell3.zip. I use the pretrained Mandarin acoustic model, but this model should not affect validation. I suspect the error is caused by my corpus, and I would like to know what the requirements for a customized dataset are.
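One basic property of a custom corpus that can be checked independently of MFA is that every audio file has a transcript with the same base name in the same folder. The sketch below is a rough check under stated assumptions: audio is `.wav`, transcripts use `.lab`, `.txt`, or `.TextGrid`, and the function name `find_unpaired_audio` is made up for illustration. It does not reproduce what `mfa validate` does internally.

```python
from pathlib import Path

# Assumed transcript extensions, compared case-insensitively.
TRANSCRIPT_EXTS = {".lab", ".txt", ".textgrid"}


def find_unpaired_audio(corpus_dir):
    """Return .wav files that lack a same-named transcript in the same folder."""
    root = Path(corpus_dir)
    unpaired = []
    for wav in sorted(root.rglob("*.wav")):
        # Look for a sibling file sharing the stem and carrying a transcript extension.
        has_transcript = any(
            p.is_file()
            and p.stem == wav.stem
            and p.suffix.lower() in TRANSCRIPT_EXTS
            for p in wav.parent.iterdir()
        )
        if not has_transcript:
            unpaired.append(wav)
    return unpaired


if __name__ == "__main__":
    # "my_corpus" is a placeholder path.
    for path in find_unpaired_audio("my_corpus"):
        print(f"no transcript found for {path}")
```

A clean result only rules out missing transcripts. Out-of-vocabulary words and audio format problems would need separate checks.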
The log of `mfa validate`:

I checked `unalignable_files.csv`; its content is:

The log of `mfa align`:

Desktop (please complete the following information):