Debugging checklist
[x] Have you updated to latest MFA version? 2.0.6
[x] Have you tried rerunning the command with the --clean flag?
Describe the issue
When I run with --output_format=csv, some words that are separated by a space in the text wind up glommed together on the same row. For example, "bonjour j'organise" in the text became "bonjourj'organise" in the csv output.
For Reproducing your issue
mfa align testinput1 french_mfa french_mfa aligntest1csvoutput --clean --output_format=csv
testinput1 contains ftelpv29_chunk1.txt and ftelpv29_chunk1.wav (attached)
aligntest1csvoutput contains ftelpv29_chunk1.csv (attached)
Row 3 of the .csv has the glommed-together word.
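To check whether other rows are affected, something like the following could scan the output. This is only a rough sketch, not part of MFA: the helper names (`word_labels_from_csv`, `find_merged_words`, `check_files`) are made up for illustration, and the column names `Label` and `Type` and the `words` type value are assumptions about the csv layout that may need adjusting to match the actual header row. It flags any output label that is not a word in the transcript but is exactly two adjacent transcript words joined without a space.

```python
import csv
import io

# Punctuation stripped from the edges of transcript words; apostrophes are
# kept so that forms like "j'organise" stay intact.
EDGE_PUNCT = '.,!?;:"()'


def word_labels_from_csv(csv_text, label_col="Label", type_col="Type",
                         word_type="words"):
    """Return word labels from csv text.

    The column names and the word_type value are assumptions about the
    output layout. If the type column is absent, every row is kept.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[label_col] for row in reader
            if row.get(type_col, word_type) == word_type]


def find_merged_words(transcript, labels):
    """Return (label, (word1, word2)) for each label that equals two
    adjacent transcript words concatenated without a space."""
    words = [w.strip(EDGE_PUNCT) for w in transcript.lower().split()]
    words = [w for w in words if w]
    known = set(words)
    # Map every concatenation of neighbouring words to the original pair.
    joined = {a + b: (a, b) for a, b in zip(words, words[1:])}
    merged = []
    for label in labels:
        key = label.lower()
        if key not in known and key in joined:
            merged.append((label, joined[key]))
    return merged


def check_files(txt_path, csv_path):
    """Read a transcript and its csv output, then report merged words."""
    with open(txt_path, encoding="utf-8") as f:
        transcript = f.read()
    with open(csv_path, encoding="utf-8", newline="") as f:
        labels = word_labels_from_csv(f.read())
    return find_merged_words(transcript, labels)
```

For the files in this report, the call would be `check_files("testinput1/ftelpv29_chunk1.txt", "aligntest1csvoutput/ftelpv29_chunk1.csv")`. The check only catches pairs of adjacent words, so a row that merged three or more words would not be reported.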
Please fill out the following:
Corpus structure
What language is the corpus in? French
How many files/speakers? 1 file, 2 speakers
Are you using lab files or TextGrid files for input? lab files
Dictionary
Are you using a dictionary from MFA? If so, which one? french_mfa dictionary
If it's a custom dictionary, what is the phoneset?
Acoustic model
If you're using an acoustic model, is it one downloaded through MFA? If so, which one? french_mfa
If it's a model you've trained, what data was it trained on?
Log file
Please attach the log file for the run that encountered an error (by default these will be stored in ~/Documents/MFA).
attached
Desktop (please complete the following information):
OS: [e.g. Windows, OSX, Linux] Linux
Version [e.g. MacOSX 10.15, Ubuntu 20.04, Windows 10, etc] Ubuntu 20.04
Any other details about the setup (Cloud, Docker, etc)
Additional context
Add any other context about the problem here.
pretrained_aligner.log (github.com/MontrealCorpusTools/Montreal-Forced-Aligner/files/9902261/pretrained_aligner.log)
ftelpv29_chunk1.csv