MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

[BUG] MFA Align required Administrator Mode to Work On Windows #780

Closed x-hephestus closed 3 months ago

x-hephestus commented 3 months ago

Debugging checklist

[YES] Have you read the troubleshooting page (https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/troubleshooting.html) and searched the documentation to ensure that your issue is not addressed there?

[YES] Have you updated to latest MFA version (check https://montreal-forced-aligner.readthedocs.io/en/latest/changelog/changelog_3.0.html)? What is the output of mfa version?

[YES] Have you tried rerunning the command with the --clean flag?

Describe the issue

When running MFA Align on Windows without administrator mode, the symlink_to() calls in multiprocessing.py fail because Windows requires administrator privileges to create symbolic links:

multiprocessing.py line 613: job.construct_path(self.working_directory, "ali", "ark", dict_id).symlink_to(ali_path)

An except OSError handler catches this failure. However, the shutil.copyfile() call that is meant to replace the symbolic link has its source and destination arguments reversed:

multiprocessing.py line 623: shutil.copyfile(job.construct_path(self.working_directory, "ali", "ark", dict_id), ali_path)

This copyfile() call tries to copy the new ali .ark file onto the existing first-pass ali .ark file. That is the reverse of the symlink_to() call, which creates the new ali .ark file as a symbolic link to the existing first-pass ali .ark file. Because the source path passed to copyfile() is a file that has not been created, the result is a FileNotFoundError when running without administrator mode.

I've locally tested flipping the copyfile() arguments around and it allows the process to run successfully.
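To illustrate the argument order, here is a minimal standalone sketch. It does not use MFA's code; the file names are hypothetical stand-ins for the existing first-pass file and the new working-directory file. shutil.copyfile() takes (source, destination), so passing a path that does not exist yet as the first argument raises FileNotFoundError.

```python
import shutil
import tempfile
from pathlib import Path


def demo_reversed_copy() -> str:
    """Show what happens when copyfile() arguments are swapped."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for the existing first-pass alignment file.
        existing = Path(tmp) / "ali_first_pass.ark"
        existing.write_bytes(b"alignment data")

        # Hypothetical stand-in for the new file that should be created.
        new_path = Path(tmp) / "ali.ark"

        # Reversed order: the "source" does not exist yet, so this fails.
        try:
            shutil.copyfile(new_path, existing)
        except FileNotFoundError:
            outcome = "reversed: FileNotFoundError"
        else:
            outcome = "reversed: no error"

        # Correct order: existing file is the source, new path the destination.
        shutil.copyfile(existing, new_path)
        assert new_path.read_bytes() == b"alignment data"
        return outcome
```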

For reproducing your issue

Please fill out the following:

  1. Corpus structure

    • What language is the corpus in? English

    • How many files/speakers? One

    • Are you using lab files or TextGrid files for input? Txt

  2. Dictionary

    • Are you using a dictionary from MFA? If so, which one? English ARPA 3.0
  3. Acoustic model

    • If you're using an acoustic model, is it one downloaded through MFA? If so, which one? English ARPA 3.0

Desktop (please complete the following information):

lcavasso commented 3 months ago

@x-hephestus very cool that you were able to make a fix for this issue. I see this is your first issue on GitHub. Nice work!

If you want to go a step further, you could fork this project, make the change in your fork, and do a pull request to make it a little easier for mmcauliffe to implement. Ideally you could mention this issue in the pull request.

It would also help people like me, who had to spend some time finding the correct multiprocessing.py (it's the one at montreal_forced_aligner\alignment) and who might not understand exactly what needs to be changed. The change is to replace lines 622-636 with

except OSError:
    # Symlink creation failed (e.g. no administrator rights on Windows),
    # so copy each existing file (source) to its working-directory path (destination).
    shutil.copyfile(
        ali_path,
        job.construct_path(self.working_directory, "ali", "ark", dict_id)
    )
    shutil.copyfile(
        words_path,
        job.construct_path(self.working_directory, "words", "ark", dict_id)
    )
    shutil.copyfile(
        likes_path,
        job.construct_path(
            self.working_directory, "likelihoods", "ark", dict_id
        )
    )

(with appropriate indentation)
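For anyone who wants to try the symlink-then-copy pattern outside MFA, here is a rough self-contained sketch. The helper name link_or_copy and the file names are made up for illustration and are not part of MFA. It tries to create a symbolic link first and, if that raises OSError (as it can on Windows without administrator rights), copies the existing file to the new path instead.

```python
import shutil
import tempfile
from pathlib import Path


def link_or_copy(source: Path, destination: Path) -> str:
    """Make `destination` point at `source`, falling back to a plain copy.

    Hypothetical helper: returns "symlink" or "copy" to show which path ran.
    """
    try:
        # destination becomes a symbolic link to the existing source file.
        destination.symlink_to(source)
        return "symlink"
    except OSError:
        # Same direction as the link: existing file first, new path second.
        shutil.copyfile(source, destination)
        return "copy"


def demo_link_or_copy() -> tuple:
    """Run the helper on temporary files and report the mode and content."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "existing.ark"
        source.write_bytes(b"alignment data")
        destination = Path(tmp) / "new.ark"
        mode = link_or_copy(source, destination)
        return mode, destination.read_bytes()
```

Either way, reading the destination afterwards gives the same bytes as the source; only the mechanism differs.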