MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License
1.3k stars, 243 forks

Ability to Input Specific Subfolders Inside of a Corpus Directory Folder #619

Closed: NataliaShmueli closed this issue 1 year ago

NataliaShmueli commented 1 year ago

Is your feature request related to a problem? Please describe.

It would be nice to be able to link to sub-directories, similar to how sub-dictionaries can already be used. This could be useful for corpora that are spread across several locations, or for corpora organized by language for cross-lingual alignment, without having to change the entire folder structure.

Describe the solution you'd like

Similar to how a .yaml file links to the sub-dictionaries, maybe a .yaml file that links to the sub-directory locations would be viable.

Describe alternatives you've considered

I have tested cross-lingual models, which improve alignments for lower-resource languages, by putting the corpora in the same folder. However, this can take up a large amount of extra space, and it would be easier if the folders were simply linkable.

mmcauliffe commented 1 year ago

You can use symbolic links in the corpus directory that point to the folder containing the data elsewhere, rather than copying the data over. The corpus parsers follow symbolic links.

See https://learn.microsoft.com/en-us/windows/security/threat-protection/security-policy-settings/create-symbolic-links for Windows and https://kb.iu.edu/d/abbe for Unix.
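
For anyone who wants to script the symbolic link approach, here is a minimal sketch in Python using only the standard library. The function name `link_subcorpora` and the example paths are hypothetical and not part of MFA; the sketch only assumes that each source folder should appear inside the corpus directory under its own name. On Windows, creating symbolic links may require the privilege described in the Microsoft page linked above.

```python
from pathlib import Path


def link_subcorpora(corpus_dir, source_dirs):
    """Create one symbolic link per source folder inside corpus_dir.

    Hypothetical helper for illustration; it is not part of MFA.
    Returns the list of link paths that were created.
    """
    corpus = Path(corpus_dir)
    corpus.mkdir(parents=True, exist_ok=True)

    links = []
    for src in source_dirs:
        src = Path(src).resolve()
        if not src.is_dir():
            raise NotADirectoryError(f"Not a directory: {src}")
        # The link takes the name of the source folder, e.g. corpus/english.
        link = corpus / src.name
        # Raises FileExistsError if something with that name already exists.
        link.symlink_to(src, target_is_directory=True)
        links.append(link)
    return links


# Example call with hypothetical paths:
# link_subcorpora("/data/mfa_corpus", ["/mnt/a/english", "/mnt/b/spanish"])
```

The data stays where it is and only the links live in the corpus directory, so no extra disk space is used. The corpus directory can then be passed to MFA as usual.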