prosodylab / Prosodylab-Aligner

Python interface for forced audio alignment using HTK and SoX
http://prosodylab.org/tools/aligner/
MIT License

Corpus class: loading model separately from input data #73

Closed naktinis closed 4 years ago

naktinis commented 5 years ago

I would like to be able to pre-load the trained model into the Corpus class once (for performance reasons mostly, as it takes ~2 seconds) and reuse it with multiple inputs.

If I understand correctly, the Corpus class's __init__ method loads both the model and the input data at once. One easy option that would not break backwards compatibility is to move input loading into a separate method, and have the constructor call it only if dirname is passed in.

For example, here's one way to implement this:

[Image: screen shot 2018-09-17 at 17 14 38]
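
The code in the attached image is not available as text. As a rough illustration of the idea described above (load the model once in the constructor, move input loading into its own method, and only call it from the constructor when dirname is given), a minimal sketch might look like this. The class name ReusableCorpus, the model_path parameter, and both loader bodies are hypothetical placeholders, not the actual Prosodylab-Aligner code:

```python
class ReusableCorpus:
    """Hypothetical stand-in for the aligner's Corpus class (not the real code)."""

    def __init__(self, model_path, dirname=None):
        # Assumption: model loading is the slow step, so do it once per instance.
        self.model_loads = 0
        self.model = self._load_model(model_path)
        self.dirname = None
        self.inputs = []
        # Backwards compatible: if dirname is passed, load the inputs right away,
        # mirroring a constructor that loads model and data together.
        if dirname is not None:
            self.load_inputs(dirname)

    def _load_model(self, model_path):
        # Placeholder for the real (slow) model loading.
        self.model_loads += 1
        return {"path": model_path}

    def load_inputs(self, dirname):
        # Placeholder for reading the audio and label files found in dirname.
        self.dirname = dirname
        self.inputs = [f"{dirname}/example.wav"]
        return self.inputs


# Load the model once, then reuse it for several input directories.
corpus = ReusableCorpus("model.zip")
for directory in ("batch1", "batch2"):
    corpus.load_inputs(directory)
```

With this shape, existing callers that pass dirname to the constructor keep working, while new callers can construct the object once and call load_inputs repeatedly.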

Do you think this is something that makes sense? Or maybe I am missing some functionality that already makes such model reuse possible?

kylebgorman commented 5 years ago

I've never wanted to do that myself, but it is absolutely sensible if you'd use it. In general, Corpus.__init__ is way too long and should probably be broken into a bunch of method calls.

kylebgorman commented 4 years ago

While I agree this would be a good enhancement, I don't have the time or energy to implement it myself. Feel free to reopen if you generate a PR ;)