readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
http://www.readbeyond.it/aeneas/
GNU Affero General Public License v3.0

Better DTW class to support different algorithms #143

Open readbeyond opened 7 years ago

readbeyond commented 7 years ago

The file "dtw.py" should be refactored (and perhaps renamed, e.g. to "aligner.py") to support different algorithms in the future: for example, triangular global alignment kernels, or DTW variants other than Sakoe-Chiba.
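To make the refactoring idea above more concrete, here is a minimal pure-Python sketch of what a pluggable aligner interface could look like. It is not aeneas code: the names `Aligner`, `ExactDTW`, `SakoeChibaDTW`, and `ALGORITHMS` are hypothetical, the sequences are plain lists of numbers rather than MFCC frames, and the band is assumed to be a fixed-width stripe around the main diagonal (so it suits sequences of similar length).

```python
import math
from abc import ABC, abstractmethod


class Aligner(ABC):
    """Hypothetical common interface for alignment algorithms."""

    @abstractmethod
    def cost(self, a, b):
        """Return the accumulated alignment cost between sequences a and b."""


class ExactDTW(Aligner):
    """Classic DTW over the full cost matrix, with |x - y| as local distance."""

    def _in_window(self, i, j):
        # The exact algorithm visits every cell.
        return True

    def cost(self, a, b):
        n, m = len(a), len(b)
        # acc[i][j] = best accumulated cost aligning a[:i] with b[:j]
        acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
        acc[0][0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if not self._in_window(i, j):
                    continue  # cell outside the search window stays at infinity
                d = abs(a[i - 1] - b[j - 1])
                acc[i][j] = d + min(acc[i - 1][j],      # step in a only
                                    acc[i][j - 1],      # step in b only
                                    acc[i - 1][j - 1])  # step in both
        return acc[n][m]


class SakoeChibaDTW(ExactDTW):
    """DTW restricted to a band |i - j| <= radius (assumed band shape)."""

    def __init__(self, radius):
        self.radius = radius

    def _in_window(self, i, j):
        return abs(i - j) <= self.radius


# A registry like this would let callers select an algorithm by name.
ALGORITHMS = {"exact": ExactDTW, "sakoe_chiba": SakoeChibaDTW}
```

With this layout, a new variant only has to subclass `Aligner` (or override the window test) and register itself, for example `ALGORITHMS["sakoe_chiba"](radius=1).cost(a, b)`. Note that in this sketch a band narrower than the length difference of the two sequences makes the end cell unreachable, so the returned cost is infinite.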

stefan-balke commented 7 years ago

Maybe you want to check out our DTW implementation in librosa.

It uses numba to accelerate the alignment. Drawback: it relies on Anaconda. But it might be easier to deploy than pure C.

readbeyond commented 7 years ago

Hi again,

I have been following librosa for a while now, and yes, what I had in mind when opening this issue is similar to the API of dtw() you mention.

However, a dependency on numba is more than I would like to impose on aeneas users. In the past, I had scipy as a dependency for the WAVE read/write functions, and users complained that having 80+ MB for just a few functions was too much, so eventually I pulled the relevant sources out. Hence, I am afraid I will not take this route (e.g. depending on librosa) any time soon.

For aeneas v2 I would like to modularize the package better, so that e.g. the MFCC extractor, the DTW computation, or the eSpeak/Festival C extensions can be used separately and independently, and possibly packaged separately as well. In that context, extending the current C code with the new functionality would probably make sense.

In view of these thoughts, it would be interesting to know whether librosa might benefit (e.g., performance? Do you have benchmarks comparing pure C vs numba JIT-ed?) from having C extensions rather than JIT-ed functions for e.g. MFCC and DTW, assuming there is interest, of course. If so, I am open to collaborate on shared lightweight modules.

A final note: the cffi package might be a good alternative to the C extension mechanism. I spent some time investigating it, but I got stuck on its limited support for headers (though things might have changed in the meantime). The PyPy developers strongly suggested looking at it.


-- Alberto Pettarin
