jixinya / EVP

Code for paper 'Audio-Driven Emotional Video Portraits'.

About Dynamic Time Warping preprocessing of MFCC features #6

Open DaddyJin opened 3 years ago

DaddyJin commented 3 years ago

Given two audio features of shape [N1, 28, 12] and [N2, 28, 12], could you please share demo code for aligning them with DTW? Training the disentanglement module uses many audio features. Did you align all of them to the minimum length? I am really curious about the details of the audio feature preprocessing. Thank you for your great work; I look forward to your reply!

cjerry1243 commented 2 years ago

I'm also interested in this process. @jixinya Do you plan to release the audio disentanglement module?

jixinya commented 2 years ago

I have released the training code. More details on the DTW step can be found in train/disenetanglement/dtw/MFCC_dtw.py.
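For readers who want a rough picture before opening that file, here is a minimal sketch of DTW alignment for two feature arrays of shape [N1, 28, 12] and [N2, 28, 12]. It is not the code from MFCC_dtw.py. The function names `dtw_path` and `align_to`, the Euclidean frame distance, and the choice to average all frames mapped to the same reference frame are assumptions made for illustration only.

```python
import numpy as np


def dtw_path(a, b):
    """Return the DTW warping path between a [N1, ...] and b [N2, ...]."""
    n, m = len(a), len(b)
    fa = a.reshape(n, -1)  # flatten each [28, 12] frame to a vector
    fb = b.reshape(m, -1)
    # Pairwise Euclidean distance between every frame of a and b.
    dist = np.linalg.norm(fa[:, None, :] - fb[None, :, :], axis=-1)

    # Accumulated cost with a one-cell border of infinities.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )

    # Backtrack from the last cell to the first along the cheapest steps.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    return path[::-1]


def align_to(a, b):
    """Warp b onto the time axis of a; the result has the shape of a."""
    aligned = np.zeros_like(a, dtype=float)
    counts = np.zeros(len(a))
    for i, j in dtw_path(a, b):
        aligned[i] += b[j]  # average all b frames matched to frame i
        counts[i] += 1
    return aligned / counts[:, None, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(size=(5, 28, 12))
    slow = np.repeat(ref, 2, axis=0)  # same content at half speed
    print(align_to(ref, slow).shape)
```

The sketch warps the second sequence onto the first, so the output length equals N1. Whether the repository aligns to the shortest clip, to a fixed reference, or pairwise is not stated in this thread. The distance matrix is built by broadcasting, which is fine for short clips but memory-hungry for long ones.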

pfeducode commented 1 year ago

I have released the training code. More details on DTW can be found in train/disenetanglement/dtw/MFCC_dtw.py.

How do I generate the landmark file?