pollen-robotics / dtw

DTW (Dynamic Time Warping) python module
GNU General Public License v3.0

Dist in seconds ? #15

Closed by akofman 7 years ago

akofman commented 7 years ago

Hello, and thanks for this library. Sorry if it's a silly question, I'm just discovering what dynamic time warping is... I'd just like to know whether it's possible to get the delay in seconds between two copies of the same audio signal.

I tried this:

tmp1 = convert('test.wav', 0)
tmp2 = convert('test.wav', 1)

y1, sr1 = librosa.load(tmp1)
y2, sr2 = librosa.load(tmp2)

mfcc1 = librosa.feature.mfcc(y=y1, sr=sr1)  # compute MFCC features
mfcc2 = librosa.feature.mfcc(y=y2, sr=sr2)

dist, cost, acc, path = fastdtw(mfcc1.T, mfcc2.T, dist='euclidean')
print("The delay between the two signals is: ", dist)

where convert uses ffmpeg to create the delay:

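import tempfile
from subprocess import Popen, PIPE

def convert(afile, start):
    # Reserve a temporary .wav name; closing deletes the file so ffmpeg can create it.
    tmp = tempfile.NamedTemporaryFile(mode='r+b', prefix='offset_', suffix='.wav')
    tmp_name = tmp.name
    tmp.close()
    # Convert to mono, 8 kHz, 16-bit PCM, skipping the first `start` seconds.
    proc = Popen([
        'ffmpeg', '-loglevel', 'panic', '-i', afile,
        '-ac', '1', '-ar', '8000', '-ss', str(start),
        '-acodec', 'pcm_s16le', tmp_name
    ], stderr=PIPE)
    proc.communicate()
    if proc.returncode != 0:
        raise Exception("FFMpeg failed")
    return tmp_name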

I expected a result equal to 1 second, but dist doesn't seem to be what I think it is :/

pierre-rouanet commented 7 years ago

Hi,

Actually, DTW computes a distance that tolerates time stretching. If you have two identical signals, as in your example, but one is shifted, DTW will still give you a very low distance, because the warping is allowed to skip over the offset.
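To illustrate the point about shifted signals, here is a minimal sketch using a plain NumPy DTW rather than this library's implementation. The helper name `dtw_distance` and the synthetic Gaussian pulses are assumptions made for the example only. It compares the DTW distance with a sample-by-sample distance for a pulse and a shifted copy of it.

```python
import numpy as np

def dtw_distance(x, y):
    """Textbook O(len(x) * len(y)) DTW with |a - b| as the local cost."""
    n, m = len(x), len(y)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # Allowed steps: advance in x only, in y only, or in both.
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

# Hypothetical test signals: the same pulse, once centred at 60 and once at 100.
t = np.arange(200)
pulse = np.exp(-0.5 * ((t - 60) / 5.0) ** 2)
shifted = np.exp(-0.5 * ((t - 100) / 5.0) ** 2)

dtw_d = dtw_distance(pulse, shifted)          # warping absorbs the shift
pointwise_d = np.abs(pulse - shifted).sum()   # no warping allowed

print("DTW distance:      ", dtw_d)
print("Pointwise distance:", pointwise_d)
```

The DTW value stays close to zero while the pointwise distance is large, which is why dist says very little about the size of the delay.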

I guess you could use DTW to find the shift (using a cost), but in your case a cross-correlation should be better suited.
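A minimal sketch of the cross-correlation approach, assuming both signals are mono and share the same sample rate. The function name `estimate_delay_seconds` and the synthetic noise signal are hypothetical, chosen for illustration only; with real audio you would pass the arrays returned by librosa.load.

```python
import numpy as np

def estimate_delay_seconds(ref, other, sample_rate):
    """Return how many seconds `other` lags behind `ref` (negative if it leads)."""
    ref = np.asarray(ref, dtype=float)
    other = np.asarray(other, dtype=float)
    # Remove the DC offset so the correlation peak reflects the waveform shape.
    ref = ref - ref.mean()
    other = other - other.mean()
    corr = np.correlate(other, ref, mode="full")
    # Index 0 of the full correlation corresponds to a lag of -(len(ref) - 1).
    lag = int(np.argmax(corr)) - (len(ref) - 1)
    return lag / sample_rate

# Hypothetical example: 3 s of noise at 1 kHz, second copy starts 1 s later in the file.
sample_rate = 1000
rng = np.random.default_rng(0)
signal = rng.standard_normal(3 * sample_rate)
ref = signal[:2 * sample_rate]
other = signal[sample_rate:3 * sample_rate]

delay = estimate_delay_seconds(ref, other, sample_rate)
print("Estimated delay in seconds:", delay)
```

Because the second copy skips the first second of the file, it leads the reference and the estimate comes out negative. np.correlate computes the correlation directly, so for long recordings an FFT-based routine such as scipy.signal.correlate is likely to be faster.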

akofman commented 7 years ago

Thanks for the explanation! Indeed, cross-correlation is better suited; that's exactly what I did in the end.

Thanks again and have a nice day ;)