facebookresearch / music-translation

A UNIVERSAL MUSIC TRANSLATION NETWORK - a method for translating music across musical instruments and styles.

How did you extract pitch information to perform normalized cross-correlation evaluation in the paper? #3

Closed KinglittleQ closed 4 years ago

KinglittleQ commented 5 years ago

I know that you used librosa's piptrack to extract pitch, but I'm not sure how to turn its output into a single pitch value per frame:

pitches, magnitudes = librosa.piptrack(y=y, sr=sr, fmin=0, fmax=800)

This function returns two arrays, pitches and magnitudes. magnitudes[f, t] contains the magnitude of bin f at time t, and pitches[f, t] contains the instantaneous frequency of bin f at time t. Is it right to take the frequency of the bin with the maximum magnitude as the pitch? That is:

bins = np.argmax(magnitudes, axis=0)
p = [pitches[bins[t], t] for t in range(pitches.shape[1])]
p = np.array(p)

Or should I take the lowest nonzero frequency in each frame as the pitch (F0)?

pitches[pitches == 0] = np.inf
p = pitches.min(axis=0)
p[p == np.inf] = 0
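
For comparison, the two candidate strategies above can be written without a Python loop and without modifying pitches in place. This is only a sketch of the two options in the question, not the method used in the paper. The function names are hypothetical, the small arrays are made-up stand-ins for the piptrack output, and empty bins are assumed to be stored as 0, as the snippet above also assumes.

```python
import numpy as np

def pitch_by_max_magnitude(pitches, magnitudes):
    """Per frame, return the frequency of the bin with the largest magnitude."""
    bins = np.argmax(magnitudes, axis=0)      # strongest bin index for each frame
    frames = np.arange(pitches.shape[1])      # one column index per frame
    return pitches[bins, frames]              # fancy indexing replaces the loop

def pitch_by_lowest_frequency(pitches):
    """Per frame, return the lowest nonzero frequency, or 0 if the frame is empty."""
    masked = np.where(pitches > 0, pitches, np.inf)   # new array, input is untouched
    lowest = masked.min(axis=0)
    return np.where(np.isinf(lowest), 0.0, lowest)

# Toy arrays (3 bins x 2 frames) standing in for piptrack output.
pitches = np.array([[0.0, 110.0],
                    [220.0, 0.0],
                    [440.0, 330.0]])
magnitudes = np.array([[0.0, 0.2],
                       [0.5, 0.0],
                       [0.9, 0.7]])

print(pitch_by_max_magnitude(pitches, magnitudes))
print(pitch_by_lowest_frequency(pitches))
```

On the toy data the two strategies disagree in both frames, which is the ambiguity the question is about: the strongest bin can be a harmonic rather than the fundamental.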

adampolyak commented 4 years ago

We used the following script https://github.com/miromasat/pitch-detection-librosa-python for pitch extraction.
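
Once a per-frame pitch track has been extracted for both the input and the translated audio, the two tracks can be compared. The thread does not say how the normalized cross-correlation was computed, so the following is only a sketch under one common definition: mean-removed, zero-lag correlation divided by the product of the vector norms. The function name and the truncation to the shorter track are assumptions made for illustration.

```python
import numpy as np

def normalized_cross_correlation(a, b):
    """Zero-lag normalized cross-correlation of two 1-D pitch tracks, in [-1, 1]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = min(len(a), len(b))          # assumption: compare only the overlapping frames
    a, b = a[:n], b[:n]
    a = a - a.mean()                 # remove the mean so a constant offset is ignored
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:                   # a constant track has no variation to correlate
        return 0.0
    return float(np.dot(a, b) / denom)

# Made-up pitch tracks in Hz: the second follows the first one octave higher.
source_track = [220.0, 247.0, 262.0, 294.0]
output_track = [440.0, 494.0, 524.0, 588.0]
print(normalized_cross_correlation(source_track, output_track))
```

Under this definition a track that follows the same contour at a different scale still scores close to 1, while a flat or unrelated track scores near 0.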