urinieto / harmonixset

The Harmonix Set: Beats, Downbeats, and Structural Annotations for Pop Music
MIT License

Window function of the melspectrogram #14

Closed lzqlzzq closed 6 months ago

lzqlzzq commented 6 months ago

Thanks for the great dataset! I am trying to recover the original waveform from the melspectrogram so that I can use a pre-trained audio encoder. I followed info.json and used librosa, but the inverted signal seems to have a weird periodic "envelope". I think it may be related to the window function of the mel transform. Here is my code:

librosa.feature.inverse.mel_to_audio(mel_spec,
                                     sr=info['SR'],
                                     n_fft=info['N_FFT'],
                                     hop_length=info['HOP_LENGTH'],
                                     fmin=info['MEL_FMIN'])

I also tried many window functions, but none of them works. What is the proper way to invert the melspectrogram into a signal?

urinieto commented 6 months ago

Hi there! Unfortunately, computing the inverse melspectrogram to obtain waveforms is not supported by the license under which these melspectrograms were released (see the LICENSE file in the tgz for more info). If you need different melspec parameters, I can compute a new set of features for you.

lzqlzzq commented 6 months ago

Thanks for your rapid reply! I would like to use this setup for my higher-time-resolution use case:

librosa.feature.melspectrogram(y=y, sr=24000, n_fft=2048, hop_length=1024, window='hann', center=True, pad_mode='constant', power=2.0, n_mels=256, fmin=30, fmax=12000)
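
As a rough illustration of what these parameters imply, the sketch below works out the frame step and the expected feature shape using plain arithmetic. It assumes librosa's documented framing for `center=True` (one frame per hop, plus one); the helper name `frame_timing` and the 10-second clip length are hypothetical and only serve the example.

```python
# Frame timing implied by the requested parameters (pure arithmetic, no audio needed).
SR = 24000         # sample rate from the request above
N_FFT = 2048       # analysis window length in samples
HOP_LENGTH = 1024  # step between consecutive frames in samples
N_MELS = 256       # number of mel bands


def frame_timing(num_samples, sr=SR, hop_length=HOP_LENGTH):
    """Return (n_frames, seconds_per_frame) for a centered STFT.

    Assumes center=True, where the signal is padded by n_fft // 2 on each
    side, so the frame count depends only on the hop length.
    """
    n_frames = 1 + num_samples // hop_length
    return n_frames, hop_length / sr


# Hypothetical 10-second clip, used only to show the resulting shape.
n_frames, step = frame_timing(num_samples=10 * SR)
print(f"mel shape: ({N_MELS}, {n_frames}), {step * 1000:.1f} ms per frame")
```

With these settings each frame advances by 1024 / 24000 s, roughly 42.7 ms, and consecutive 2048-sample windows overlap by half.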

urinieto commented 6 months ago

@lzqlzzq, apologies for the delay! Here are the new features!

lzqlzzq commented 6 months ago

Many thanks!