Open tiesiogdvd opened 1 year ago

I am trying to use the library through JNI on Android. Following the example in issue #26, I have run into very high memory usage when calling spectrogramObj_spectrogram(...) to generate Mel spectrograms. For a 3-minute audio track of about 15 million samples with a slideLength of 512, roughly 1.4 GB is allocated at the start of the spectrogram computation, and the amount of memory used appears to scale linearly with the slide length. Is there a more memory-efficient way to get the Mel spectrograms without such large allocations?
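As a rough back-of-envelope sketch (my own illustrative numbers, assuming fft_length = 2**radix2_exp = 4096 and float32 output, not a statement about audioFlux internals): the number of analysis frames, and therefore whatever per-frame working buffers get allocated, grows as the slide length shrinks.

num_samples = 15_000_000                 # ~3 minutes of audio
fft_length = 4096                        # assumed: 2 ** radix2_exp with radix2_exp=12
num_mels = 128                           # assumed Mel bin count
for slide_length in (2048, 1024, 512):
    num_frames = (num_samples - fft_length) // slide_length + 1
    out_mb = num_frames * num_mels * 4 / 1e6   # size of the float32 output matrix alone
    print(f'slide_length={slide_length}: {num_frames} frames, ~{out_mb:.0f} MB output matrix')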
@tiesiogdvd The spectrogram of a full 3-minute signal is already very large. In practice it is usually computed in real time, e.g. on 128 ms of data every 32 ms. For long audio, you can process it in segments, e.g. compute the spectrogram every 2-3 seconds and splice the results together at the end:
import audioflux as af
import numpy as np

data_arr, sr = af.read(af.utils.sample_path('220'))
obj = af.MelSpectrogram(num=128, samplate=sr, radix2_exp=12, slide_length=1024)

seg_size = int(0.512 * sr)  # segment length in samples
mel_data_list = []
for i in range(0, len(data_arr), seg_size):
    # Back off by (fft_length - slide_length) samples so frames that
    # straddle the segment boundary are not lost.
    start_idx = max(0, i - obj.fft_length + obj.slide_length)
    end_idx = i + seg_size
    _data_arr = data_arr[start_idx:end_idx]
    if len(_data_arr) < obj.fft_length:
        break  # remaining tail is shorter than one FFT window
    feature = obj.spectrogram(_data_arr)
    feature = af.utils.power_to_db(feature)
    mel_data_list.append(feature)

mel_data_arr = np.hstack(mel_data_list)  # splice the segments along the time axis
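To check the splicing on a short file, one possible sanity check (my sketch, continuing from the variables above; depending on how the library handles the signal edges, the two results may differ slightly at segment boundaries) is to compare against a one-shot computation:

full_arr = af.utils.power_to_db(obj.spectrogram(data_arr))
print('segmented:', mel_data_arr.shape, 'one-shot:', full_arr.shape)
n = min(mel_data_arr.shape[1], full_arr.shape[1])
print('max abs diff over shared frames:',
      np.abs(mel_data_arr[:, :n] - full_arr[:, :n]).max())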