fgnt / nara_wpe

Different implementations of "Weighted Prediction Error" for speech dereverberation
MIT License
494 stars 164 forks

Help in running #73

Open WindowsNT opened 1 year ago

WindowsNT commented 1 year ago
import numpy as np
import soundfile as sf
from tqdm import tqdm
from nara_wpe.wpe import wpe
from nara_wpe.wpe import get_power
from nara_wpe.utils import stft, istft, get_stft_center_frequencies
from nara_wpe import project_root

stft_options = dict(size=512, shift=128)

channels = 2
sampling_rate = 48000
delay = 3
iterations = 5
taps = 10
alpha=0.9999

file_template = 'r:/reverb.wav'
signal_list = [
    sf.read(str(project_root / 'data' / file_template.format(d + 1)))[0]
    for d in range(channels)
]
y = np.stack(signal_list, axis=0)

Y = stft(y, **stft_options).transpose(2, 0, 1)

Z = wpe(
    Y,
    taps=taps,
    delay=delay,
    iterations=iterations,
    statistics_mode='full'
).transpose(1, 2, 0)
z = istft(Z, size=stft_options['size'], shift=stft_options['shift'])

from scipy.io import wavfile
wavfile.write('new_audio.wav', sampling_rate, z.T)
sf.write('new_audio.wav', z.T, sampling_rate)   

Result:


    Y = stft(y, **stft_options).transpose(2, 0, 1)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: axes don't match array

It looks nice, but I can't yet use it.

boeddeker commented 1 year ago

You changed the filename in

file_template = 'AMI_WSJ20-Array1-{}_T10c0201.wav'
signal_list = [
    sf.read(str(project_root / 'data' / file_template.format(d + 1)))[0]
    for d in range(channels)
]

to

file_template = 'r:/reverb.wav'

without touching the signal_list.

I guess the file is a multichannel file, and hence the code complains that the number of axes doesn't match. Could you check the shapes of y and stft(y, **stft_options)?
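
A rough sketch of that shape check, using a NumPy array as a stand-in for what sf.read returns on a stereo file (soundfile gives shape (samples, channels) for multichannel files and a 1-D array for mono). The helper name to_channels_first is hypothetical and not part of nara_wpe, and the 4-D shape used for the STFT output is an assumption for illustration only:

```python
import numpy as np


def to_channels_first(audio, channels):
    """Hypothetical helper: turn sf.read output into (channels, samples)."""
    audio = np.asarray(audio)
    if audio.ndim == 1:
        audio = audio[:, None]  # mono file: (samples,) -> (samples, 1)
    if audio.shape[1] != channels:
        raise ValueError(
            f"expected {channels} channels, file has {audio.shape[1]}"
        )
    return audio.T  # (samples, channels) -> (channels, samples)


channels = 2

# Stand-in for sf.read('r:/reverb.wav')[0] on a stereo file.
audio = np.zeros((48000, channels))

# What the signal_list loop builds when the same multichannel file is
# read once per channel: a 3-D array instead of (channels, samples).
y_wrong = np.stack([audio for _ in range(channels)], axis=0)
print(y_wrong.shape)  # (2, 48000, 2)

# Reading the file once and moving the channel axis to the front.
y = to_channels_first(audio, channels)
print(y.shape)  # (2, 48000)

# Assumed shape of an STFT of y_wrong: one axis more than the three that
# .transpose(2, 0, 1) names, so NumPy raises the ValueError from the traceback.
Y_wrong = np.zeros((channels, 10, channels, 257))
try:
    Y_wrong.transpose(2, 0, 1)
except ValueError as error:
    print("ValueError:", error)
```

If the file really holds all channels, something like y = to_channels_first(sf.read('r:/reverb.wav')[0], channels) could replace the signal_list loop, so that stft(y, **stft_options) is 3-D again and the transpose goes through.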

WindowsNT commented 1 year ago

I will try it with a mono file and let you know.