How do you resample to 16000?

moshengmao commented 10 months ago

When I test on the VCTK-DEMAND dataset, I find that the WAV files in VCTK-DEMAND/test/ have a sample rate of 48000, but evaluation.py asserts:

assert sr == 16000

so I added lines to resample:

def evaluation(model_path, noisy_dir, clean_dir, save_tracks, saved_dir):

clean_audio, sr = sf.read(clean_path)
clean_audio = librosa.resample(clean_audio, sr, 16000)
metrics = compute_metrics(clean_audio, est_audio, 16000, 0)

```python
def enhance_one_track(model, audio_path, saved_dir, cut_len, n_fft=400, hop=100, save_tracks=False):
    noisy, sr = torchaudio.load(audio_path)
    # debug output: audio_path VCTK-DEMAND/test/noisy/p232_001.wav sr 48000
    noisy_np = noisy.numpy()
    noisy_resampled_np = librosa.resample(noisy_np, sr, 16000)
    noisy = torch.tensor(noisy_resampled_np)
    sr = 16000
    noisy = noisy.cuda().to(device)
```

and generated some WAV files. However, the audio quality of the output is very poor, and the speech is hard to make out. How do you resample to 16000? Is my resampling approach wrong?

The resulting metrics are:

pesq: 1.2306799195634508 csig: 1.6080942775665945 cbak: 2.1193723316105366 covl: 1.4202725754636616 ssnr: 0.6998261689532145 stoi: 0.6101097034995405

Johnsonabuse commented 2 months ago

Hello! I ran into this problem too. I changed the sample rate to 16000 and my results are similar to yours. Have you solved this problem yet?

BancoLin commented 1 month ago

Use ffmpeg to downsample audio files.
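
A minimal sketch of that approach in Python, assuming `ffmpeg` is installed and on the PATH. The `-ar` option sets the output sample rate. The helper names and the `VCTK-DEMAND_16k` output directory are hypothetical placeholders, not part of CMGAN, so adjust them to your own layout.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst, target_sr=16000):
    """Return the ffmpeg argument list that resamples src to target_sr Hz."""
    return [
        "ffmpeg",
        "-y",                   # overwrite dst if it already exists
        "-i", str(src),         # input file (48 kHz in VCTK-DEMAND/test/)
        "-ar", str(target_sr),  # output sample rate in Hz
        str(dst),
    ]


def downsample_dir(src_dir, dst_dir, target_sr=16000):
    """Resample every .wav in src_dir into dst_dir, keeping the file names."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(src_dir.glob("*.wav")):
        cmd = build_ffmpeg_cmd(src, dst_dir / src.name, target_sr)
        subprocess.run(cmd, check=True)  # raises if ffmpeg exits with an error


if __name__ == "__main__":
    # Hypothetical output location; the originals are left untouched.
    for split in ("noisy", "clean"):
        downsample_dir(f"VCTK-DEMAND/test/{split}", f"VCTK-DEMAND_16k/test/{split}")
```

Pointing evaluation.py at the resampled copies should then satisfy `assert sr == 16000` without any resampling code inside the script.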

GrantLau1226 commented 5 days ago

Hello! I ran into this problem too. I changed the sample rate to 16000 and my results are similar to yours. Have you solved this problem yet?

pesq: 2.2732677546519677 csig: 3.7115074899207685 cbak: 2.772968618267093 covl: 3.022599602088685 ssnr: 2.0236725814308003 stoi: 0.8991248361647437

These are my results.

ruizhecao96 / CMGAN

How do you resample to 16000? #41
