sp-uhh / sgmse

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
MIT License

Multichannel input does not work #1

Closed (splinter21 closed this issue 2 years ago)

splinter21 commented 2 years ago

Traceback (most recent call last):
  File "enhancement.py", line 74, in <module>
    write(join(target_dir, filename), x_hat.cpu().numpy(), 16000)
  File "/mnt/storage00/liujing04/anaconda3/envs/t12/lib/python3.6/site-packages/soundfile.py", line 315, in write
    subtype, endian, format, closefd) as f:
  File "/mnt/storage00/liujing04/anaconda3/envs/t12/lib/python3.6/site-packages/soundfile.py", line 629, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "/mnt/storage00/liujing04/anaconda3/envs/t12/lib/python3.6/site-packages/soundfile.py", line 1184, in _open
    "Error opening {0!r}: ".format(self.name))
  File "/mnt/storage00/liujing04/anaconda3/envs/t12/lib/python3.6/site-packages/soundfile.py", line 1357, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
*** RuntimeError: Error opening 'xxx.wav': Format not recognised.

(Pdb) x_hat.shape
torch.Size([2, 1067315])

With soundfile, I can only write the audio file when the array has shape (1, xxxxx); the two-channel output above, with shape (2, 1067315), fails.

cobalamin commented 2 years ago

Could you try transposing the array given to soundfile.write, i.e. x_hat.cpu().numpy().T?
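For context, soundfile.write expects multichannel data laid out as (frames, channels), while x_hat here is (channels, frames). A (2, 1067315) array is therefore most likely read as 2 frames with 1067315 channels, which WAV cannot represent. Below is a minimal sketch of the transpose, assuming a channels-first NumPy array; the helper names to_frames_first and write_channels_first are hypothetical and not part of this repo or of soundfile.

```python
import numpy as np


def to_frames_first(x):
    """Convert channels-first audio (channels, frames) to (frames, channels).

    Hypothetical helper for illustration. 1-D (mono) input is returned
    unchanged, since soundfile accepts a flat array for single-channel audio.
    """
    x = np.asarray(x)
    if x.ndim == 1:
        return x
    if x.ndim != 2:
        raise ValueError("expected a 1-D or 2-D array, got %d-D" % x.ndim)
    return x.T


def write_channels_first(path, x, samplerate):
    """Write channels-first audio to disk with soundfile (hypothetical wrapper)."""
    # Imported here so the layout helper above works without soundfile installed.
    import soundfile as sf

    sf.write(path, to_frames_first(x), samplerate)


# Stand-in for x_hat.cpu().numpy(): two channels, one second at 16 kHz.
x_hat_np = np.zeros((2, 16000), dtype=np.float32)
frames_first = to_frames_first(x_hat_np)
print(frames_first.shape)
```

In enhancement.py this would amount to passing x_hat.cpu().numpy().T to write instead of x_hat.cpu().numpy().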

I’m guessing you’re running your own training and evaluation experiments, otherwise please note that this work, repo and especially the pretrained checkpoints are designed for single-channel audio only.