asteroid-team / asteroid

The PyTorch-based audio source separation toolkit for researchers
https://asteroid-team.github.io/
MIT License

LambdaOverlapAdd (see asteroid/dsp/overlap_add.py:31) used with file_separate produces an output audio file that is all noise #676

Closed blackhu closed 12 months ago

blackhu commented 12 months ago

🐛 Bug

  1. /overlap_add.py:8: window="hanning" should be window="hann".

  2. Following asteroid/dsp/overlap_add.py:31, wrapping a model in LambdaOverlapAdd and calling file_separate produces an output audio file that is all noise.

To Reproduce

Steps to reproduce the behavior (code sample and stack trace):

from asteroid.models import ConvTasNet
from asteroid.dsp.overlap_add import LambdaOverlapAdd
import soundfile as sf
import torch
import time
from asteroid.separate import file_separate

nnet = ConvTasNet(n_src=2)
nnet.cuda()
continuous_nnet = LambdaOverlapAdd(
    nnet=nnet,  # function to apply to each segment.
    n_src=2,  # number of sources in the output of nnet
    window_size=64000,  # Size of segmenting window
    hop_size=None,  # segmentation hop size
    window="hann",  # Type of the window (see scipy.signal.get_window)
    reorder_chunks=True,  # Whether to reorder each consecutive segment.
    enable_grad=False,  # Set gradient calculation on or off (see torch.set_grad_enabled)
)
continuous_nnet.cuda()
file_separate(continuous_nnet, "damo/speech_mossformer_separation_temporal_8k/examples/mix_speech1.wav")

mix_speech1.wav.zip

Expected behavior

Environment

Package versions

Run asteroid-versions and paste the output here:

0.6.0

Alternatively, if you cannot install Asteroid or have an old version that doesn't have the asteroid-versions script, please paste the output of:

asteroid==0.6.0
asteroid-filterbanks==0.4.0
pytorch-lightning==1.7.7
pytorch-ranger==0.1.1
rotary-embedding-torch==0.2.3
torch==2.0.1
torch-optimizer==0.1.0
torch-stoi==0.1.2
torchaudio==2.0.2
torchmetrics==0.7.3
torchvision==0.15.2
Note: you may need to restart the kernel to use updated packages.

Additional info

Additional info (environment, custom script, etc...)

mpariente commented 12 months ago

With a randomly initialized model, that's expected: ConvTasNet(n_src=2) has untrained weights, so the separated output is noise.
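To illustrate the reply above, here is a minimal pure-Python sketch of windowed overlap-add. It is not Asteroid's implementation: `hann`, `overlap_add`, and the two toy "models" are hypothetical names introduced for illustration. It assumes a hop of half the window and a periodic Hann window applied to each processed segment, and it leaves out chunk reordering. The point is that overlap-add only stitches segments together, so the output is exactly as good as the function applied to each segment.

```python
import math
import random


def hann(n):
    """Periodic Hann window of length n (hypothetical helper)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def overlap_add(signal, fn, window_size):
    """Apply fn to half-overlapping segments and sum the windowed results.

    Illustrative stand-in only; assumes hop = window_size // 2.
    """
    hop = window_size // 2
    window = hann(window_size)
    # Zero-pad so that the last segment is complete.
    n_segments = max(1, math.ceil((len(signal) - window_size) / hop) + 1)
    padded_len = (n_segments - 1) * hop + window_size
    padded = list(signal) + [0.0] * (padded_len - len(signal))
    out = [0.0] * padded_len
    for s in range(n_segments):
        start = s * hop
        segment = fn(padded[start:start + window_size])
        for i, value in enumerate(segment):
            out[start + i] += window[i] * value
    return out[:len(signal)]


mix = [math.sin(0.3 * t) for t in range(64)]
rng = random.Random(0)  # fixed seed, purely for repeatability

# An identity "model": the input comes back wherever two windows overlap.
identity_out = overlap_add(mix, lambda seg: seg, window_size=16)

# A stand-in for untrained weights: output unrelated to the input segment.
random_out = overlap_add(
    mix, lambda seg: [rng.uniform(-1.0, 1.0) for _ in seg], window_size=16
)
```

With the identity function, the samples covered by two windows match the input, because half-overlapping Hann windows sum to one. With the random function the result is noise, no matter how the segments are windowed and summed.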

blackhu commented 12 months ago

I get it. Replacing

nnet = ConvTasNet(n_src=2)

with

from asteroid.models import BaseModel
nnet = BaseModel.from_pretrained("mpariente/DPRNNTasNet-ks2_WHAM_sepclean")

solves the problem.