This is the official implementation of “Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation”, accepted at NeurIPS 2024. Paper Link (arXiv)
🔥 October 2024: We have uploaded the pre-trained SepReformer-B model for WSJ0-2MIX to the models/SepReformer_Base_WSJ0/log/scratch_weight folder! You can directly test the model using the inference command below.
🔥 September 2024: Paper accepted at NeurIPS 2024 🎉
We plan to release the remaining configurations, especially for partially or fully overlapped and noisy-reverberant mixtures at a 16 kHz sampling rate for practical applications, within this year.
We propose SepReformer, a novel approach to speech separation using an asymmetric encoder-decoder network.
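To make the encode-separate-decode idea concrete, here is a minimal toy sketch of an asymmetric encoder-decoder pipeline for two-speaker separation. This is purely illustrative and is NOT the SepReformer architecture: all shapes, the random "separator", and the simple linear encoder/decoder are hypothetical stand-ins for the learned networks described in the paper.

```python
import numpy as np

# Illustrative sketch (NOT the SepReformer model): an encoder turns the
# waveform into frames of features, a separator estimates per-speaker
# features, and a lighter decoder reconstructs each source by overlap-add.
rng = np.random.default_rng(0)
win, hop, feat = 16, 8, 64          # frame length, hop size, feature dim

def encode(x, W_enc):
    """Frame the signal and project each frame to a feature vector."""
    n_frames = (len(x) - win) // hop + 1
    frames = np.stack([x[i * hop : i * hop + win] for i in range(n_frames)])
    return np.maximum(frames @ W_enc, 0.0)         # (n_frames, feat), ReLU

def separate(F, n_spk=2):
    """Toy separator: random softmax masks (a real model uses a deep net)."""
    logits = rng.standard_normal((n_spk,) + F.shape)
    masks = np.exp(logits) / np.exp(logits).sum(axis=0)
    return masks * F                                # (n_spk, n_frames, feat)

def decode(F_spk, W_dec):
    """Asymmetric (lighter) decoder: project back to frames, overlap-add."""
    frames = F_spk @ W_dec                          # (n_frames, win)
    out = np.zeros((len(frames) - 1) * hop + win)
    for i, f in enumerate(frames):
        out[i * hop : i * hop + win] += f
    return out

x = rng.standard_normal(160)                        # 10 ms of 16 kHz audio
W_enc = rng.standard_normal((win, feat)) * 0.1
W_dec = rng.standard_normal((feat, win)) * 0.1
sources = [decode(s, W_dec) for s in separate(encode(x, W_enc))]
print([s.shape for s in sources])                   # one waveform per speaker
```

The asymmetry in the paper refers to the separator operating on the encoder's downsampled features while the decoder side is kept comparatively light; the sketch only mirrors that division of labor, not the actual blocks.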
Demo Pages: Sample Results of speech separation by SepReformer
To train the network, simply run:

```shell
python run.py --model SepReformer_Base_WSJ0 --engine-mode train
```
To evaluate a model without saving the output as audio files:

```shell
python run.py --model SepReformer_Base_WSJ0 --engine-mode test
```
To evaluate and save the separated outputs as wav files:

```shell
python run.py --model SepReformer_Base_WSJ0 --engine-mode test_wav --out_wav_dir '/your/save/directory[optional]'
```
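Separated wav files produced by the test_wav mode are typically scored with scale-invariant SNR (SI-SNR), the standard metric on WSJ0-2MIX. The helper below is an independent sketch of that metric, not part of this repository's evaluation code.

```python
import numpy as np

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant SNR in dB between an estimated and a reference source."""
    est = est - est.mean()                     # remove DC offset
    ref = ref - ref.mean()
    # Project the estimate onto the reference (scale-invariant target).
    proj = (np.dot(est, ref) / (np.dot(ref, ref) + eps)) * ref
    noise = est - proj                         # residual distortion
    return 10.0 * np.log10((proj @ proj) / (noise @ noise + eps))

rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)               # 1 s reference at 16 kHz
clean = si_snr(2.0 * ref, ref)                 # scaling does not hurt the score
noisy = si_snr(ref + 0.1 * rng.standard_normal(16000), ref)
print(clean, noisy)
```

Because the metric projects the estimate onto the reference, rescaling a perfect estimate still yields a very high score, while added noise lowers it.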
If you find this repository helpful, please consider citing:
```bibtex
@misc{shin2024separate,
      title={Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation},
      author={Ui-Hyeop Shin and Sangyoun Lee and Taehan Kim and Hyung-Min Park},
      year={2024},
      eprint={2406.05983},
      archivePrefix={arXiv},
}
```