Dual-path-RNN-Pytorch

A PyTorch implementation of "Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation".

If you have any questions, please open an issue.

If you find this project helpful, please consider giving it a star.

Demo Pages: Results of pure speech separation model

Plan

[x] 2020-02-01: Read the paper “Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation”. Zhihu article: "Reading notes: Dual-path RNN for Speech Separation". Blog article: "Reading notes: Dual-path RNN for speech separation". Both articles are walkthroughs of the paper. Questions and discussion are welcome.
[x] 2020-02-02: Completed the data preprocessing and dataset code. Dataset code: /data_loader/Dataset.py
[x] 2020-02-03: Completed the Conv-TasNet framework (updated /model/model.py, Trainer_Tasnet.py, Train_Tasnet.py)
[x] 2020-02-07: Completed the training code (updated /model/model_rnn.py). Test parameters and some details are still being adjusted.
[x] 2020-02-08: Fixed a bug in the code.
[x] 2020-02-11: Completed the testing code.

Dataset

We used the WSJ0 dataset for our training, validation, and test sets. Below are the data download link and the code for generating the mixed audio from WSJ0.

Training

Training for Conv-TasNet model

First, generate the scp files with the following command. Each line of an scp file pairs a file name with its path ("filename && path").
```
python create_scp.py
```
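To make the scp format concrete, here is a minimal sketch of writing and reading such a file. It assumes that "filename && path" means one line per audio file, with the file name, a space, and then the path, and that file names contain no whitespace. The helpers `write_scp` and `read_scp` are hypothetical names used only for illustration; they are not the code in create_scp.py.

```python
import os


def write_scp(wav_dir, scp_path):
    """Write one '<filename> <path>' line per .wav file found under wav_dir.

    Hypothetical helper: the line layout is an assumption about the scp format.
    """
    entries = []
    for root, _dirs, files in os.walk(wav_dir):
        for name in files:
            if name.lower().endswith(".wav"):
                entries.append((name, os.path.join(root, name)))
    entries.sort()  # stable ordering makes the file reproducible
    with open(scp_path, "w", encoding="utf-8") as f:
        for name, path in entries:
            f.write(f"{name} {path}\n")
    return len(entries)


def read_scp(scp_path):
    """Load an scp file back into a {filename: path} dictionary."""
    index = {}
    with open(scp_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split only once so that a path containing spaces stays intact.
            name, path = line.split(maxsplit=1)
            index[name] = path
    return index
```

A data loader could then look up each utterance by file name through the dictionary returned by `read_scp`, which is one common reason for keeping the name and the path on the same line.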
Then you can modify the training and model parameters through "config/Conv_Tasnet/train.yml".
```
cd config/Conv_Tasnet
vim train.yml
```
Then use the following command in the root directory to train the model.
```
python train_Tasnet.py --opt config/Conv_Tasnet/train.yml
```
Training for Dual Path RNN model

First, generate the scp files with the following command. Each line of an scp file pairs a file name with its path ("filename && path").
```
python create_scp.py
```
Then you can modify the training and model parameters through "config/Dual_RNN/train.yml".
```
cd config/Dual_RNN
vim train.yml
```
Then use the following command in the root directory to train the model.
```
python train_rnn.py --opt config/Dual_RNN/train.yml
```

Inference

Conv-TasNet

Before running inference, edit the default parameters in test_tasnet.py, including the test files and the model to test.

For multi-audio

```
python test_tasnet.py
```

For single-audio

```
python test_tasnet_wav.py
```

Dual-Path-RNN

Before running inference, edit the default parameters in test_dualrnn.py, including the test files and the model to test.

For multi-audio

```
python test_dualrnn.py
```

For single-audio

```
python test_dualrnn_wav.py
```

Pretrained Models

Conv-TasNet

Conv-TasNet model

Dual-Path-RNN

Dual-Path-RNN model

Result

Conv-TasNet

Final Results: 15.8690, which is about 0.57 higher than the 15.3 reported in the paper.

Dual-Path-RNN

Final Results: 18.98, which is 0.18 higher than the 18.8 reported in the paper.
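
The paper values quoted above (15.3 and 18.8) match the SI-SNR improvement (SI-SNRi) figures the paper reports in dB, so the results here are presumably SI-SNRi as well; that reading is an assumption, since this README does not name the metric. As a rough sketch of what the metric measures, the following pure-Python functions compute SI-SNR and its improvement over the unprocessed mixture. The function names are hypothetical and this is not the evaluation code used in this repository.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB.

    Both inputs are equal-length sequences of samples (assumed, not checked).
    """
    n = len(target)
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    # Remove the mean so that a constant offset does not affect the score.
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target to obtain the "signal" part.
    dot = sum(a * b for a, b in zip(e, t))
    target_energy = sum(b * b for b in t) + eps
    scale = dot / target_energy
    s_target = [scale * b for b in t]
    # Whatever is left after the projection counts as noise.
    noise = [a - s for a, s in zip(e, s_target)]
    signal_power = sum(s * s for s in s_target)
    noise_power = sum(x * x for x in noise) + eps
    return 10 * math.log10(signal_power / noise_power + eps)


def si_snr_improvement(estimate, mixture, target):
    """SI-SNR of the separated estimate minus SI-SNR of the raw mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)
```

Because the estimate is projected onto the target before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is why the metric is called scale-invariant.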

Reference

Luo Y, Chen Z, Yoshioka T. Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation. arXiv preprint arXiv:1910.06379, 2019.
Conv-TasNet code and Dual-RNN code

JusperLee / Dual-Path-RNN-Pytorch
