DPTNet

A PyTorch implementation of dual-path transformer network (DPTNet) based speech separation on wsj0-2mix, as described in the paper "Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation", accepted at Interspeech 2020.

This implementation is based on DPRNN. Thanks to Yi Luo and ShiZiqiang for sharing their work.

File description:

optimizer_dptnet.py: a simple wrapper class for learning rate scheduling

transformer_improved.py: a PyTorch implementation of the improved transformer in the paper

dpt_net.py: the recommended starting point for reading the code
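To illustrate what a learning-rate scheduling wrapper like the one in optimizer_dptnet.py might look like, here is a minimal sketch. It is not the repository's code: the class name `WarmupScheduledOptimizer`, its parameters, and the default values are hypothetical, and the schedule shown is the generic Transformer warmup rule (linear increase, then inverse square-root decay), which may differ from the schedule used in the paper. The wrapper only assumes the wrapped optimizer exposes `param_groups`, `step()`, and `zero_grad()`, as PyTorch optimizers do.

```python
class WarmupScheduledOptimizer:
    """Hypothetical wrapper that sets the learning rate before each step.

    Assumption: uses the generic Transformer warmup schedule, not
    necessarily the one implemented in optimizer_dptnet.py.
    """

    def __init__(self, optimizer, d_model, warmup_steps=4000, k=1.0):
        self.optimizer = optimizer        # any object with param_groups, step(), zero_grad()
        self.d_model = d_model            # model dimension used to scale the rate
        self.warmup_steps = warmup_steps  # number of steps of linear warmup
        self.k = k                        # overall scaling factor
        self.step_num = 0

    def learning_rate(self, step_num):
        # Linear growth up to warmup_steps, then decay proportional to step_num ** -0.5.
        return self.k * self.d_model ** -0.5 * min(
            step_num ** -0.5, step_num * self.warmup_steps ** -1.5
        )

    def zero_grad(self):
        self.optimizer.zero_grad()

    def step(self):
        # Advance the step counter, write the new rate into every
        # parameter group, then let the wrapped optimizer update the weights.
        self.step_num += 1
        lr = self.learning_rate(self.step_num)
        for group in self.optimizer.param_groups:
            group["lr"] = lr
        self.optimizer.step()
```

In a PyTorch training loop, the wrapped object would typically be something like `torch.optim.Adam(model.parameters())`, and the loop would call `zero_grad()` and `step()` on the wrapper instead of on the optimizer directly.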

We obtain an SDR of 20.6 dB on wsj0-2mix and 16.8 dB on the LS-2mix dataset.

References

https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation

https://github.com/kaituoxu/Speech-Transformer/blob/master/src/transformer/optimizer.py

https://github.com/pytorch/pytorch/blob/eace0533985641d9c2f36e43e3de694aca886bd9/torch/nn/modules/transformer.py