Open smlcitias opened 2 years ago
April 29th
May 6th
DCCRN model

testing data | training data | STOI | SAR | SDR | SI_SNR
---|---|---|---|---|---
single speaker | single speaker | 0.92 | 11.98 | 11.98 | 11.42
single speaker | single and 2 speakers | 0.92 | 12.01 | 12.01 | 11.49
2 speaker mixture | single speaker | 0.84 | 7.30 | 7.30 | 6.51
2 speaker mixture | single and 2 speakers | 0.91 | 10.76 | 10.76 | 10.35

ConvTasnet model from wsj0-2mix

approach | STOI | SAR | SDR | SIR | SI_SNR
---|---|---|---|---|---
SE + SS | 0.82 | 8.37 | 7.57 | 19.53 | 6.93
SS | 0.69 | 0.60 | -0.94 | 10.30 | -1.39

ConvTasnet model from WHAM

approach | STOI | SAR | SDR | SIR | SI_SNR
---|---|---|---|---|---
SE + SS | 0.84 | 9.35 | 8.76 | 21.62 | 8.03
SS | 0.87 | 9.89 | 9.45 | 23.25 | 8.82

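
For reference, the SI_SNR column in the tables above is the scale-invariant signal-to-noise ratio in dB. Below is a minimal NumPy sketch of the commonly used zero-mean, projection-based definition. It is an illustration only and not necessarily the evaluation script that produced these numbers; the function name `si_snr` and the `eps` parameter are assumptions made for this example.

```python
import numpy as np


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D signals (illustrative helper)."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)

    # Remove the DC offset so the measure ignores constant shifts.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()

    # Project the estimate onto the reference to get the target component.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference

    # Everything that is not explained by the reference counts as noise.
    noise = estimate - target

    ratio = (np.dot(target, target) + eps) / (np.dot(noise, noise) + eps)
    return 10.0 * np.log10(ratio)
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate by a nonzero constant leaves the score essentially unchanged, which is why the metric suits separation outputs whose gain is arbitrary.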
May 20th
Speech Separation:
Train the model on input mixtures with 1 to 4 channels.
Multi-/Single-channel speech processing:
Plan to build the OR-PIT recursive procedure in ESPnet-SE
Target speech extraction
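
As a rough picture of the recursive procedure planned above, the sketch below peels off one speaker at a time and feeds the residual back in until a stopping rule fires. This is a simplified illustration, not the ESPnet-SE implementation: `recursive_separate`, `separate_one`, `should_stop`, and `max_speakers` are hypothetical names, and the toy separator and energy-threshold stopping rule stand in for a trained model and a learned stop criterion.

```python
import numpy as np


def recursive_separate(mixture, separate_one, should_stop, max_speakers=4):
    """Repeatedly split `mixture` into (one speaker, residual).

    separate_one: callable returning (speaker, residual) for a signal.
    should_stop:  callable returning True when the residual holds no more speech.
    Both are hypothetical placeholders for a trained model.
    """
    sources = []
    residual = mixture
    for _ in range(max_speakers):
        if should_stop(residual):
            break
        speaker, residual = separate_one(residual)
        sources.append(speaker)
    return sources, residual


def toy_separate_one(signal):
    # Toy stand-in: treat the largest-magnitude sample as "one speaker".
    idx = int(np.argmax(np.abs(signal)))
    speaker = np.zeros_like(signal)
    speaker[idx] = signal[idx]
    return speaker, signal - speaker


def toy_should_stop(signal, threshold=1e-6):
    # Toy stand-in: stop once the residual energy is negligible.
    return float(np.sum(signal ** 2)) < threshold
```

With the toy helpers, `recursive_separate(np.array([3.0, 0.0, 5.0, 2.0]), toy_separate_one, toy_should_stop)` extracts one component per pass, and the extracted components plus the residual sum back to the input.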
June 10th
April 15th
Progress
Interspeech 2022 ESPnet-SE++
Created pull requests in ESPnet
https://github.com/espnet/espnet/pull/4264
https://github.com/espnet/espnet/pull/4268
https://github.com/espnet/espnet/pull/4269
Joint training of iNeuBe and ASR/ST/SLU models
Preparing ICASSP videos
Universal speech separation and enhancement (SSE) model
Action items