breizhn / DTLN

TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-lite, ONNX and real-time audio processing support.
MIT License
585 stars, 161 forks

Pesq score and data duration #53

Open Fangbo0506 opened 2 years ago

Fangbo0506 commented 2 years ago

My PESQ only reaches 2.89 when training on the 50h dataset, which does not match the score the author reports. With the 500h dataset, however, it does reach the author's 3.04.

breizhn commented 2 years ago

The setup for the 50h models is not shared here directly. Creating these models involved some augmentation that is not part of this codebase. It was basically just randomising the order of the speech and noise files and randomising the SNR inside the training pipeline. Also, 4s samples in a batch of 16 were used instead of 10s samples in a batch of 32.
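
For readers trying to reproduce the 50h result, the augmentation described above could look roughly like the sketch below. This is not the author's pipeline, which is not in the repository. The function names, the 16 kHz sampling rate and the SNR range of -5 to 25 dB are all assumptions made for illustration. Only the 4s sample length and the batch size of 16 come from the comment.

```python
import numpy as np

FS = 16000                   # sampling rate in Hz (assumed)
SAMPLE_LEN_S = 4             # 4 s samples, as stated in the comment
BATCH_SIZE = 16              # batch of 16, as stated in the comment
SNR_RANGE_DB = (-5.0, 25.0)  # hypothetical range, not given in the thread


def mix_at_snr(speech, noise, snr_db, eps=1e-12):
    """Scale the noise so the speech-to-noise power ratio is snr_db, then add it."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10) + eps))
    return speech + gain * noise


def random_crop(signal, length, rng):
    """Cut a random segment of `length` samples, tiling the signal if it is too short."""
    if len(signal) < length:
        signal = np.tile(signal, int(np.ceil(length / len(signal))))
    start = rng.integers(0, len(signal) - length + 1)
    return signal[start:start + length]


def batch_generator(speech_clips, noise_clips, rng):
    """Yield (noisy, clean) batches; file order and SNR are re-drawn on every pass."""
    if len(speech_clips) < BATCH_SIZE or len(noise_clips) == 0:
        raise ValueError("need at least BATCH_SIZE speech clips and one noise clip")
    n = SAMPLE_LEN_S * FS
    while True:
        # Reshuffling both lists independently changes the speech/noise pairing each epoch.
        speech_order = rng.permutation(len(speech_clips))
        noise_order = rng.permutation(len(noise_clips))
        for i in range(0, len(speech_order) - BATCH_SIZE + 1, BATCH_SIZE):
            noisy, clean = [], []
            for j in range(i, i + BATCH_SIZE):
                s = random_crop(speech_clips[speech_order[j]], n, rng)
                d = random_crop(noise_clips[noise_order[j % len(noise_order)]], n, rng)
                noisy.append(mix_at_snr(s, d, rng.uniform(*SNR_RANGE_DB)))
                clean.append(s)
            yield (np.stack(noisy).astype(np.float32),
                   np.stack(clean).astype(np.float32))
```

Because the pairing and the SNR are drawn inside the generator, the network sees different mixtures in every epoch even though the underlying 50h of audio stays the same. Wrapping such a generator with `tf.data.Dataset.from_generator` would be one way to feed it into Keras training.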