yxlu-0102 / MP-SENet

Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
MIT License

NS and BE(SR) in one model design #31

Closed xxoospring closed 2 weeks ago

xxoospring commented 4 months ago

Is it possible to change the multiplicative mask design used for noise suppression/dereverberation to a residual design like the one used for bandwidth extension, e.g. by replacing the activation function of the last layer of the amplitude spectrum decoder with PReLU or LeakyReLU?

yxlu-0102 commented 4 months ago

In speech bandwidth extension, the residual design is typically applied to the logarithmic amplitude spectrum, which is equivalent to masking the amplitude spectrum. While it is theoretically possible to apply the residual design directly to the amplitude spectrum, why not just use amplitude spectrum mapping or complex spectrum mapping, both of which are common practice?
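
To illustrate the equivalence mentioned above, here is a minimal NumPy sketch. It is not code from this repository: the array shapes, the random "decoder output", and all function names are assumptions made purely for illustration. Adding a residual `r` to the log-amplitude gives `exp(log|X| + r) = |X| * exp(r)`, i.e. a multiplicative mask `exp(r)` on the amplitude.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical noisy amplitude spectrum (frequency bins x frames), strictly positive.
noisy_mag = rng.uniform(0.1, 2.0, size=(201, 50))

# Hypothetical decoder output, standing in for what a network would predict.
decoder_out = rng.normal(0.0, 0.5, size=noisy_mag.shape)


def residual_on_log_amplitude(mag, residual):
    """Residual design in the log domain: log|Y| = log|X| + r."""
    return np.exp(np.log(mag) + residual)


def mask_on_amplitude(mag, residual):
    """Multiplicative design: |Y| = |X| * M, with the mask M = exp(r) > 0."""
    return mag * np.exp(residual)


def residual_on_amplitude(mag, residual):
    """Residual applied directly to the amplitude: |Y| = |X| + r.

    The sum can go negative, so this sketch clamps it at zero (an assumption;
    other ways of keeping the amplitude non-negative are possible).
    """
    return np.maximum(mag + residual, 0.0)


log_residual_out = residual_on_log_amplitude(noisy_mag, decoder_out)
mask_out = mask_on_amplitude(noisy_mag, decoder_out)
direct_residual_out = residual_on_amplitude(noisy_mag, decoder_out)

# The first two designs agree up to floating-point error.
print(np.allclose(log_residual_out, mask_out))
```

The direct residual on the amplitude is a different function of the decoder output and needs an extra constraint to keep the amplitude non-negative, which is one reason plain amplitude or complex spectrum mapping may be the simpler choice.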