xiph / rnnoise

Recurrent neural network for audio noise reduction
BSD 3-Clause "New" or "Revised" License
4.05k stars 895 forks

Please can you explain the loss function and how it is derived in a little more detail #144

Open GSC-30212 opened 4 years ago

GSC-30212 commented 4 years ago

Hi @xiphmont, @pyu1538, @jmvalin,

Could you please explain the mathematics in the functions below (except binary_crossentropy), how they were derived, and the reasoning behind using them?

def my_crossentropy(y_true, y_pred): return K.mean(2*K.abs(y_true-0.5) * K.binary_crossentropy(y_pred, y_true), axis=-1)

def mymask(y_true): return K.minimum(y_true+1., 1.)

def msse(y_true, y_pred): return K.mean(mymask(y_true) * K.square(K.sqrt(y_pred) - K.sqrt(y_true)), axis=-1)

def mycost(y_true, y_pred): return K.mean(mymask(y_true) * (10*K.square(K.square(K.sqrt(y_pred) - K.sqrt(y_true))) + K.square(K.sqrt(y_pred) - K.sqrt(y_true)) + 0.01*K.binary_crossentropy(y_pred, y_true)), axis=-1)

def my_accuracy(y_true, y_pred): return K.mean(2*K.abs(y_true-0.5) * K.equal(y_true, K.round(y_pred)), axis=-1)

To be more specific:
1] Why is binary_crossentropy multiplied by '2*K.abs(y_true-0.5)'?
2] What is the purpose of masking?
3] How is mycost derived?
4] Why is msse used as a performance metric?
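
For reference, the functions above can be transcribed term by term into math. This is only a restatement of the code, not a derivation from the authors. The symbols are my own notation: y_i is y_true and \hat{y}_i is y_pred for output i, and m is mymask. It assumes the older Keras backend argument order binary_crossentropy(output, target), so that y_true acts as the target.

```latex
\begin{aligned}
m(y) &= \min(y + 1,\ 1) \\
d_i &= \sqrt{\hat{y}_i} - \sqrt{y_i} \\
\mathrm{BCE}(y, \hat{y}) &= -\,y \log \hat{y} - (1 - y)\log(1 - \hat{y}) \\
\texttt{my\_crossentropy} &= \operatorname{mean}_i \bigl[\, 2\,\lvert y_i - 0.5 \rvert \cdot \mathrm{BCE}(y_i, \hat{y}_i) \,\bigr] \\
\texttt{msse} &= \operatorname{mean}_i \bigl[\, m(y_i)\, d_i^{2} \,\bigr] \\
\texttt{mycost} &= \operatorname{mean}_i \bigl[\, m(y_i)\,\bigl( 10\, d_i^{4} + d_i^{2} + 0.01\,\mathrm{BCE}(y_i, \hat{y}_i) \bigr) \bigr] \\
\texttt{my\_accuracy} &= \operatorname{mean}_i \bigl[\, 2\,\lvert y_i - 0.5 \rvert \cdot \mathbf{1}\{\, y_i = \operatorname{round}(\hat{y}_i) \,\} \bigr]
\end{aligned}
```

Written this way, mycost is the msse term d_i^2 plus a fourth-power term weighted by 10 and a small cross-entropy term weighted by 0.01, all multiplied by the same mask.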

Thanusan19 commented 3 years ago

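For the question: 2] What is the purpose of masking? The masking sets the loss to 0 for targets whose ground truth is unreliable: the factor 2*K.abs(y_true-0.5) is 0 when y_true = 0.5, and mymask(y_true) is 0 when y_true = -1 (it is 1 for any y_true between 0 and 1). Thus, when we are not sure about the ground truth, that target contributes nothing to the loss and the model's weights are not tuned on it.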
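
A quick way to see which targets get zero weight is to evaluate the two factors in plain NumPy. This is a minimal sketch, not the project's training code: vad_weight is a hypothetical helper name for the factor 2*K.abs(y_true-0.5), the sample values are made up, and the np.maximum guard assumes the Keras backend's K.sqrt clips negative inputs to zero.

```python
import numpy as np

def mymask(y_true):
    # 0 where the target is -1, 1 where the target lies in [0, 1]
    return np.minimum(y_true + 1.0, 1.0)

def vad_weight(y_true):
    # Hypothetical name for the factor in my_crossentropy and my_accuracy:
    # 0 at y_true = 0.5, rising linearly to 1 at y_true = 0 or 1
    return 2.0 * np.abs(y_true - 0.5)

def msse(y_true, y_pred):
    # Assumption: negative inputs are clipped to 0 before the square root,
    # so a -1 target does not produce NaN
    diff = np.sqrt(np.maximum(y_pred, 0.0)) - np.sqrt(np.maximum(y_true, 0.0))
    return np.mean(mymask(y_true) * np.square(diff), axis=-1)

gain_targets = np.array([-1.0, 0.0, 0.5, 1.0])   # made-up gain targets
vad_targets = np.array([0.0, 0.25, 0.5, 1.0])    # made-up VAD targets

print(mymask(gain_targets))     # mask values: 0, 1, 1, 1
print(vad_weight(vad_targets))  # weights: 1, 0.5, 0, 1

# The first band is masked (target -1), so its prediction of 0.9 is ignored;
# the second band is predicted exactly, so the total is 0.
print(msse(np.array([-1.0, 0.25]), np.array([0.9, 0.25])))
```

So mymask only removes targets equal to -1, while the 0.5 case is handled by the separate weighting factor.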