Hello andabi,

In the project, we used y_tilde_src1 and y_tilde_src2 as our predictions, where

    y_tilde_src1 = y_hat_src1 / (y_hat_src1 + y_hat_src2 + np.finfo(float).eps) * self.x_mixed
    y_tilde_src2 = y_hat_src2 / (y_hat_src1 + y_hat_src2 + np.finfo(float).eps) * self.x_mixed

and we used

    tf.reduce_mean(tf.square(self.y_src1 - pred_y_src1) + tf.square(self.y_src2 - pred_y_src2))

as our loss. However, I found that after the audio preprocessing,

    self.x_mixed != self.y_src1 + self.y_src2

(the two sides usually differ a lot), which means that even a perfect model cannot reach a loss of zero. This is probably why my training loss never drops below 3 or 4 when using this mask.

Could you please explain why this works in your case? Thank you very much.
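A minimal NumPy sketch of the issue described above. The variable names mirror the post; the "perfect model" is simulated by feeding the oracle magnitudes in as y_hat_src1 / y_hat_src2, and the complex STFTs are random stand-ins for real preprocessed audio:

```python
import numpy as np

eps = np.finfo(float).eps
rng = np.random.default_rng(0)

# Hypothetical complex STFTs of the two clean sources (shapes illustrative).
stft_src1 = rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8))
stft_src2 = rng.standard_normal((4, 8)) + 1j * rng.standard_normal((4, 8))

# Magnitude spectrograms, playing the role of self.y_src1 / self.y_src2.
y_src1 = np.abs(stft_src1)
y_src2 = np.abs(stft_src2)

# The mixture magnitude comes from the sum of the *complex* STFTs (the role
# of self.x_mixed). Phase cancellation makes it smaller than y_src1 + y_src2
# in general (triangle inequality: |S1 + S2| <= |S1| + |S2|).
x_mixed = np.abs(stft_src1 + stft_src2)

# Simulate a "perfect" model: the network outputs equal the oracle magnitudes.
y_hat_src1, y_hat_src2 = y_src1, y_src2

# The ratio mask from the post.
denom = y_hat_src1 + y_hat_src2 + eps
y_tilde_src1 = y_hat_src1 / denom * x_mixed
y_tilde_src2 = y_hat_src2 / denom * x_mixed

# The two masked predictions always sum back to x_mixed...
print(np.allclose(y_tilde_src1 + y_tilde_src2, x_mixed))  # True

# ...so whenever x_mixed != y_src1 + y_src2, the squared-error loss
# cannot reach zero, even for this perfect model.
loss = np.mean((y_src1 - y_tilde_src1) ** 2 + (y_src2 - y_tilde_src2) ** 2)
print(loss > 0)  # True: a nonzero loss floor
```

In other words, the two masked outputs are constrained to partition x_mixed, so the loss has a floor exactly equal to the mismatch between x_mixed and y_src1 + y_src2 as distributed by the mask.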
Best regards,
Lei