ijindal / Noisy_Dropout_regularization


Method and Results #2

Open xnejed07 opened 6 years ago

xnejed07 commented 6 years ago

Hi, I have read your paper describing the method. I found it really interesting and have a couple of theoretical questions. First, since the noise matrix is unconstrained (in our experiments it usually contains negative values), how do you obtain the normalized (0, 1) values shown in the figures? Do you apply a softmax to each row? Secondly, how does your model behave when applied to correctly labeled data, i.e., without noise?

ijindal commented 5 years ago

Hi, thanks for reading this work.

Yes, we apply a softmax to each row to normalize the learned noise matrix.
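For concreteness, row-wise softmax normalization of a learned noise matrix might look like the following NumPy sketch (the function name and example values are illustrative, not taken from the repository):

```python
import numpy as np

def row_softmax(W):
    """Map each row of the unconstrained noise matrix W to a
    stochastic row in (0, 1) via softmax, e.g. for plotting."""
    # Subtract the row-wise max before exponentiating, for numerical stability.
    shifted = W - W.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# An unconstrained 3x3 noise matrix with negative entries, as in the question.
W = np.array([[ 2.0, -1.0, -1.5],
              [-0.5,  1.8, -2.0],
              [-1.0, -0.7,  2.2]])
Q = row_softmax(W)
print(Q.sum(axis=1))  # each row sums to 1 -> [1. 1. 1.]
```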

With correct labels (0% noise), this model learns a very pessimistic noise model (an aggressive dropout). Therefore, we find it does not perform well on correctly labeled datasets. We have developed a new model that tackles all kinds of label noise; it is currently under review.

Hope this helps.

CompareSan commented 2 years ago

In the paper you don't say that you apply a softmax to every row. You say quite the opposite, quote: "the matrix W is unconstrained during optimization. Because the softmax layer implicitly normalizes the resulting conditional probabilities, there is no need to normalize W or force its entries to be nonnegative. This simplifies the optimization process by eliminating the normalization step described above."
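For reference, the architecture that quote describes, an unconstrained linear noise layer whose output is normalized by the final softmax, might be sketched in PyTorch as follows (the class name, dropout placement, and loss choice are illustrative assumptions, not the paper's verbatim code):

```python
import torch.nn as nn
import torch.nn.functional as F

class NoiseAdaptationLayer(nn.Module):
    """Unconstrained noise layer on top of the base model's class
    probabilities. The trailing softmax normalizes the output, so W
    needs neither row normalization nor nonnegative entries."""
    def __init__(self, num_classes, dropout_p=0.5):
        super().__init__()
        self.dropout = nn.Dropout(p=dropout_p)
        self.W = nn.Linear(num_classes, num_classes, bias=False)

    def forward(self, base_probs):
        # base_probs: softmax output of the base network, shape (batch, classes)
        noisy_logits = self.W(self.dropout(base_probs))
        # Log-probabilities over the noisy labels; pair with nn.NLLLoss.
        return F.log_softmax(noisy_logits, dim=1)
```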

CompareSan commented 2 years ago

You don't even say anything about the initialization of the weight matrix W. If you initialize it randomly, good luck with convergence.
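The paper does not specify an initialization, but a common remedy in related noise-adaptation work is to start W at (or near) the identity so the noise layer begins as a pass-through. A minimal sketch, reusing the hypothetical layer above:

```python
import torch
import torch.nn as nn

num_classes = 10
W = nn.Linear(num_classes, num_classes, bias=False)  # the noise matrix
with torch.no_grad():
    W.weight.copy_(torch.eye(num_classes))  # start as a near pass-through
```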