Hi,
So you want me to believe that simply adding an extra softmax layer with a learnable weight matrix gives an architecture that is robust to noisy labels??? Are you serious?
In fact, it doesn't work at all: it is the method that gives me the worst performance. The only variant that seems to perform better than simply training on the noisy labels is the one that puts a linear layer after the softmax with a non-learnable matrix W (the estimate of the true transition matrix obtained with CleanLab).
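To make that last variant concrete, here is a minimal NumPy sketch of a fixed linear layer applied after the softmax. It assumes `W[i, j]` approximates P(noisy label = j | true label = i) and that each row of `W` sums to 1. The function names and the toy numbers are illustrative only; they do not come from CleanLab or any other library, and the estimation of `W` itself is not shown.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def noisy_label_probs(logits, W):
    # Clean-class probabilities mapped through the fixed matrix W
    # (assumed: W[i, j] ~ P(noisy = j | true = i), rows sum to 1).
    return softmax(logits) @ W

def forward_corrected_nll(logits, noisy_labels, W, eps=1e-12):
    # Negative log-likelihood of the observed (noisy) labels
    # under the W-adjusted predictions.
    q = noisy_label_probs(logits, W)
    picked = q[np.arange(len(noisy_labels)), noisy_labels]
    return float(-np.log(picked + eps).mean())

# Toy values, purely hypothetical.
W = np.array([[0.8, 0.2],
              [0.3, 0.7]])
logits = np.array([[2.0, 0.0],
                   [0.0, 1.0]])
noisy_labels = np.array([0, 1])

q = noisy_label_probs(logits, W)
loss = forward_corrected_nll(logits, noisy_labels, W)
```

The learnable variant differs only in that W is trained together with the network (for instance, as a row-wise softmax over free parameters) instead of being held fixed.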
Bye