Closed ivallesp closed 8 years ago
Hi Iván,
regarding your first question, since you perform a softmax function on your output layer, the values have to sum up to 1 for each row. That is why you need 2 outputs. It is a little unintuitive but you figured out what to do.
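To make the point about softmax concrete, here is a minimal sketch in plain Python (not the Theano/Lasagne implementation; the `softmax` helper is written only for illustration). It shows that with two units the outputs share the probability mass, while with a single unit softmax can only ever return 1.0, so one softmax unit cannot carry any information.

```python
import math

def softmax(logits):
    """Return the softmax of a list of raw scores (numerically stabilised)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Two output units: the two probabilities share the mass and sum to 1.
two_units = softmax([0.3, 1.7])

# One output unit: softmax normalises the single value by itself,
# so the result is 1.0 regardless of the input.
one_unit_low = softmax([-5.0])
one_unit_high = softmax([5.0])
```

A single output unit would need a different nonlinearity (for example a sigmoid) together with a loss that accepts one probability per sample.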
Regarding ROC AUC, there is no straightforward way to use it as a cost function, since it is not really differentiable, and differentiability is necessary for backprop. There are proxy metrics for ROC AUC, but in my experience they are not worth the trouble: they are much more unstable than cross-entropy and did not lead to different results in the end.
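A small sketch of why ROC AUC gives backprop nothing to work with, assuming the usual pairwise-ranking definition of AUC (the helper functions and the toy data are made up for illustration). AUC depends only on the ordering of the scores, so nudging the scores without changing their order leaves it exactly the same, whereas cross-entropy responds to every small change.

```python
import math

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def binary_cross_entropy(labels, scores):
    """Mean negative log-likelihood of the labels under the predicted probabilities."""
    return -sum(
        math.log(s) if y == 1 else math.log(1.0 - s)
        for y, s in zip(labels, scores)
    ) / len(labels)

labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
# Nudge every score slightly; the ranking of the examples is unchanged.
nudged = [s + 0.01 for s in scores]

auc_before = roc_auc(labels, scores)
auc_after = roc_auc(labels, nudged)
bce_before = binary_cross_entropy(labels, scores)
bce_after = binary_cross_entropy(labels, nudged)
```

Because the AUC is flat almost everywhere as a function of the scores, its gradient is zero wherever it exists, which is why a smooth loss such as cross-entropy is optimised instead and AUC is only reported as an evaluation metric.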
Hope that helps
Hi,
It helps a lot, thank you. So I need 2 outputs because I am using softmax? I tried setting output_nonlinearity to None and the same error is raised… Can you modify my example so that it works with only one output unit?
Thank you!! Iván
I don't think it is possible to make it run with just one output unit, at least not with theano's included cross-entropy function. But you could define your own cost function that accepts 1D input, though I don't see why you absolutely need 1D.
Well, I don't absolutely need 1D, but I think it is the most correct choice for a binary classification problem. Correct me if I'm wrong, but a single output would be better: it would give the probability of a 1, and the probability of a 0 could easily be calculated as 1 minus that. Am I wrong?
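The redundancy described here can be checked numerically. The sketch below is plain Python, not nolearn or Lasagne code, and the helper names and the raw score `z` are hypothetical. It assumes the single unit uses a sigmoid and compares it with a two-unit softmax whose class-0 score is fixed at 0: both give the same probabilities and the same loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def binary_cross_entropy(p1, y):
    """Loss for a single output p1 = P(y=1)."""
    return -(y * math.log(p1) + (1 - y) * math.log(1.0 - p1))

def categorical_cross_entropy(probs, y):
    """Loss for one probability per class; y is the integer class index."""
    return -math.log(probs[y])

z = 0.8  # hypothetical raw score of the single output unit
p1 = sigmoid(z)
p_two = softmax([0.0, z])  # two-unit equivalent: class 0 score fixed at 0
```

So a one-unit sigmoid output with binary cross-entropy and a two-unit softmax output with categorical cross-entropy describe the same model family; the two-column form just carries the complementary probability explicitly.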
You are right that the second column is redundant, but I believe sklearn also returns two columns for binary classification tasks, so nolearn is in line there.
@BenjaminBossan just to clarify. If we are trying to do binary classification for one class, which class does the first probability in a row of softmax predictions represent? p(x=0) or p(x=1)?
Never mind, I believe I have figured it out. The first prediction in the output is p(x=0), with the second one being p(x=1).
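For anyone who wants the one-dimensional view of the predictions, a short sketch under the column order described above (column 0 for class 0, column 1 for class 1). The `proba` values are invented for illustration and stand in for whatever the network's probability output returns.

```python
# Hypothetical softmax output for three samples: one row per sample,
# column 0 = P(y=0), column 1 = P(y=1).
proba = [
    [0.90, 0.10],
    [0.35, 0.65],
    [0.50, 0.50],
]

# Keep only the positive-class probability (the "1D" view of the output).
p_positive = [row[1] for row in proba]

# Predicted class = index of the largest probability in each row
# (ties resolve to the lower index, here class 0).
predicted = [max(range(len(row)), key=row.__getitem__) for row in proba]
```

In scikit-learn the column order follows the classifier's `classes_` attribute, so with labels 0 and 1 the second column is the probability of class 1 there as well.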
Hello
My name is Iván, and I have been stuck for several days on the problem I'm going to describe. I'm following Daniel Nouri's deep learning tutorial: http://danielnouri.org/notes/category/deep-learning/ and I tried to adapt his example to a classification dataset. My problem is that if I treat the dataset as a regression problem it works properly, but if I try to perform classification it fails. I tried to write 2 reproducible examples.
1) Regression (it works well)
2) Classification (it raises a matrix dimension error; I paste it below)
The error output I get with code 2.
What is going on here? Am I doing something wrong? I think I have tried everything, but I am not able to figure out what is happening.
Note that I updated my dependencies today using the command:
pip install -r https://raw.githubusercontent.com/dnouri/kfkd-tutorial/master/requirements.txt
Thanks in advance
Edit
I managed to make it work with the following changes, but I still have some doubts:
y = Y.astype(np.int32)
and changing
output_num_units=1
to
output_num_units=2
I'm not really sure I understand this, because I'm working with a binary classification problem and I think this multilayer perceptron should have only 1 output neuron, not 2... Am I wrong?
I also tried to change the cost function to ROC AUC. I know there's a parameter called
objective_loss_function
which is defined as
objective_loss_function=lasagne.objectives.categorical_crossentropy
by default, but how can I use ROC AUC as the cost function instead of categorical cross-entropy?
Thanks
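A possible reading of why both changes in the edit were needed, sketched in plain Python rather than with Theano or Lasagne. The function below is a simplified stand-in for a categorical cross-entropy that takes integer class labels, written under that assumption only; the data are invented. The label is used to pick a column of the output, so it has to be an integer and every label needs a matching output unit.

```python
import math

def categorical_cross_entropy(prob_rows, labels):
    """Mean of -log(probability assigned to the true class)."""
    losses = []
    for row, label in zip(prob_rows, labels):
        if not isinstance(label, int):
            raise TypeError("class labels must be integers to index a column")
        if label >= len(row):
            raise IndexError(
                "label %d needs at least %d output units" % (label, label + 1)
            )
        losses.append(-math.log(row[label]))
    return sum(losses) / len(losses)

labels = [0, 1, 1]

# Two output units: every label has a matching column.
two_unit_output = [[0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
loss = categorical_cross_entropy(two_unit_output, labels)

# One output unit: label 1 has no column to point at.
one_unit_output = [[1.0], [1.0], [1.0]]
try:
    categorical_cross_entropy(one_unit_output, labels)
    one_unit_failed = False
except IndexError:
    one_unit_failed = True
```

Under this reading, `y = Y.astype(np.int32)` supplies the integer class indices and `output_num_units=2` supplies one column per class; the exact error message raised by the real library may differ from the one in this sketch.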