kundajelab / deeplift

Public facing deeplift repo
MIT License

Data feeding into the CNN model should be normalised or not? #111

Open Yanjiayork opened 4 years ago

Yanjiayork commented 4 years ago

Hello there,

I have checked the notebook examples, such as the MNIST example. The test data set you feed into the CNN model is not normalised (it is still between 0 and 255). Does that mean the CNN model you loaded was trained on non-normalised data, or does the library apply an implicit normalisation function? I have a model trained on normalised data, so should I feed the model the original test data or the normalised data?

Thanks

Yan

AvantiShri commented 4 years ago

Hi @Yanjiayork, you are correct that the model that I loaded was not trained on normalized data. If you have a model trained on normalized data, you should definitely normalize the data when using your model.
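In other words, the preprocessing at scoring time should match the preprocessing at training time. A minimal sketch (the scaling factor 255 is an assumption for typical image pixel data, not something fixed by deeplift):

```python
import numpy as np

# Hypothetical example: if the model was trained on pixel values scaled
# to [0, 1], apply the identical scaling to the data you score.
x_test = np.random.randint(0, 256, size=(2, 28, 28, 1)).astype("float32")  # raw 0-255 pixels
x_test_normalised = x_test / 255.0  # same preprocessing as at training time

# The scored inputs now lie in the range the model expects.
assert x_test_normalised.min() >= 0.0
assert x_test_normalised.max() <= 1.0
```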

-Avanti

Yanjiayork commented 4 years ago

Hi Avanti,

Thank you very much. That is very helpful. I have another question, about calculating the contribution scores. I understand task_idx represents the index of the node in the output layer for which we wish to compute scores. In my case, I only have one node, i.e. a sigmoid. Does this mean the calculated contribution score is against whatever the output is? In other words, the contribution score is not against class 0 or 1 specifically? If so, is it possible to know the contribution score against class 0 and against class 1 respectively, without changing the output layer to 2 nodes?

Many thanks,

Yan

AvantiShri commented 4 years ago

Hi Yan,

A positive contribution to the logit of the sigmoid can be interpreted as a positive contribution to class 1 (which is equivalent to a negative contribution to class 0). Similarly, a negative contribution to the logit of the sigmoid can be interpreted as a positive contribution to class 0 (which is equivalent to a negative contribution to class 1). To see this mathematically, note the equivalence between a sigmoid and a two-class softmax:
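The equivalence can be checked numerically (a sketch I am adding here, not code from the original thread): sigmoid(x) = e^x / (e^x + e^0), which is exactly the class-1 probability of a softmax over the two logits [0, x]. So a single sigmoid node already encodes both classes, with opposite signs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(logits):
    e = np.exp(logits - np.max(logits))  # shift for numerical stability
    return e / e.sum()

x = 1.7                         # an arbitrary sigmoid logit
logits = np.array([0.0, x])     # [class 0 logit, class 1 logit]
p = softmax(logits)

# sigmoid(x) is the two-class softmax probability of class 1,
# and 1 - sigmoid(x) is the probability of class 0.
assert np.isclose(p[1], sigmoid(x))
assert np.isclose(p[0], 1.0 - sigmoid(x))
```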

Does that make sense?