kundajelab / deeplift

Public facing deeplift repo
MIT License

Data feeding into the CNN model should be normalised or not? #111

Open Yanjiayork opened 4 years ago

Yanjiayork commented 4 years ago

Hello there,

I have checked the notebook examples, such as the MNIST example. The test data set you feed into the CNN model is not normalised (it is still between 0 and 255). Does that mean the CNN model you loaded was trained on non-normalised data, or does the library apply an implicit normalisation function? I have a model trained on normalised data, so should I feed the model the original test data or the normalised data?

Thanks

Yan

AvantiShri commented 4 years ago

Hi @Yanjiayork, you are correct that the model that I loaded was not trained on normalized data. If you have a model trained on normalized data, you should definitely normalize the data when using your model.
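In other words, the preprocessing at scoring time should match the preprocessing at training time. A minimal sketch (the scaling factor 255 is an assumption for typical image pixel data, not something fixed by deeplift):

```python
import numpy as np

# Hypothetical example: if the model was trained on pixel values scaled
# to [0, 1], apply the identical scaling to the data you score.
x_test = np.random.randint(0, 256, size=(2, 28, 28, 1)).astype("float32")  # raw 0-255 pixels
x_test_normalised = x_test / 255.0  # same preprocessing as at training time

# The scored inputs now lie in the range the model expects.
assert x_test_normalised.min() >= 0.0
assert x_test_normalised.max() <= 1.0
```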

-Avanti

Yanjiayork commented 4 years ago

Hi Avanti,

Thank you very much. That is very helpful. I have another question, about calculating the contribution scores. I understand task_idx represents the index of the node in the output layer for which we wish to compute scores. In my case, I only have one node, i.e. a sigmoid. Does this mean the calculated contribution score is against whatever the output is? In other words, the contribution score is not against class 0 or 1 specifically? If so, is it possible to know the contribution score against class 0 and against class 1 respectively, without changing the output layer to 2 nodes?

Many thanks,

Yan

AvantiShri commented 4 years ago

Hi Yan,

A positive contribution to the logit of the sigmoid can be interpreted as a positive contribution to class 1 (which is equivalent to a negative contribution to class 0). Similarly, a negative contribution to the logit of the sigmoid can be interpreted as a positive contribution to class 0 (which is equivalent to a negative contribution to class 1). To see this mathematically, note the equivalence between a sigmoid and a two-class softmax:
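The equivalence can be checked numerically (a sketch I am adding here, not code from the original thread): sigmoid(x) = e^x / (e^x + e^0), which is exactly the class-1 probability of a softmax over the two logits [0, x]. So a single sigmoid node already encodes both classes, with opposite signs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(logits):
    e = np.exp(logits - np.max(logits))  # shift for numerical stability
    return e / e.sum()

x = 1.7                         # an arbitrary sigmoid logit
logits = np.array([0.0, x])     # [class 0 logit, class 1 logit]
p = softmax(logits)

# sigmoid(x) is the two-class softmax probability of class 1,
# and 1 - sigmoid(x) is the probability of class 0.
assert np.isclose(p[1], sigmoid(x))
assert np.isclose(p[0], 1.0 - sigmoid(x))
```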

Does that make sense?