js05212 / PyTorch-for-NPN

Officially unofficial PyTorch code for the NIPS paper 'Natural-Parameter Networks: A Class of Probabilistic Neural Networks'

MNIST classification #1

Open janisgp opened 5 years ago

janisgp commented 5 years ago

I am curious how the classification setting works. You mention in your paper that you use the cross-entropy loss.

Do you use a softmax as the final layer? How do you propagate the variance through the softmax?

js05212 commented 5 years ago

Hi,

Thanks for your interest, and good question! For MNIST classification, we use an elementwise sigmoid followed by cross entropy. The output mean of the sigmoid takes both the mean and the variance from the previous layer (the pre-activation linear layer) as input. This is how both the mean and the variance can affect the final prediction.
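For reference, here is a minimal PyTorch sketch of this kind of moment propagation through an elementwise sigmoid. The `npn_sigmoid` helper is illustrative, not code from this repo; the constants follow the Gaussian/probit-style approximations reported in the NPN paper (assuming Gaussian pre-activations with mean `o_m` and variance `o_s`):

```python
import math
import torch

# Approximation constants (as given in the NPN paper for the sigmoid case):
#   zeta^2 = pi / 8,  alpha = 4 - 2*sqrt(2),  beta = -log(sqrt(2) + 1)
ZETA_SQ = math.pi / 8
ALPHA = 4 - 2 * math.sqrt(2)
BETA = -math.log(math.sqrt(2) + 1)

def npn_sigmoid(o_m: torch.Tensor, o_s: torch.Tensor):
    """Approximate mean a_m and variance a_s of sigmoid(x), elementwise,
    for x ~ N(o_m, o_s)."""
    # Mean: E[sigmoid(x)] ~= sigmoid(o_m / sqrt(1 + zeta^2 * o_s)),
    # so a larger input variance pulls the output mean toward 0.5.
    a_m = torch.sigmoid(o_m / torch.sqrt(1 + ZETA_SQ * o_s))
    # Approximate second moment, then subtract a_m^2 to get the variance.
    second = torch.sigmoid(
        ALPHA * (o_m + BETA) / torch.sqrt(1 + ZETA_SQ * ALPHA**2 * o_s)
    )
    a_s = torch.clamp(second - a_m**2, min=0.0)  # guard against tiny negatives
    return a_m, a_s

# Example: same pre-activation mean, different variances.
o_m = torch.tensor([2.0, 2.0])
o_s = torch.tensor([0.1, 10.0])
a_m, a_s = npn_sigmoid(o_m, o_s)
# a_m[1] < a_m[0]: the noisier unit's output mean is closer to 0.5,
# which is how upstream uncertainty reaches the final prediction.
```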

There have also been follow-ups to NPN (e.g., work from ICLR 2018, if I remember correctly) trying to extend it with a softmax layer.

Hao