piEsposito / blitz-bayesian-deep-learning

A simple and extensible library to create Bayesian Neural Network layers on PyTorch.
GNU General Public License v3.0

ELBO loss converging but accuracy doesn't improve #77

Open pbischoff opened 3 years ago

pbischoff commented 3 years ago

I'm trying to train a Bayesian LSTM to predict remaining useful lifetime using windows of ten samples with roughly 600 features. I previously trained a conventional LSTM in TensorFlow and therefore rebuilt the architecture in PyTorch to be able to use blitz.

The problem is that when I train using the ELBO loss, the loss converges quickly (though not to zero, which I suppose isn't a problem?) but the accuracy doesn't improve at all. I also tried training with plain cross-entropy loss, which works perfectly, but I'm not sure whether the 'Bayesianity' is still valid then.
Another attempt was to freeze the model for the first part of training and then unfreeze it and continue. In that case the model converges (using the ELBO loss) and the accuracy improves, but once I unfreeze and continue training, the accuracy drops again.

Any experience with this? Is there something I could change in the code? Is it even valid to check accuracy with Bayesian networks? I think it should be, because the network should still predict correctly most of the time. (I'm completely new to Bayesian nets, though, and might not have understood everything fully...)
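
For context, a minimal sketch of the kind of setup described above, using blitz's `@variational_estimator` decorator and its `sample_elbo` method. The architecture, class count, and `train_loader` here are illustrative assumptions, not the actual code from this issue:

    import torch
    import torch.nn as nn
    from blitz.modules import BayesianLSTM, BayesianLinear
    from blitz.utils import variational_estimator

    @variational_estimator
    class BayesianRULClassifier(nn.Module):
        # Hypothetical architecture: windows of 10 timesteps x ~600 features.
        def __init__(self, n_features=600, n_hidden=64, n_classes=10):
            super().__init__()
            self.lstm = BayesianLSTM(n_features, n_hidden)
            self.head = BayesianLinear(n_hidden, n_classes)

        def forward(self, x):
            out, _ = self.lstm(x)          # out: (batch, seq_len, n_hidden)
            return self.head(out[:, -1])   # classify from the last timestep

    model = BayesianRULClassifier()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

    for x, y in train_loader:  # train_loader assumed to yield (batch, 10, 600) windows
        optimizer.zero_grad()
        # sample_elbo = data-fit cost (criterion) + KL complexity cost,
        # averaged over sample_nbr stochastic forward passes.
        loss = model.sample_elbo(inputs=x, labels=y,
                                 criterion=criterion,
                                 sample_nbr=3,
                                 complexity_cost_weight=1 / len(train_loader))
        loss.backward()
        optimizer.step()

Because the forward pass samples new weights on every call, accuracy is usually checked by averaging several stochastic predictions before the argmax, e.g. `torch.stack([model(x).softmax(-1) for _ in range(10)]).mean(0).argmax(-1)`; so checking accuracy is valid, it just needs Monte Carlo averaging.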

gt2001 commented 3 years ago

> I'm trying to train a Bayesian LSTM to predict remaining useful lifetime using windows of ten samples with roughly 600 features. I previously trained a conventional LSTM in TensorFlow and therefore rebuilt the architecture in PyTorch to be able to use blitz.
>
> The problem is that when I train using the ELBO loss, the loss converges quickly (though not to zero, which I suppose isn't a problem?) but the accuracy doesn't improve at all. I also tried training with plain cross-entropy loss, which works perfectly, but I'm not sure whether the 'Bayesianity' is still valid then. Another attempt was to freeze the model for the first part of training and then unfreeze it and continue. In that case the model converges (using the ELBO loss) and the accuracy improves, but once I unfreeze and continue training, the accuracy drops again.
>
> Any experience with this? Is there something I could change in the code? Is it even valid to check accuracy with Bayesian networks? I think it should be, because the network should still predict correctly most of the time. (I'm completely new to Bayesian nets, though, and might not have understood everything fully...)

I'm suffering from a similar problem, but my situation is the exact reverse of yours: the ELBO loss doesn't drop, while the validation loss drops to around 1e-5 and gives a good prediction. The loss before training is around 7; after many epochs it's still around 6.8... I'm using PyTorch; if you think it would help, I can post my code here. This is so strange. By the way, I changed his code a little: the inputs are the 'Open', 'High', and 'Low' of the stock, and the output is still 'Close'.
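
One rough diagnostic for a stuck ELBO (a sketch assuming the `model`, `criterion`, and `train_loader` from your own setup): the `@variational_estimator` decorator also exposes `nn_kl_divergence()`, so the two halves of the ELBO can be printed separately to see whether the KL complexity cost is dominating the data-fit term:

    # Split the ELBO into its two parts to see which one is stuck.
    # model, criterion, and train_loader are assumed from your own setup.
    x, y = next(iter(train_loader))
    kl_weight = 1 / len(train_loader)  # common scaling of the complexity cost

    fit_term = criterion(model(x), y)                # data-fit (likelihood) cost
    kl_term = model.nn_kl_divergence() * kl_weight   # KL complexity cost
    print(f"fit: {fit_term.item():.4f}  kl: {kl_term.item():.4f}")

If the KL term is orders of magnitude larger, lowering `complexity_cost_weight` in `sample_elbo` usually lets the data-fit term move.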

CatNofishing commented 3 years ago

Me too, please see my demo: https://github.com/piEsposito/blitz-bayesian-deep-learning/issues/79

pbischoff commented 3 years ago

@CatNofishing @gt2001 Could you confirm that your model is acting as expected when you freeze it?

CatNofishing commented 3 years ago

@frfritz Sorry, I don't understand 'acting as expected when you freeze it'. You can try my demo and download my code and data.

yongen9696 commented 2 years ago

> @frfritz Sorry, I don't understand 'acting as expected when you freeze it'. You can try my demo and download my code and data.

I believe what @pbischoff meant by freezing is the function provided in variational_estimator.py at line 73:

def freeze_model(self):
    """
    Freezes the model by making it predict using only the expected value to their BayesianModules' weights distributions
    """
    for module in self.modules():
        if isinstance(module, (BayesianModule)):
            module.freeze = True
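
So a frozen model predicts deterministically from the posterior means (the decorator also provides `unfreeze_model()` to switch sampling back on). A minimal sketch of the check @pbischoff asked for, with `val_loader` assumed from your own setup:

    import torch

    # Hypothetical check: freeze -> deterministic forward pass -> accuracy.
    model.freeze_model()  # every BayesianModule now uses its weight means
    correct = total = 0
    with torch.no_grad():
        for x, y in val_loader:  # val_loader assumed from your own setup
            correct += (model(x).argmax(-1) == y).sum().item()
            total += y.numel()
    print(f"frozen accuracy: {correct / total:.3f}")
    model.unfreeze_model()  # re-enable weight sampling before resuming training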