piEsposito / blitz-bayesian-deep-learning

A simple and extensible library to create Bayesian Neural Network layers on PyTorch.
GNU General Public License v3.0

A question in the stock_blstm example #17

Closed: XingHYu closed this issue 3 years ago

XingHYu commented 4 years ago

Hi, I want to know: what does the parameter "future_length" in stock_blstm mean? Thank you!

piEsposito commented 4 years ago

That parameter, in the auxiliary function, sets how many steps into the future the B-LSTM predicts before consulting the real data again. So if we set it to, let's say, 5, the B-LSTM will predict five steps into the future and then consult the real data.
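For illustration, a minimal sketch of that rolling loop, not the exact code from the example; `rolling_predict`, `model`, and `window` are hypothetical names, and the model is assumed to map a (1, window, 1) sequence to a (1, 1) one-step-ahead prediction:

```python
import torch

@torch.no_grad()
def rolling_predict(model, X_test, window=21, future_length=5):
    # Predict `future_length` steps autoregressively, feeding each
    # prediction back in, then re-anchor on the real observations.
    preds = []
    idx = 0
    while idx + window + future_length <= len(X_test):
        seq = X_test[idx:idx + window].clone()   # last `window` real values, shape (window, 1)
        for _ in range(future_length):
            next_val = model(seq.unsqueeze(0))   # one-step prediction, shape (1, 1)
            preds.append(next_val.squeeze())
            seq = torch.cat([seq[1:], next_val]) # drop oldest value, append prediction
        idx += future_length                     # consult the real data again
    return torch.stack(preds)
```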

XingHYu commented 4 years ago

When training the model, for example, we use 21 historical values and predict the next step, that is, the 22nd value. When testing the model, if we set the parameter "future_length" to 5, we use the 21 historical values and roll forward to predict 5 steps, that is, the 22nd, 23rd, 24th, 25th, and 26th values. Can I understand it this way?

piEsposito commented 4 years ago

Yes, that's correct.

XingHYu commented 4 years ago

Yeah, I see. Another question I want to discuss is why the Bayesian LSTM shows such obvious hysteresis (lag) when predicting time-series data, even though its predictions show better trend accuracy. Compared with a deterministic-weight LSTM, the hysteresis seems more pronounced. I tried tuning the training parameters of the Bayesian LSTM but still could not solve this problem, so I would like to ask if there is a solution. Also, when I use your library to make a multi-step-ahead prediction in one pass, I don't know how to define the variance of the multi-step predicted values.

piEsposito commented 4 years ago

I can't give you a definitive answer, but here are some hypotheses: (i) it could be happening due to weight initialization; this is pretty heuristic, but trying new scales and locs for the normal, or using Xavier initialization, may give you different results; (ii) it may be caused by the data you feed in (if the first timestamps of your input are padded with zeroes, the model may need some lag to build up its state before the predictions settle).
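If you want to experiment with hypothesis (i), here is a sketch of what retuning the initialization could look like. It assumes your installed BLiTZ version exposes `posterior_mu_init` and `posterior_rho_init` on `BayesianLSTM` (check the constructor signature in your version), and the values shown are illustrative, not recommendations:

```python
import torch.nn as nn
from blitz.modules import BayesianLSTM
from blitz.utils import variational_estimator

@variational_estimator
class TunedBLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        # Illustrative values: keep the posterior locations near zero and
        # shrink the initial posterior scale (rho is softplus-transformed)
        # to see whether the lag changes with initialization.
        self.lstm = BayesianLSTM(1, 10,
                                 posterior_mu_init=0.0,
                                 posterior_rho_init=-6.0)
        self.linear = nn.Linear(10, 1)

    def forward(self, x):
        x_, _ = self.lstm(x)
        return self.linear(x_[:, -1, :])
```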

To gather the variance you can use the main MFVI technique: perform several forward passes and take the mean and variance of the predictions you want, the same way you would with non-recurrent BNNs, since we sample the weights once per call.
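Concretely, a minimal sketch of that Monte Carlo estimate (`predict_with_uncertainty` and the sample count are illustrative names and values); each forward pass through the Bayesian model draws a fresh set of weights, so the statistics across passes approximate the predictive mean and variance:

```python
import torch

@torch.no_grad()
def predict_with_uncertainty(model, X, n_samples=100):
    # Each call samples new weights, so stacking the passes gives a
    # Monte Carlo approximation of the predictive distribution.
    samples = torch.stack([model(X) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.var(dim=0)
```

For the rolling multi-step case, you can repeat the whole rolling loop once per sample and take the mean and variance across the rollouts at each step.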

We could, of course, try sampling new weights for each step of the time-series feedforward operation, but that would likely not be computationally viable.

piEsposito commented 3 years ago

Closing due to inactivity.