IntelLabs / bayesian-torch

A library for Bayesian neural network layers and uncertainty estimation in Deep Learning extending the core of PyTorch
BSD 3-Clause "New" or "Revised" License

Regression tasks: Epistemic and Aleatoric Uncertainty Estimation #22

Open · feracero opened this issue 1 year ago

feracero commented 1 year ago

Hello, I am trying to use this repository for regression tasks (I can see the examples seem to focus on classification tasks).

I would like to estimate epistemic and aleatoric uncertainty for my Bayesian neural network, as described in section 3.1 here: https://arxiv.org/pdf/2204.09308.pdf

Could you please provide some guidance on how to obtain the mean and variance used by the final layer to generate the output samples? That way one could estimate epistemic uncertainty for regression tasks.
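
To make the question concrete, here is a rough sketch of the decomposition I am after (the two-headed `model` returning a mean and a log-variance is my own assumption, following section 3.1 of the paper):

```python
import torch

@torch.no_grad()
def predict_with_uncertainty(model, x, num_mc=20):
    """Monte Carlo sketch: decompose predictive uncertainty.

    Assumes each forward pass samples fresh weights and that model(x)
    returns a predictive mean and a log-variance head.
    """
    means, variances = [], []
    for _ in range(num_mc):
        mu, log_var = model(x)        # new weight sample on every call
        means.append(mu)
        variances.append(log_var.exp())
    means = torch.stack(means)        # (num_mc, batch, out_dim)
    variances = torch.stack(variances)

    pred_mean = means.mean(dim=0)
    epistemic = means.var(dim=0)      # spread of the MC means (model uncertainty)
    aleatoric = variances.mean(dim=0) # average predicted noise (data uncertainty)
    return pred_mean, epistemic, aleatoric
```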

Thank you!

famura commented 4 months ago

Hi @feracero, I am currently also thinking about using this repo for regression tasks. Did you have success?

@ranganathkrishnan (and other contributors) would it be possible to add an example? One thing that I am not sure of is how to modify the loss computation, which is exemplified for classification in the Training snippet section.

giulioturrisi commented 2 months ago

@famura, did you manage to perform a regression with this repo? Or did you end up using another one? (If yes, please let me know which :D)

famura commented 2 months ago

@giulioturrisi I have been putting it off due to other projects. It is still on my plate to try it within the next 1-2 months though. What is your experience?

giulioturrisi commented 2 months ago

@famura I have actually just started looking around for libraries. If I find something nice, I will ping you.

ranganathkrishnan commented 2 months ago

> Hi @feracero, I am currently also thinking about using this repo for regression tasks. Did you have success?
>
> @ranganathkrishnan (and other contributors) would it be possible to add an example? One thing that I am not sure of is how to modify the loss computation, which is exemplified for classification in the Training snippet section.

Hi @famura, it should be straightforward to use a model with LinearReparameterization layers together with torch.nn.MSELoss() for regression tasks. I will add a regression example to the repo.
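
Roughly along these lines (a minimal sketch until the example is added to the repo; the architecture, sizes, and dummy data below are placeholders, not a recommended setup):

```python
import torch
import torch.nn as nn
from bayesian_torch.layers import LinearReparameterization

# Placeholder architecture: two Bayesian linear layers for a 1-D target.
class BayesianRegressor(nn.Module):
    def __init__(self, in_dim, hidden_dim=64, out_dim=1):
        super().__init__()
        self.fc1 = LinearReparameterization(in_dim, hidden_dim)
        self.fc2 = LinearReparameterization(hidden_dim, out_dim)

    def forward(self, x):
        x, kl1 = self.fc1(x)  # reparameterization layers return (output, kl)
        x = torch.relu(x)
        x, kl2 = self.fc2(x)
        return x, kl1 + kl2

# Dummy data just to make the snippet runnable.
x_train = torch.randn(256, 8)
y_train = torch.sin(x_train.sum(dim=1, keepdim=True))
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(x_train, y_train), batch_size=32)

model = BayesianRegressor(in_dim=8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

for x, y in train_loader:
    optimizer.zero_grad()
    pred, kl = model(x)
    # KL scaled by the batch size, as in the README's classification snippet.
    loss = criterion(pred, y) + kl / x.shape[0]
    loss.backward()
    optimizer.step()
```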

famura commented 1 month ago

Nice, thank you @ranganathkrishnan.

Is there a specific reason why only LSTMs and not GRUs or RNNs are supported here? Or in other words, why did you have to re-code the LSTM forward pass here instead of using the one from PyTorch?

Update: I think my questions can be answered with "Because we need the KL from the layers that make up the LSTM"

ranganathkrishnan commented 1 month ago

> Nice, thank you @ranganathkrishnan.
>
> Is there a specific reason why only LSTMs and not GRUs or RNNs are supported here? Or in other words, why did you have to re-code the LSTM forward pass here instead of using the one from PyTorch?
>
> Update: I think my questions can be answered with "Because we need the KL from the layers that make up the LSTM"

Hi @famura, no specific reason; we included a reference implementation of a Bayesian LSTM for time-series prediction tasks. Contributions are welcome through PRs: if you end up implementing Bayesian GRU and RNN layers, please send a pull request. Thanks!

staco-tx-mli commented 3 weeks ago

> Hi @feracero, I am currently also thinking about using this repo for regression tasks. Did you have success? @ranganathkrishnan (and other contributors) would it be possible to add an example? One thing that I am not sure of is how to modify the loss computation, which is exemplified for classification in the Training snippet section.
>
> Hi @famura, it should be straightforward to use a model with LinearReparameterization layers together with torch.nn.MSELoss() for regression tasks. I will add a regression example to the repo.

Hello, I am currently also looking into using this library for a regression task. How does one weight the KL divergence relative to the MSE loss? In my case, I have multiple outputs, all standardized (μ = 0, σ = 1) based on the training data.
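
For context, this is what I have been trying so far, following the README's classification snippet (`model`, `x`, `y`, and `num_train_samples` come from my own setup, so please treat it as a sketch rather than something endorsed by the repo):

```python
import torch.nn.functional as F

pred, kl = model(x)          # model built from bayesian_torch layers
mse = F.mse_loss(pred, y)

# Option (a): scale the KL by the batch size, as in the README snippet.
loss = mse + kl / x.shape[0]

# Option (b): scale the KL by the total number of training samples,
# which matches the usual minibatch ELBO derivation.
# loss = mse + kl / num_train_samples
```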