IntelLabs / bayesian-torch

A library for Bayesian neural network layers and uncertainty estimation in Deep Learning extending the core of PyTorch
BSD 3-Clause "New" or "Revised" License

number of MC samples for training and inference #31

Closed by dzy666fly 9 months ago

dzy666fly commented 11 months ago

Hi! Thank you so much for sharing such an exciting library for Bayesian Deep Learning. I am wondering how to choose the proper number of MC samples: should I use 1, 2, or even more during training, and why? Limited GPU computing resources push me to think about this carefully. For example, I notice that main_bayesian_mnist.py uses 1 sample for training and 20 for inference by default. Is this choice empirical? Looking forward to your reply.

ranganathkrishnan commented 10 months ago

@dzy666fly Thank you for using the library. During model training, 1 MC sample is usually sufficient: the ELBO (evidence lower bound) can be computed effectively because the KL divergence between Gaussian distributions is estimated in closed form directly from the variational parameters, without MC samples (https://github.com/IntelLabs/bayesian-torch/blob/main/bayesian_torch/layers/base_variational_layer.py#L53). For inference, the number of MC samples depends on the model, dataset, or downstream task. The best way to choose it empirically is to increase the number of samples until the model performance metric (e.g., accuracy or ECE) saturates on a validation dataset. I hope this helps!
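
For concreteness, here is a minimal sketch of both points, assuming the `dnn_to_bnn` / `get_kl_loss` workflow from the repository README. The per-layer KL term referenced above is the standard closed-form Gaussian KL, KL(q‖p) = log(σp/σq) + (σq² + (μq − μp)²)/(2σp²) − 1/2, so the training loop needs only one stochastic forward pass. The toy MLP, dummy data, `num_mc` grid, and saturation threshold below are illustrative, not defaults from the library.

```python
import torch
import torch.nn as nn
from bayesian_torch.models.dnn_to_bnn import dnn_to_bnn, get_kl_loss

# Toy deterministic model and dummy data (illustrative only).
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
x, y = torch.randn(256, 20), torch.randint(0, 2, (256,))

# Convert to a BNN; each layer keeps variational parameters (mu, rho).
dnn_to_bnn(model, {
    "prior_mu": 0.0, "prior_sigma": 1.0,
    "posterior_mu_init": 0.0, "posterior_rho_init": -3.0,
    "type": "Reparameterization", "moped_enable": False, "moped_delta": 0.5,
})

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Training: a single stochastic forward pass (1 MC sample) per step.
# The KL term is computed in closed form from the variational parameters
# by get_kl_loss, so no extra MC samples are needed to estimate it.
for _ in range(10):
    optimizer.zero_grad()
    output = model(x)                      # one weight sample per forward pass
    loss = criterion(output, y) + get_kl_loss(model) / x.size(0)
    loss.backward()
    optimizer.step()

# Inference: average predictions over several MC forward passes and
# increase num_mc until the validation metric stops improving.
model.eval()
prev_acc = 0.0
with torch.no_grad():
    for num_mc in (1, 2, 5, 10, 20, 50):
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(num_mc)]
        ).mean(dim=0)                      # predictive mean over weight samples
        acc = (probs.argmax(dim=-1) == y).float().mean().item()
        print(f"num_mc={num_mc:3d}  val_acc={acc:.4f}")
        if abs(acc - prev_acc) < 1e-3:     # metric has saturated (threshold is arbitrary)
            break
        prev_acc = acc
```

In practice you would run the saturation loop on a held-out validation set (and could track ECE instead of accuracy); the smallest `num_mc` past which the metric stops changing is a reasonable choice for deployment.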

dzy666fly commented 9 months ago

Thank you so much for replying! I'm sorry I forgot to check my e-mail these days. As you suggested, I tried 1, 2, and 3 MC samples for my task, and the success rate is almost the same, with perhaps a small difference in sample efficiency during the training phase.