tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

Which objective functions does TensorFlow Probability provide? #740

Open nbro opened 4 years ago

nbro commented 4 years ago

To train a probabilistic neural network (PNN), the ELBO loss is usually used. The ELBO loss is composed of the KL loss between the prior and the variational posterior distributions of each layer, and the likelihood loss, which currently needs to be implemented by the programmer. It would be handy to have some default implementations of this likelihood loss (similar to TF's and Keras' implementations of the cross-entropy, MSE, and other losses).

Am I missing something? Why don't you provide a default implementation of this loss?

Note that I've already implemented this likelihood loss (the part of the ELBO besides the KL terms) as a custom loss, but I think it would be nice if TFP provided a default implementation. You can technically train a PNN with the plain cross-entropy loss, but then the PNN does not learn.
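For reference, this is roughly the kind of thing I had to write myself (a minimal sketch, assuming a classifier built from `tfp.layers.DenseFlipout` layers; the helper names `neg_log_likelihood`, `elbo_loss` and the scaling by `num_train_examples` are my own choices, not TFP APIs):

```python
import tensorflow as tf
import tensorflow_probability as tfp

# Each DenseFlipout layer adds its KL(variational posterior || prior) term
# to model.losses when the model is called.
model = tf.keras.Sequential([
    tfp.layers.DenseFlipout(64, activation="relu"),
    tfp.layers.DenseFlipout(10),
])

def neg_log_likelihood(labels, logits):
    # The likelihood part of the ELBO: this is the piece that currently has
    # to be hand-written for every model.
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=logits))

def elbo_loss(labels, logits, num_train_examples):
    # Full ELBO = NLL + KL, with the KL terms scaled to a per-example value.
    kl = tf.reduce_sum(model.losses) / num_train_examples
    return neg_log_likelihood(labels, logits) + kl
```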

brianwa84 commented 4 years ago

Typically the likelihood portion will be computed from the output of the network against the labels, i.e.:

```python
m = tf.keras.Sequential([..., tfp.layers.IndependentBernoulli(...)])
d = m(features)
nll = -d.log_prob(labels)
```

Is that what you mean?

nbro commented 4 years ago

@brianwa84 Yes, exactly, that's what I mean. Your nll could have a (simple) default implementation that works with different distributions and that can be passed to Keras' compile or added to the metrics list of the same compile call. As I said, such a custom loss can easily be implemented (i.e. something similar to what you did) and passed to compile, but this loss may be so common that it is worth adding a default implementation to TFP.
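Concretely, something like this (a rough sketch using the model `m` from your snippet; the name `negative_log_likelihood` is just a placeholder, assuming the model ends in a TFP distribution layer such as `tfp.layers.IndependentBernoulli`, so that the prediction passed to the loss is a distribution):

```python
def negative_log_likelihood(y_true, y_pred_dist):
    # y_pred_dist is the tfd.Distribution produced by the final TFP layer.
    return -y_pred_dist.log_prob(y_true)

m.compile(
    optimizer="adam",
    loss=negative_log_likelihood,        # likelihood part of the ELBO
    metrics=[negative_log_likelihood],   # or track it as a metric as well
)
```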

brianwa84 commented 4 years ago

I guess I'd be concerned that doing so would mask what's already a quite simple one-liner behind a facade, making it more imposing to a reader. What do you think the user code might look like in an ideal world?

nbro commented 4 years ago

@brianwa84 The Keras or even the TF implementations of certain loss functions may also be simple (honestly, I haven't looked at their implementations yet). For example, people could forget the minus sign in front of the log-probability. With a default implementation provided, such mistakes would not occur. Of course, Bayesian deep learning is still a relatively new field, so things may change in the meantime.

In an ideal world, I would only need to specify the loss function used to train the Bayesian NN, without having to implement it; I should only need to implement it myself when it is a custom loss.
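For example, something like this (purely hypothetical: `tfp.losses.NegativeLogLikelihood` does not exist in TFP today and is only meant to illustrate the idea):

```python
# Purely hypothetical API, analogous to the tf.keras.losses.* classes:
# internally it would just compute -distribution.log_prob(labels).
m.compile(
    optimizer="adam",
    loss=tfp.losses.NegativeLogLikelihood(),  # hypothetical, not in TFP today
)
m.fit(features, labels, epochs=10)
```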