Porting from scipy.optimize.fmin_l_bfgs_b to tfp.optimizer.lbfgs_minimize

tensorflow / probability

Porting from scipy.optimize.fmin_l_bfgs_b to tfp.optimizer.lbfgs_minimize #1204

Closed VANRao-Stack closed 3 years ago

VANRao-Stack commented 3 years ago

I have been using SciPy's L-BFGS-B implementation (scipy.optimize.fmin_l_bfgs_b) for my neural net. However, I intend to take advantage of GPUs in the later part of my project, so I would like to switch the optimizer to TFP's implementation of the same. What arguments am I supposed to pass in? I tried simply passing SciPy's func and x0 directly to tfp as value_and_gradients_function and initial_position, but that doesn't seem to work. Any idea what modifications I need to make?

ColCarroll commented 3 years ago

There's a nice helper function in tfp.math to help with this -- the code should probably end up looking like

tfp.optimizer.lbfgs_minimize(
  lambda x: tfp.math.value_and_gradient(f, x),
  initial_position=x0,
  ...)

Let me know if that works, or maybe provide some code of what you started on, and we can iterate!

VANRao-Stack commented 3 years ago

Thank you so much for responding. I think my main mistake was that I wasn't passing 1-D tensors when setting the weights and so on. I also had to set the backend to use float64 as the default dtype for it to work. I used tf.dynamic_stitch to build the 1-D tensor, and it worked!