tensorflow / probability

Probabilistic reasoning and statistical analysis in TensorFlow
https://www.tensorflow.org/probability/
Apache License 2.0

Gradient of ODE solve is undefined #405

Closed shoyer closed 5 years ago

shoyer commented 5 years ago

I tried with tf-nightly and tfp-nightly (the tfp release does not yet include ode), and the ODE solver is not differentiable with gradient tapes. When I try to differentiate the result of the solver, the gradient function returns None (see below). Or maybe I am getting something wrong and differentiation works differently in TensorFlow Probability? I only used the solver from tfp; the rest I left as it works in pure tf.


import tensorflow as tf
import tensorflow_probability as tfp

# Note: tfp's ode_fn signature is ode_fn(t, state).
def f(t, x):
    return 2 * x

x_init = tf.ones([1])
t_list = tf.range(0., 10., 1.)

with tf.GradientTape(persistent=True) as tape:
    tape.watch(x_init)
    solver = tfp.math.ode.BDF()
    times, states, _, _ = solver.solve(
        f, initial_time=t_list[0], initial_state=x_init,
        solution_times=t_list)

diff = tape.gradient(states, x_init)
print(diff)  # diff is None

Originally posted by @seb1000 in https://github.com/tensorflow/tensorflow/issues/15833#issuecomment-491793025

csuter commented 5 years ago

I see no reason this shouldn't work as expected. I.e., there's nothing (intentionally) special/different in TFP that would result in this not working.


brianwa84 commented 5 years ago

While loop with gradient tape requires v2 control flow. Are you using v2?


shoyer commented 5 years ago

Good question, I don't think we're using this currently. Is it possible to opt into v2 control flow now, while retaining TensorFlow 1.x compatibility?

brianwa84 commented 5 years ago

There is an environment variable you can set prior to TF import. Or you can just monkey-patch this bool: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/control_flow_util.py#L31
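A minimal sketch of the environment-variable route. The variable name `TF_ENABLE_CONTROL_FLOW_V2` is what the linked `control_flow_util.py` reads at import time; verify it against your TF version, and treat the commented alternative as an assumed API:

```python
import os

# Must be set *before* TensorFlow is imported, since the module
# reads it once at import time (see control_flow_util.py above).
os.environ["TF_ENABLE_CONTROL_FLOW_V2"] = "1"

# Alternatively, TF 1.x also ships an explicit opt-in
# (assumed API; check availability in your version):
#   import tensorflow as tf
#   tf.compat.v1.enable_control_flow_v2()
```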


TakeByEarn commented 5 years ago

Hey @shoyer, it's not easy to get the gradient through an ODE solver, but there is another way to approximate it. I believe the TF API tf.contrib.integrate.odeint was built to compute the result of an ODE without its gradients being taken into consideration. But the paper "Neural Ordinary Differential Equations" (https://arxiv.org/abs/1806.07366) provides an adjoint method to approximately compute the gradient through odeint, and the authors also provide source code based on torch.
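For reference, the adjoint method from that paper avoids backpropagating through the solver's internal steps: it defines an adjoint state and solves a second ODE backwards in time. A sketch in the paper's notation (my transcription; see the paper for the full derivation):

```latex
% Adjoint sensitivity method (Chen et al., 2018).
% The adjoint a(t) = \partial L / \partial z(t) obeys its own ODE,
% integrated backwards from t_1 to t_0:
\frac{da(t)}{dt} = -\, a(t)^{\top} \frac{\partial f(z(t), t, \theta)}{\partial z}
% Parameter gradients accumulate along the backward solve:
\frac{dL}{d\theta} = -\int_{t_1}^{t_0} a(t)^{\top}
    \frac{\partial f(z(t), t, \theta)}{\partial \theta}\, dt
```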

shoyer commented 5 years ago

There has been some follow-up work on TFP's ODE solver. I think this might work now...

brianwa84 commented 5 years ago

Yes, DormandPrince implements the adjoint method for backprop. I don't think we've done it for BDF.