pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

NotImplementedError for T.repeat()? #960

Closed Miles-Garnsey closed 8 years ago

Miles-Garnsey commented 8 years ago

Hi there,

I'm running the bleeding-edge versions of Theano and PyMC3. I've come across an odd (and uninformative) error, and I'm having trouble determining where the bug is.

I'm effectively trying to do a rolling regression as described here. In my case, I have multiple observations per day, so I'd like the observations within a day to share the same regression coefficients.

The only change from the code in that example is the way I repeat the alpha_r and beta_r variables, which I've changed to:

alpha_r = T.repeat(alpha, rep_pattern)
beta_r = T.repeat(beta, rep_pattern)

where rep_pattern = T.constant(dates_series.values, dtype="int32") is an array of the observation counts grouped by day.
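For concreteness, here is a sketch of what that repeat pattern looks like on made-up data (the dates and values below are purely illustrative):

import numpy as np
import pandas as pd
import theano.tensor as T

# hypothetical data: three days with 2, 3 and 1 observations respectively
test_data = pd.DataFrame({
    'date': ['2016-01-01'] * 2 + ['2016-01-02'] * 3 + ['2016-01-03'],
    'var_x': np.random.randn(6),
})
dates_series = test_data.groupby('date').var_x.count()        # counts: [2, 3, 1]
rep_pattern = T.constant(dates_series.values, dtype='int32')  # one count per day

Each coefficient then gets repeated once per observation on its day, so the repeated vector lines up with the rows of test_data.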

Sorry if this is an obvious question; I've been working on it for some time and have hit a wall. There's still a good chance it's user error, but I wanted to check with the experts.

Full code:

import theano.tensor as T
import theano as th
import numpy as np
import pandas as pd
import pymc3 as pm
from scipy import optimize

test_data = test_data.sort('date')  # test_data is loaded elsewhere

# group by date, with the number of cases on each date
dates_series = test_data.groupby(['date']).var_x.count()
num_dates = T.constant(len(dates_series), dtype="int32")      # count of unique dates
rep_pattern = T.constant(dates_series.values, dtype="int32")  # number of cases on each date

model_randomwalk = pm.Model()
with model_randomwalk:
    # std of random walk, best sampled in log space.
    sigma_alpha = pm.Exponential('sigma_alpha', 1./.02, testval=.1)
    sigma_beta = pm.Exponential('sigma_beta', 1./.02, testval=.1)

    # one regression coefficient per date
    alpha = pm.GaussianRandomWalk('alpha', sigma_alpha**-2,
                                  shape=len(dates_series))
    beta = pm.GaussianRandomWalk('beta', sigma_beta**-2,
                                 shape=len(dates_series))

    # Make the coefficients have the same length as the observations
    alpha_r = T.repeat(alpha, rep_pattern)
    beta_r = T.repeat(beta, rep_pattern)

    logit_p = alpha_r + beta_r * test_data.var_x
    p = T.exp(logit_p) / (1 + T.exp(logit_p))  # inverse logit

    # observed variable
    likelihood = pm.Bernoulli('y', p=p,
                              observed=test_data.var_y.values)

with model_randomwalk:
    # First optimize random walk
    start = pm.find_MAP(vars=[alpha, beta], fmin=optimize.fmin_l_bfgs_b)

The error I get back is as below:

NotImplementedError                       Traceback (most recent call last)
<ipython-input-16-20a16792a332> in <module>()
      1 import scipy.optimize as optimize
      2 with model_randomwalk:
----> 3     pm.find_MAP(vars=[alpha, beta], fmin=optimize.newton)

C:\Users\user\AppData\Local\Continuum\Anaconda2\lib\site-packages\pymc3\tuning\starting.pyc in find_MAP(start, vars, fmin, return_raw, disp, model, *args, **kwargs)
     68 
     69     logp = bij.mapf(model.fastlogp)
---> 70     dlogp = bij.mapf(model.fastdlogp(vars))
     71 
     72     def logp_o(point):

C:\Users\user\AppData\Local\Continuum\Anaconda2\lib\site-packages\pymc3\model.pyc in fastdlogp(self, vars)
    147     def fastdlogp(self, vars=None):
    148         """Compiled log probability density gradient function"""
--> 149         return self.model.fastfn(gradient(self.logpt, vars))
    150 
    151     def fastd2logp(self, vars=None):

C:\Users\user\AppData\Local\Continuum\Anaconda2\lib\site-packages\pymc3\memoize.pyc in memoizer(*args, **kwargs)
     12 
     13         if key not in cache:
---> 14             cache[key] = obj(*args, **kwargs)
     15 
     16         return cache[key]

C:\Users\user\AppData\Local\Continuum\Anaconda2\lib\site-packages\pymc3\theanof.pyc in gradient(f, vars)
     52 
     53     if vars:
---> 54         return t.concatenate([gradient1(f, v) for v in vars], axis=0)
     55     else:
     56         return empty_gradient

C:\Users\user\AppData\Local\Continuum\Anaconda2\lib\site-packages\pymc3\theanof.pyc in gradient1(f, v)
     42 def gradient1(f, v):
     43     """flat gradient of f wrt v"""
---> 44     return t.flatten(t.grad(f, v, disconnected_inputs='warn'))
     45 
     46 

C:\Users\user\AppData\Local\Continuum\Anaconda2\lib\site-packages\theano\gradient.pyc in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected, null_gradients)
    559 
    560     rval = _populate_grad_dict(var_to_app_to_idx,
--> 561                                grad_dict, wrt, cost_name)
    562 
    563     for i in xrange(len(rval)):

C:\Users\user\AppData\Local\Continuum\Anaconda2\lib\site-packages\theano\gradient.pyc in _populate_grad_dict(var_to_app_to_idx, grad_dict, wrt, cost_name)
   1322         return grad_dict[var]
   1323 
-> 1324     rval = [access_grad_cache(elem) for elem in wrt]
   1325 
   1326     return rval

C:\Users\user\AppData\Local\Continuum\Anaconda2\lib\site-packages\theano\gradient.pyc in access_grad_cache(var)
   1277                     for idx in node_to_idx[node]:
   1278 
-> 1279                         term = access_term_cache(node)[idx]
   1280 
   1281                         if not isinstance(term, gof.Variable):

C:\Users\user\AppData\Local\Continuum\Anaconda2\lib\site-packages\theano\gradient.pyc in access_term_cache(node)
   1111                                 str(g_shape))
   1112 
-> 1113                 input_grads = node.op.grad(inputs, new_output_grads)
   1114 
   1115                 if input_grads is None:

C:\Users\user\AppData\Local\Continuum\Anaconda2\lib\site-packages\theano\tensor\extra_ops.pyc in grad(self, inputs, gout)
    671             # of repeats in order to split gz in the right way to sum
    672             # the good part.
--> 673             raise NotImplementedError()
    674         else:
    675             raise ValueError()

NotImplementedError: 

I'm assuming this is because the gradient is not available for some component of my model. Does anyone have any ideas about which part might be causing the error?

Miles-Garnsey commented 8 years ago

This issue might be related to #535, but I'm uncertain about the current state of play for the random walk type.

hvasbath commented 8 years ago

The gradient is not implemented for T.repeat. You could circumvent this by giving alpha and beta the right shape up front, taking your rep_pattern into account. I don't know exactly what your test data looks like, but broadcasting is implemented in Theano, and repeat would probably not be necessary if you put your data into a format that takes advantage of broadcasting. Cheers!
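For what it's worth, here is a sketch of one way to avoid repeat entirely. Since the per-day counts here are ragged, plain broadcasting doesn't apply directly, but you can build a concrete integer index that maps each observation to its day and use fancy indexing, whose gradient is implemented in Theano (the names and counts below are illustrative):

import numpy as np
import theano.tensor as T

# hypothetical counts: 2, 3 and 1 observations on three consecutive days
counts = np.array([2, 3, 1])
day_idx = np.repeat(np.arange(len(counts)), counts)  # [0, 0, 1, 1, 1, 2]

alpha = T.vector('alpha')         # one coefficient per day
alpha_r = alpha[day_idx]          # one coefficient per observation
g = T.grad(alpha_r.sum(), alpha)  # indexing has a working gradient

Because the repetition happens in NumPy on concrete counts, Theano never sees a repeat with a vector of counts.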

Miles-Garnsey commented 8 years ago

I've managed to resolve this one by referring back to some material included in #535. If anyone else is having a similar issue:

The call A.repeat(B) fails because no gradient is implemented when B has ndim > 0, i.e. when the repeat counts are given as a vector rather than a scalar.
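A minimal sketch that reproduces the failure (names are illustrative):

import theano.tensor as T

x = T.vector('x')
reps = T.ivector('reps')    # vector of repeat counts, so reps.ndim > 0
y = T.repeat(x, reps).sum()
g = T.grad(y, x)            # raises NotImplementedError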

But you can use a list comprehension inside T.concatenate, like so:

N = len(dates_series)  # Python-level group count, needed for range()
alpha_r = T.concatenate([alpha[i].repeat(rep_pattern[i]) for i in range(N)])

This appears to work, although it uses a lot of memory and has thrown some errors (a constant-folding error, plus a few MemoryErrors where Python itself crashed suddenly). I'll keep this thread up to date if the solution proves insufficient.

My only outstanding question is whether I should raise this in the Theano project, in case this approach would be a better way to handle things generally in T.repeat().

hvasbath commented 8 years ago

You can also use T.alloc as an alternative to repeat; its gradient is implemented. They are well aware of this issue, so you don't need to report it ;). Cheers!
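For example, a minimal sketch of the T.alloc idea, assuming a uniform number of observations per day (for ragged counts you would still need the indexing or concatenation tricks above):

import theano.tensor as T

alpha = T.vector('alpha')  # one coefficient per day
n_per_day = 3              # assumed constant number of observations per day
# broadcast alpha into shape (n_per_day, n_days), then transpose and flatten
# so each coefficient appears n_per_day times in a row
alpha_r = T.alloc(alpha, n_per_day, alpha.shape[0]).T.flatten()
g = T.grad(alpha_r.sum(), alpha)  # Alloc's gradient is implemented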