PicklingError when trying to run hierarchical regression

drapadubok commented 9 years ago

I was trying to mimic the example with radon and households with my own data. I'm only starting to learn pymc3, so maybe my error is related to clueless model specification, so I will happily appreciate any suggestions! What I try, is to take a chunk of data which corresponds to time series from 21 individuals, each 1015 time points long. I'm trying to specify a hierarchical model, so that each individual coefficient in regression will be tied to some group mean and variance.

import numpy as np
import pymc3

# Throw in some simulated data
x = np.random.normal(size=(21315))
y = np.random.normal(size=(21315))
n_subj = 21 # hardcoded number of individuals
s_idx = [[subj]*1015 for subj in range(21)] # 1015 - hardcoded number of timepoints
sidx = np.concatenate(sidx)

with pymc3.Model() as hierarchical_model:
    # Group hyperpriors
    mu_intercept = pymc3.Normal('mu_intercept', mu=0, sd=20)
    sigma_intercept = pymc3.Uniform('sigma_intercept', lower=0, upper=100)
    mu_x_coeff = pymc3.Normal('mu_x_coeff', mu=0, sd=20)
    sigma_x_coeff = pymc3.Uniform('sigma_x_coeff', lower=0, upper=100)

    intercept = pymc3.Normal('intercept', mu=mu_intercept, sd=sigma_intercept, shape=n_subj)
    x_coeff = pymc3.Normal('x_coeff', mu=mu_x_coeff, sd=sigma_x_coeff, shape=n_subj)

    eps = pymc3.Uniform('eps', lower=0, upper=100)

    est = intercept[s_idx] + x_coeff[s_idx] * x

    likelihood = pymc3.Normal('y', mu=est, sd=eps, observed=y)

    start = pymc3.find_MAP()
    step = pymc3.NUTS(scaling=start)
    trace = pymc3.sample(2000, step, start=start, njobs=4, progressbar=True)

However, when I try to run it, I get following error: _pickle.PicklingError: Can't pickle <function interval_transform..interval_real at 0x2b7475ec1f28>: attribute lookup interval_real on pymc3.distributions.transforms failed

I'm having a hunch that my problem is related with the way I supply inputs, but I'm clueless about the way to dig out the reasons for this error. I would appreciate any suggestions. Thank you!

jsalvatier commented 9 years ago

That's actually because the transform object that Uniform uses can't be pickled. I'll put up a fix in a few minutes for this.

For now you can fix the problem by setting njobs=1.

jsalvatier commented 9 years ago

Try this branch: https://github.com/pymc-devs/pymc3/tree/transform_classes

That fixes the specific issue you were having.

drapadubok commented 9 years ago

Thank you, setting njobs to 1 made it work in the original branch I used.

But after I tried the transform_classes branch, in both njobs==1 and njobs==4 cases I get similar error. I apologize in advance if the question is more about modeling than about the code, but now I guess the error is about the way model is specified. One of the things error message suggests is to remove hierarchical parameters, but having them was the idea for me.

Traceback (most recent call last):
  File "<stdin>", line 11, in <module>
  File "/scratch/braindata/shared/GraspHyperScan/tempanaconda/pymc3-tree/pymc3/pymc3/tuning/starting.py", line 127, in find_MAP
    specific_errors)
ValueError: Optimization error: max, logp or dlogp at max have non-finite values. Some values may be outside of distribution support. max: {'sigma_intercept_interval': array(-54.24292371278075), 'eps_interval': array(100.0), 'mu_intercept': array(0.0), 'intercept': array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]), 'sigma_x_coeff_interval': array(-54.24292371278075), 'mu_x_coeff': array(0.0), 'x_coeff': array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])} logp: array(-inf) dlogp: array([  0.,  nan,   0.,  nan,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,  nan])Check that 1) you don't have hierarchical parameters, these will lead to points with infinite density. 2) your distribution logp's are properly specified. Specific issues: 
eps_interval.logp bad: -inf
eps_interval.dlogp bad: [ nan]
x_coeff.logp bad: -inf
x_coeff.dlogp bad at idx: (array([0]),) with values: [ nan]
intercept.logp bad: -inf
intercept.dlogp bad at idx: (array([0]),) with values: [ nan]
sigma_x_coeff_interval.logp bad: -inf
sigma_intercept_interval.logp bad: -inf

jsalvatier commented 9 years ago

So this is an issue with the optimization step. Optimization has returned bad values where you have -infinity for both the log posterior and gradient. The issue is probably that you're optimizing over both over sigma_intercept/intercept at the same time and sigma_x_coeff/x_coeff at the same time, since you can make the logp infinite by setting sigma_intercept to be basically zero and all the intercepts to be very close to each other.

The fix is to just optimize over intercept, x_coeff, mu_intercept, mu_x_coeff, eps, leaving out sigma_intercept and sigma_x_coeff.

start = find_MAP(vars = [intercept, x_coeff, mu_intercept, mu_x_coeff, eps])

drapadubok commented 9 years ago

Thank you for advice! I've tried it, but it still raises this error. Also, for the sake of testing I removed all the variables mentioned in the error from optimization and I still get something like this:

with pymc3.Model() as hierarchical_model:
    # Hyperpriors for group nodes
    mu_intercept = pymc3.Normal('mu_intercept', mu=0, sd=20)
    sigma_intercept = pymc3.Uniform('sigma_intercept', lower=0, upper=100)
    mu_x_coeff = pymc3.Normal('mu_x_coeff', mu=0, sd=20)
    sigma_x_coeff = pymc3.Uniform('sigma_x_coeff', lower=0, upper=100)
    #
    intercept = pymc3.Normal('intercept', mu=mu_intercept, sd=sigma_intercept, shape=x.shape[1])
    # Intercept for each county, distributed around group mean mu_a
    x_coeff = pymc3.Normal('x_coeff', mu=mu_x_coeff, sd=sigma_x_coeff, shape=x.shape[1])
    # Model error
    eps = pymc3.Uniform('eps', lower=0, upper=100)
    #
    est = intercept[s_idx] + x_coeff[s_idx] * x
    # Data likelihood
    likelihood = pymc3.Normal('y', mu=est, sd=eps, observed=y)
    #
    start = pymc3.find_MAP(vars = [mu_intercept, mu_x_coeff])
    step = pymc3.NUTS(scaling=start)
    trace2 = pymc3.sample(2000, step, start=start, njobs=4, progressbar=True)

And the error:

Traceback (most recent call last):
  File "<stdin>", line 11, in <module>
  File "/scratch/braindata/shared/GraspHyperScan/tempanaconda/pymc3-tree/pymc3/pymc3/tuning/starting.py", line 127, in find_MAP
    specific_errors)
ValueError: Optimization error: max, logp or dlogp at max have non-finite values. Some values may be outside of distribution support. max: {'mu_x_coeff': array(0.0), 'sigma_x_coeff_interval': array(100.0), 'mu_intercept': array(0.0), 'eps_interval': array(100.0), 'intercept': array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]), 'sigma_intercept_interval': array(100.0), 'x_coeff': array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])} logp: array(-inf) dlogp: array([  0.,  nan,   0.,  nan,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
         0.,   0.,  nan])Check that 1) you don't have hierarchical parameters, these will lead to points with infinite density. 2) your distribution logp's are properly specified. Specific issues:

Maybe there are some tests I can run to get more information to figure it out?

jsalvatier commented 9 years ago

Can you print out model.test_point? It seems like the sigmas must be set to zero, but they should be logit transformed, so that shouldn't even be possible.

Also, can you post the data generation code? I get an error about a shape mismatch when I run your model.

drapadubok commented 9 years ago

Oh, sorry for the error, mixed two models that I was playing around, here is the code that I used to reproduce the error:

import numpy as np
import pymc3

# simulated data
n_subj = 21 # hardcoded number of individuals
n_time = 1015 # hardcoded number of timepoints
x = np.random.normal(size=(n_subj*n_time))
y = np.random.normal(size=(n_subj*n_time))
s_idx = [[subj]*n_time for subj in range(n_subj)]
sidx = np.concatenate(s_idx)

with pymc3.Model() as hierarchical_model:
    # Hyperpriors for group nodes
    mu_intercept = pymc3.Normal('mu_intercept', mu=0, sd=20)
    sigma_intercept = pymc3.Uniform('sigma_intercept', lower=0, upper=100)
    mu_x_coeff = pymc3.Normal('mu_x_coeff', mu=0, sd=20)
    sigma_x_coeff = pymc3.Uniform('sigma_x_coeff', lower=0, upper=100)
    #
    intercept = pymc3.Normal('intercept', mu=mu_intercept, sd=sigma_intercept, shape=n_subj)
    # Intercept for each county, distributed around group mean mu_a
    x_coeff = pymc3.Normal('x_coeff', mu=mu_x_coeff, sd=sigma_x_coeff, shape=n_subj)
    # Model error
    eps = pymc3.Uniform('eps', lower=0, upper=100)
    #
    est = intercept[sidx] + x_coeff[sidx] * x
    # Data likelihood
    likelihood = pymc3.Normal('y', mu=est, sd=eps, observed=y)
    #
    start = pymc3.find_MAP(vars = [intercept, x_coeff, mu_intercept, mu_x_coeff, eps])
    step = pymc3.NUTS(scaling=start)
    trace3 = pymc3.sample(2000, step, njobs=4, progressbar=True)

And for this model, hierarchical_model.test_point gives out:

{'mu_x_coeff': array(0.0), 'sigma_x_coeff_interval': array(100.0), 'mu_intercept': array(0.0), 'eps_interval': array(100.0), 'intercept': array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]), 'sigma_intercept_interval': array(100.0), 'x_coeff': array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])}

drapadubok commented 9 years ago

If it might help, there is probably related thing, when I try to run unpooled version of this model in the transform_classes branch, the traces I get are all around zero, even though when I run it on master branch - I get reasonably correct traces. So with the same model, I get completely different results on two branches. Here is the code and model.test_point for both branches:

master branch >> model.test_point
{'sigma_log': array(2.3025851249694824), 'x_coeff': array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]), 'Intercept': array(0.0)}

transform_classes >> model.test_point 
{'Intercept': array(0.0), 'x_coeff': array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]), 'sigma_log': array(22026.46484375)}

import numpy as np
import seaborn as sns
import pymc3
import pandas as pd

n_subj = 21 # hardcoded number of individuals
n_time = 1015 # hardcoded number of timepoints
x = np.random.normal(size=(n_time,n_subj))
y = np.random.normal(size=(n_time,n_subj))

with pymc3.Model() as model: # model specifications in PyMC3 are wrapped in a with-statement
    # Define priors
    sigma = pymc3.HalfCauchy('sigma', beta=10)
    intercept = pymc3.Normal('Intercept', 0, sd=20)
    x_coeff = pymc3.Normal('x_coeff', 0, sd=20, shape=x.shape[1])

    # Define likelihood
    likelihood = pymc3.Normal('y', mu=intercept + x_coeff * x, 
                              sd=sigma, observed=y)

    start = pymc3.find_MAP() # Find starting value by optimization
    step = pymc3.NUTS(scaling=start) # Instantiate MCMC sampling algorithm
    trace = pymc3.sample(2000, step, start=start, njobs=4, progressbar=True) # draw 2000 posterior samples using NUTS sampling

Visualization:

tdf_all = pd.DataFrame(np.concatenate(trace.get_values('x_coeff')))
ax = sns.violinplot(tdf_all[500:])
plt.show()

jsalvatier commented 9 years ago

Sorry, the transform_classes branch had some changes that fixed some bugs, it works for me if you get the latest version of that branch. Also is there a reason you use violinplot instead of traceplot?

jsalvatier commented 9 years ago

You can also do sns.violinplot(trace.get_values('x_coeff', combine=True))

drapadubok commented 9 years ago

Alright, I'll try out the latest version!

Well, I use it mostly because I want to have similar X axis scale, for example, if I want to compare the posterior distributions of two variables visually, I can't do it on traceplot, as they are all centered on their mean, I guess, but with this kind of violin plot, I can see the distributions with similar scale and if there are large differences - see them straight away. I usually check the traceplot too, but I found that having similar scale makes me personally a little more comfortable at exploring :)

jsalvatier commented 9 years ago

Makes sense!

On Wed, Jun 17, 2015 at 12:22 AM, Dmitry Smirnov notifications@github.com wrote:

Alright, I'll try out the latest version!

Well, I use it mostly because I want to have similar X axis scale, for example, if I want to compare the posterior distributions of two variables visually, I can't do it on traceplot, as they are all centered on their mean, I guess, but with this kind of violin plot, I can see the distributions with similar scale and if there are large differences - see them straight away. I usually check the traceplot too, but I found that having similar scale makes me personally a little more comfortable at exploring :)

— Reply to this email directly or view it on GitHub https://github.com/pymc-devs/pymc3/issues/752#issuecomment-112690109.

drapadubok commented 9 years ago

Cool, now with the latest version this issue is gone and model computes successfully! Thank you very much!

pymc-devs / pymc

PicklingError when trying to run hierarchical regression #752