pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/
Other
8.68k stars 2.01k forks source link

prediction with gaussian mixture model regression #3518

Closed Fangwq closed 5 years ago

Fangwq commented 5 years ago

I try to do regression with gaussian mixture model refer to https://docs.pymc.io/notebooks/dependent_density_regression.html and https://discourse.pymc.io/t/gaussian-mixture-of-regression/537. However, when I do prediction, it always give me the error below. It is definitely the problem that array size doesn't match when I draw the figure. Does anybody knows why? Thank you very much.

The code I use here:

    x = theano.shared(X_train, broadcastable=(False, True))
    y = theano.shared(Y_train, broadcastable=(False, True))
    with pm.Model() as model:
        K = x.get_value().shape
        K = K[0]
        phi = pm.Dirichlet('phi', np.array([1.0]*K), testval=np.ones(K)/K)

        # Priors for unknown model parameters
        alpha = pm.Normal('alpha', mu=0, sd=100, shape=K) #Intercept
        beta = pm.Normal('beta', mu=0, sd=100, shape=K)
        sigma  = pm.HalfCauchy('sigma', 5, shape=K)  #Noise

        mu = alpha + beta*x

        x_obs = pm.NormalMixture('obs', phi, mu, sd = sigma, observed=y)
    with model:
        step = pm.Metropolis()
        trace = pm.sample(SAMPLES, step, chains=4, tune=BURN, random_seed=SEED)

    PP_SAMPLES = 50
    x.set_value(X_complete)
    pp_trace = pm.sample_posterior_predictive(trace, PP_SAMPLES, model=model, random_seed=SEED)
    low, high = np.percentile(pp_trace['obs'], [2.5, 97.5], axis=0)
    ax.fill_between(X_complete.flatten(), low, high, color='k', alpha=0.35, zorder=5,
                label='95% posterior credible interval')

The error here:

Traceback (most recent call last):
  File ".../code/bayesian/gaussian_paper.py", line 85, in <module>
    label='95% posterior credible interval')
  File "/Library/Python/2.7/site-packages/matplotlib/__init__.py", line 1855, in inner
    return func(ax, *args, **kwargs)
  File "/Library/Python/2.7/site-packages/matplotlib/axes/_axes.py", line 5092, in fill_between
    map(np.ma.getmask, [x, y1, y2]))
ValueError: operands could not be broadcast together with shapes (110,) (100,) 
lucianopaz commented 5 years ago

These sort of questions are better suited for our discourse channel.

You can't change the number of mixture components dynamically (which is what you are trying to do when you set K = x.get_value().shape). When you later change x with x.set_value(X_complete), the previous number of mixture components don't match, and thus mu = alpha + beta*x wont work.

Fangwq commented 5 years ago

Thanks for your reply. You mention that I can't change the number of components. But when I set K = const initially, this problem still exits. Why does it happen ? @lucianopaz

    K = 5
    with pm.Model() as model:
        # K = x.get_value().shape
        # K = K[0]
        phi = pm.Dirichlet('phi', np.array([1.0]*K), testval=np.ones(K)/K)

        # Priors for unknown model parameters
        alpha = pm.Normal('alpha', mu=0.5, sd=10, shape=K) #Intercept
        beta = pm.Normal('beta', mu=0.5, sd=10, shape=K)
        sigma  = pm.HalfCauchy('sigma', 1.0, shape=K)  #Noise

        mu = alpha + beta*x

        x_obs = pm.NormalMixture('obs', phi, mu, sd = sigma, observed=y)