pymc-devs / pymc

Bayesian Modeling and Probabilistic Programming in Python
https://docs.pymc.io/

Optimizing a parameter for an outside function #3365

Closed BrianMiner closed 5 years ago

BrianMiner commented 5 years ago

I am trying to fit a simple linear regression as proof of concept for a larger problem, whereby I am transforming the X variable according to the ad stock transformation.
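For readers unfamiliar with the term, here is a minimal sketch of a geometric adstock. It assumes the single-coefficient recursion y[t] = x[t] + rate * y[t-1] starting from zero, which, as far as I can tell, is what recursive_filter computes when given one coefficient. The name adstock_loop is hypothetical and is not used elsewhere in this thread.

```python
def adstock_loop(x, rate=0.09):
    """Geometric adstock: y[t] = x[t] + rate * y[t-1], with nothing carried in at t=0."""
    carried = 0.0
    out = []
    for value in x:
        # Each period keeps a fraction `rate` of the previous period's adstock.
        carried = value + rate * carried
        out.append(carried)
    return out


# A single burst of spend decays geometrically over the following periods.
spend = [100.0, 0.0, 0.0]
print(adstock_loop(spend, rate=0.5))  # prints [100.0, 50.0, 25.0]
```

The rate controls how quickly the effect of past advertising fades, which is the parameter the rest of this issue tries to estimate.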

My question is how to optimize a parameter that is passed to an external function during fitting. Is there then any way to have the fitted value used when predicting new data with sample_posterior_predictive()?

Here is a self-contained program and a note about what seems to work and when this fails.

from statsmodels.tsa.filters.filtertools import recursive_filter
import pymc3 as pm
import numpy as np
import pandas as pd

# normalize to range 0-100

def normalize(x,min_=0,max_=100):
    min_x=np.min(x)
    max_x=np.max(x)
    z=(max_-min_)/(max_x-min_x)*(x-max_x)+max_
    return z

# adstock via the statsmodels recursive filter
def adstock(x,rate=0.09):
    return recursive_filter(x,rate)

sales=np.array([1018,0,236,490,1760,443,1670,526,4522,2524,400,2527,4602,168,2795,7195,6277,2974,
5268,4310,2127,1081,4794,806,2565,1223,4141,2994,4079,1883,635,1980,1275,4497,1579,2726,
1901,4683,1686,1745,1404,1096,2825,2331,1711,2041,1210,914,4162,1166,4228,914])

advert=np.array([0,0,0,0,0,0,0,0,0,117.91,120.11,125.83,115.35,177.09,141.65,137.89,0,0,0,0,0,0,0,0,0,
0,158.51,109.39,91.08,79.25,102.71,78.49,135.11,114.55,87.34,107.83,125.02,82.96,60.81,83.15,0,0,
0,0,0,0,129.51,105.49,111.49,107.1,0,0])

pd_sales=pd.DataFrame(np.column_stack([sales,advert]), columns=['sales','advert'])

# HERE THIS WORKS WHEN HARD-CODING THE PARAMETER

with pm.Model() as mod_as:
    a = pm.Normal('a', mu=1000, sd=500)
    bA = pm.Normal('bA', mu=0, sd=2)
    sigma = pm.Uniform('sigma', lower=0, upper=10)

    mu = a + bA * normalize(adstock(pd_sales.advert.values,0.09))
    sales_hat = pm.Normal('sales', mu=mu, sd=sigma, observed=pd_sales.sales)
    trace_5_3_S = pm.sample(1000, tune=1000,cores=4)

# HERE THIS FAILS

with pm.Model() as mod_as:
    a = pm.Normal('a', mu=1000, sd=500)
    bA = pm.Normal('bA', mu=0, sd=2)
    sigma = pm.Uniform('sigma', lower=0, upper=10)
    ad_rate = pm.Uniform('ad_rate', lower=0, upper=1)

    mu = a + bA * normalize(adstock(pd_sales.advert.values,ad_rate))
    sales_hat = pm.Normal('sales', mu=mu, sd=sigma, observed=pd_sales.sales)
    trace_5_3_S = pm.sample(1000, tune=1000,cores=4)

TypeError                                 Traceback (most recent call last)

in ()
      5     ad_rate = pm.Uniform('ad_rate', lower=0, upper=1)
      6
----> 7     mu = a + bA * normalize(adstock(pd_sales.advert.values,ad_rate))
      8     sales_hat = pm.Normal('sales', mu=mu, sd=sigma, observed=pd_sales.sales)
      9     trace_5_3_S = pm.sample(1000, tune=1000,cores=4)

in normalize(x, min_, max_)
      3
      4 def normalize(x,min_=0,max_=100):
----> 5     min_x=np.min(x)
      6     max_x=np.max(x)
      7     z=(max_-min_)/(max_x-min_x)*(x-max_x)+max_

~/anaconda3/lib/python3.6/site-packages/numpy/core/fromnumeric.py in amin(a, axis, out, keepdims, initial)
   2440     """
   2441     return _wrapreduction(a, np.minimum, 'min', axis, None, out, keepdims=keepdims,
-> 2442                           initial=initial)
   2443
   2444

~/anaconda3/lib/python3.6/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
     81             return reduction(axis=axis, out=out, **passkwargs)
     82
---> 83     return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
     84
     85

~/anaconda3/lib/python3.6/site-packages/theano/tensor/var.py in __bool__(self)
     89         else:
     90             raise TypeError(
---> 91                 "Variables do not support boolean operations."
     92             )
     93

TypeError: Variables do not support boolean operations.

brandonwillard commented 5 years ago

mu = a + bA * normalize(adstock(pd_sales.advert.values,ad_rate))

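You're passing a Theano variable, ad_rate, to NumPy functions, which expect concrete numeric values rather than symbolic ones. What you need to do is pass the numeric value of ad_rate to the NumPy operations. Theano has an as_op decorator (theano.compile.ops.as_op) you can use for that.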
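A rough sketch of how that suggestion might look, under several assumptions. The data in ADVERT is a made-up stand-in for the advert array from the issue, adstock_np is a plain loop standing in for recursive_filter, and every function name here is hypothetical. The Theano wrapping lives in make_theano_op, which is defined but not called, so the numeric part runs without Theano installed. The type declarations assume Theano's default floatX of float64.

```python
import numpy as np

# Hypothetical short advertising series; the `advert` array from the issue would go here.
ADVERT = np.array([0.0, 100.0, 50.0, 0.0])


def adstock_np(x, rate):
    """Geometric adstock y[t] = x[t] + rate * y[t-1]; a stand-in for recursive_filter."""
    out = np.empty(len(x), dtype=np.float64)
    carried = 0.0
    for i, value in enumerate(x):
        carried = value + rate * carried
        out[i] = carried
    return out


def normalize_np(x, min_=0.0, max_=100.0):
    """Rescale x linearly so its minimum maps to min_ and its maximum to max_."""
    min_x, max_x = np.min(x), np.max(x)
    return (max_ - min_) / (max_x - min_x) * (x - max_x) + max_


def adstock_normalized(rate):
    """Purely numeric function of the rate; this is what as_op would wrap."""
    # as_op hands the function a 0-d NumPy array, so convert it to a float first.
    return normalize_np(adstock_np(ADVERT, float(rate)))


def make_theano_op():
    """Build the Theano op. Requires Theano; not called in this sketch."""
    import theano.tensor as tt
    from theano.compile.ops import as_op

    # Input: one float64 scalar (ad_rate). Output: one float64 vector (the regressor).
    return as_op(itypes=[tt.dscalar], otypes=[tt.dvector])(adstock_normalized)


# The numeric function can be checked on its own before involving Theano.
print(adstock_normalized(0.5))
```

Inside the model, the idea would then be something like adstock_op = make_theano_op() followed by mu = a + bA * adstock_op(ad_rate). As far as I know, ops created with as_op do not provide gradients, so NUTS cannot be used for a parameter that flows through one; a gradient-free step method such as pm.Metropolis or pm.Slice is the usual fallback.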

twiecki commented 5 years ago

Also, these questions should go to Discourse.

BrianMiner commented 5 years ago

@brandonwillard thanks. I moved this to discourse: https://discourse.pymc.io/t/optimizing-a-parameter-for-an-outside-function/2629 Still have an error. I am hoping this doesn't require theano knowledge :(