Open Vinayak285 opened 4 years ago
I think this error has to do with a change made to PyMC3's sample_posterior_predictive method (formerly sample_ppc; there is currently an open PR to change this). This discourse article describes how the shape of the returned array now depends on the shape of the training data. My fix for this was to define a data container (see this notebook) for the training data, and then swap out the data in the container for testing data prior to calling sample_posterior_predictive.
Here's what I did specifically (all edits in bayesian.py):
In the functions for each model (e.g., model_returns_t), define a data container.
with pm.Model() as model:
    data_container = pm.Data("data", data)
    mu = pm.Normal('mean returns', mu=0, sd=.01, testval=data.mean())
    sigma = pm.HalfCauchy('volatility', beta=1, testval=data.std())
    nu = pm.Exponential('nu_minus_two', 1. / 10., testval=3.)
    returns = pm.StudentT('returns', nu=nu + 2, mu=mu, sd=sigma,
                          observed=data_container)
    pm.Deterministic('annual volatility',
                     returns.distribution.variance**.5 * np.sqrt(252))
    pm.Deterministic('sharpe', returns.distribution.mean /
                     returns.distribution.variance**.5 *
                     np.sqrt(252))
    trace = pm.sample(samples)
In run_model, change
if ppc:
    ppc_samples = pm.sample_ppc(trace, samples=samples,
                                model=model, size=len(returns_test))
    return trace, ppc_samples['returns']
to
if ppc:
    with model:
        pm.set_data({"data": returns_test})
        ppc_samples = pm.sample_posterior_predictive(trace, samples=samples,
                                                     model=model)
    return trace, ppc_samples['returns']
I'm not sure if this is the best way to fix this, but it works for me!
Problem Description
In bayesian.py, _plot_bayes_cone has a variable perc that triggers the error shown further down in this post. It apparently has the wrong shape (wrong values: training; correct index: test). The culprit is the variable preds, a dict, that is used in a function in tears.py where preds = ppc_t, which in turn comes from bayesian.run_model(model = 't', ppc = True....).
Clearly (or maybe not), the size argument should be the problem, but it doesn't look wrong. I replaced len(returns_test) with returns_test.shape[0] to see whether it made a difference (it didn't). I am now out of ideas and hope someone else can follow the trail and work out why this is happening (or maybe I am missing something trivial).
The full traceback:
Please provide any additional information below: Note that there are 1105 in-sample data points and 101 out-of-sample data points. If I change the index used for perc and the fill_between calls to returns_train.index, the sheet is generated, so I think ppc_t needs to be generated for the out-of-sample data rather than for the in-sample data, as it is now.
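For anyone who wants to reproduce the shape problem without running the sampler, here is a minimal NumPy-only sketch. It rests on several assumptions: the arrays are random stand-ins for ppc_samples['returns'], the integer index stands in for returns_test.index, and cone_percentiles and check_cone_alignment are hypothetical helpers, not pyfolio functions. It only illustrates that per-day percentiles inherit the number of days in the predictive samples, so in-sample predictions (1105 days) cannot be plotted against the 101-entry test index.

```python
import numpy as np


def cone_percentiles(preds, percentiles=(5, 25, 75, 95)):
    """Per-day percentiles of samples shaped (n_samples, n_days)."""
    preds = np.asarray(preds)
    return {p: np.percentile(preds, p, axis=0) for p in percentiles}


def check_cone_alignment(preds, index):
    """Raise if the samples do not cover exactly len(index) days."""
    n_days = np.asarray(preds).shape[1]
    if n_days != len(index):
        raise ValueError(
            f"preds covers {n_days} days but the index has {len(index)} entries"
        )


rng = np.random.default_rng(0)
n_train, n_test, n_samples = 1105, 101, 200

# Simulated stand-ins for ppc_samples['returns'] (not real model output).
preds_in_sample = rng.normal(0.0, 0.01, size=(n_samples, n_train))
preds_out_of_sample = rng.normal(0.0, 0.01, size=(n_samples, n_test))
test_index = np.arange(n_test)  # stand-in for returns_test.index

# Out-of-sample predictions line up with the test index.
check_cone_alignment(preds_out_of_sample, test_index)
perc = cone_percentiles(preds_out_of_sample)

# In-sample predictions do not, which mirrors the failure described above.
try:
    check_cone_alignment(preds_in_sample, test_index)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

A check like this placed just before the plotting call would turn the plotting error into an explicit message about which side has the wrong length.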
Versions