WillianFuks / tfcausalimpact

Python Causal Impact Implementation Based on Google's R Package. Built using TensorFlow Probability.
Apache License 2.0

Printing averages of the posterior does not work #24

Closed: ghost closed this issue 3 years ago

ghost commented 3 years ago

Hi Willian, thank you very much for the package!

One question: following your article (https://towardsdatascience.com/implementing-causal-impact-on-top-of-tensorflow-probability-c837ea18b126), I tried to print out the averages for each model component using:

for name, values in ci.model_samples.items():
    print(f'{name}: {values.numpy().mean(axis=0)}')

However, I am getting the following error:

'list' object has no attribute 'items'

How can I get the components of the model displayed?

Many thanks!

WillianFuks commented 3 years ago

Hi @ludmila-kuncarova,

I'm guessing you fitted your model using the hmc method, which returns the model samples as a list instead of a dict mapping parameter names to samples.

One thing you can do in order to retrieve it as a dict (if this is what you're looking for) is to run the following:

param_samples = {param.name: ci.model_samples[i] for (i, param) in enumerate(ci.model.parameters)}

The keys of param_samples will look like this, for example:

['observation_noise_scale',
 'LocalLevel/_level_scale',
 'SparseLinearRegression/_global_scale_variance',
 'SparseLinearRegression/_global_scale_noncentered',
 'SparseLinearRegression/_local_scale_variances',
 'SparseLinearRegression/_local_scales_noncentered',
 'SparseLinearRegression/_weights_noncentered']
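As a rough sketch of the list-to-dict conversion above, the small helper below accepts either form of samples and always returns a dict. The names `samples_as_dict` and `Param` are hypothetical and not part of tfcausalimpact; `Param` only mimics the `.name` attribute of the entries in `ci.model.parameters`, and plain Python lists stand in for the sample tensors.

```python
from collections import namedtuple
from statistics import fmean

# Hypothetical stand-in for the entries of `ci.model.parameters`;
# only the `.name` attribute is used here.
Param = namedtuple("Param", ["name"])


def samples_as_dict(model_samples, parameters):
    """Return posterior samples keyed by parameter name.

    Dict-like input is copied into a plain dict; list-like input is
    paired positionally with `parameters`.
    """
    if hasattr(model_samples, "items"):
        return dict(model_samples.items())
    if len(model_samples) != len(parameters):
        raise ValueError("expected one sample array per parameter")
    return {p.name: s for p, s in zip(parameters, model_samples)}


# Toy data: 4 posterior draws for each of two scalar parameters.
parameters = [Param("observation_noise_scale"), Param("LocalLevel/_level_scale")]
list_samples = [[0.4, 0.6, 0.5, 0.5], [0.1, 0.1, 0.2, 0.0]]

param_samples = samples_as_dict(list_samples, parameters)
for name, values in param_samples.items():
    print(f"{name}: {fmean(values)}")
```

With real samples you would pass `ci.model_samples` and `ci.model.parameters`, and average with `values.numpy().mean(axis=0)` as in the original loop.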

Notice that if your model includes a sparse linear regression, you need to convert its samples into the final weights of the regression. Here's how:

import tensorflow as tf

# Prior scale of the regression weights (0.1 here); use the value your model was built with.
weights_prior_scale = 0.1

# Raw (non-centered) samples for the sparse regression component.
global_scale_noncentered = param_samples['SparseLinearRegression/_global_scale_noncentered']
global_scale_variance = param_samples['SparseLinearRegression/_global_scale_variance']
local_scales_noncentered = param_samples['SparseLinearRegression/_local_scales_noncentered']
local_scale_variances = param_samples['SparseLinearRegression/_local_scale_variances']
weights_noncentered = param_samples['SparseLinearRegression/_weights_noncentered']

# Rebuild the global and local scales, then the final weights.
global_scale = global_scale_noncentered * tf.sqrt(global_scale_variance) * weights_prior_scale
local_scales = local_scales_noncentered * tf.sqrt(local_scale_variances)

# The extra trailing axis lets global_scale broadcast across the weights.
weights = weights_noncentered * local_scales * global_scale[..., tf.newaxis]
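To make the shapes in this conversion easier to follow, here is a minimal sketch of the same arithmetic in NumPy on toy data. It assumes the global parameters have shape (num_samples,) and the local ones (num_samples, num_features); that layout is an assumption, so check the shapes of your own samples. The function name `sparse_regression_weights` is made up for this illustration.

```python
import numpy as np


def sparse_regression_weights(
    global_scale_variance,      # assumed shape: (num_samples,)
    global_scale_noncentered,   # assumed shape: (num_samples,)
    local_scale_variances,      # assumed shape: (num_samples, num_features)
    local_scales_noncentered,   # assumed shape: (num_samples, num_features)
    weights_noncentered,        # assumed shape: (num_samples, num_features)
    weights_prior_scale=0.1,
):
    """Combine the non-centered scale parameters into regression weights."""
    global_scale = (
        global_scale_noncentered * np.sqrt(global_scale_variance) * weights_prior_scale
    )
    local_scales = local_scales_noncentered * np.sqrt(local_scale_variances)
    # The trailing axis lets the per-sample global scale broadcast over features.
    return weights_noncentered * local_scales * global_scale[..., np.newaxis]


# Toy inputs: 3 posterior draws, 2 regression features.
num_samples, num_features = 3, 2
rng = np.random.default_rng(0)
weights = sparse_regression_weights(
    global_scale_variance=rng.uniform(0.5, 1.5, size=num_samples),
    global_scale_noncentered=rng.normal(size=num_samples),
    local_scale_variances=rng.uniform(0.5, 1.5, size=(num_samples, num_features)),
    local_scales_noncentered=rng.normal(size=(num_samples, num_features)),
    weights_noncentered=rng.normal(size=(num_samples, num_features)),
)
print(weights.shape)  # (3, 2)
```

Without the extra axis, a (3,) global scale could not be multiplied element-wise with a (3, 2) array of local terms.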

To get the average of the estimated weights for each data point, you can now run:

weights.numpy().mean(axis=1)

Or a global average:

weights.numpy().mean(axis=1).mean()
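Which axis to average over depends on how the samples are laid out, so it is worth checking `weights.shape` first. The sketch below uses a toy NumPy array and assumes rows are posterior draws and columns are regression features; that layout is an assumption and may differ in your model.

```python
import numpy as np

# Toy weights: 3 posterior draws (rows) for 2 regression features (columns).
weights = np.array([[0.2, -0.4],
                    [0.4, -0.2],
                    [0.6, -0.6]])

per_feature_mean = weights.mean(axis=0)  # one value per column, averaged over rows
per_row_mean = weights.mean(axis=1)      # one value per row, averaged over columns
overall_mean = weights.mean()            # single scalar over all entries

print(per_feature_mean.shape, per_row_mean.shape)
```

For a 2-D array, `weights.mean(axis=1).mean()` and `weights.mean()` give the same scalar, since every row has the same number of entries.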

This probably needs to be better documented; I'll work on that soon.

If you still have any questions let me know.

ghost commented 3 years ago

Hi @WillianFuks, thank you so much for the detailed explanation - super helpful!