WillianFuks / tfcausalimpact

Python Causal Impact Implementation Based on Google's R Package. Built using TensorFlow Probability.
Apache License 2.0

have an error when using customized model #76

Open justdoit456258 opened 11 months ago

justdoit456258 commented 11 months ago

Hi @WillianFuks ,

I was trying to run the example with a customized model, but it doesn't work.

When running ci = CausalImpact(data1, pre_period, post_period, model=model1, model_args={'standardize': True}), I got this error: ValueError: Input model must be of type UnobservedComponents.

Can you help me? Thank you!

example:

import pandas_datareader as pdr
import datetime
from causalimpact import CausalImpact
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from causalimpact.misc import standardize

data = pd.read_csv('volks_data.csv')
data.plot()
data1 = data[['Date','VolksWagen']]

pre_period = ['2011-01-02', '2015-09-13']
post_period = ['2015-09-20', '2017-03-19']
reg_data = tfp.sts.regularize_series(data1)
normed_data = standardize(reg_data.astype(np.float32))[0]

obs_data = normed_data.loc[pre_period[0]: pre_period[1]].iloc[:, 0]

design_matrix = pd.concat(
    [normed_data.loc[pre_period[0]: pre_period[1]],
     normed_data.loc[post_period[0]: post_period[1]]]
).astype(np.float32).iloc[:, 1:]
linear_level = tfp.sts.LocalLinearTrend(observed_time_series=obs_data)
linear_reg = tfp.sts.LinearRegression(design_matrix=design_matrix)
model1 = tfp.sts.Sum([linear_level, linear_reg], observed_time_series=obs_data)
ci = CausalImpact(data1, pre_period, post_period, model= model1,model_args={'standardize': True})

WillianFuks commented 11 months ago

Hi @justdoit456258 ,

Apparently you are using the wrong package. Please run pip install -U tfcausalimpact to see if the issue goes away.
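A quick way to confirm which package is active is sketched below. As far as I can tell, both tfcausalimpact and the older pycausalimpact are imported as causalimpact, so the import line alone does not say which one is in use. The helper name installed_version is made up for this illustration, and importlib.metadata needs Python 3.8 or newer.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report which of the two distributions is present in the current environment.
for name in ("tfcausalimpact", "pycausalimpact"):
    print(name, "->", installed_version(name) or "not installed")
```

If both show up as installed, uninstalling pycausalimpact before reinstalling tfcausalimpact is probably the safest way to avoid one shadowing the other.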

justdoit456258 commented 11 months ago

@WillianFuks Thanks for your reply. I just ran pip install -U tfcausalimpact. But when I run the line
ci = CausalImpact(data1, pre_period, post_period, model= model1,model_args={'standardize': True}), it raises a new error:

AssertionError                            Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_27204\1617991335.py in <module>
----> 1 ci = CausalImpact(data1, pre_period, post_period, model = model)

~\second_anacode_setup\envs\py37\lib\site-packages\causalimpact\main.py in __init__(self, data, pre_period, post_period, model, model_args, alpha)
    264     ):
    265         processed_input = cidata.process_input_data(data, pre_period, post_period,
--> 266                                                     model, model_args, alpha)
    267         self.data = data
    268         self.processed_data_index = processed_input['data'].index

~\second_anacode_setup\envs\py37\lib\site-packages\causalimpact\data.py in process_input_data(data, pre_period, post_period, model, model_args, alpha)
    139         )
    140     if model:
--> 141         cimodel.check_input_model(model, pre_data, post_data)
    142     else:
    143         model = cimodel.build_default_model(

~\second_anacode_setup\envs\py37\lib\site-packages\causalimpact\model.py in check_input_model(model, pre_data, post_data)
    184     if isinstance(model, tfp.sts.Sum):
    185         for component in model.components:
--> 186             _check_component(component)
    187     else:
    188         _check_component(model)

~\second_anacode_setup\envs\py37\lib\site-packages\causalimpact\model.py in _check_component(component)
    175                 'instead.'
    176             )
--> 177         assert component.design_matrix.dtype == tf.float32
    178     else:
    179         for parameter in component.parameters:

AssertionError:

WillianFuks commented 11 months ago

Hi @justdoit456258 ,

It seems your input weight matrix should be cast to tf.float32; otherwise TensorFlow Probability will not work when computing the posterior of the model. See if running something like linear_weight_tensor = tf.cast(linear_weight_tensor, tf.float32) solves the issue.
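The same idea can be applied on the NumPy side before the matrix ever reaches TensorFlow. The snippet below is only a sketch with made-up covariate values; it shows that NumPy defaults to float64 and that a single astype call gives the float32 dtype the assertion in the traceback checks for.

```python
import numpy as np

# Hypothetical covariates: 5 time points, 2 regressors (values are made up).
covariates = np.arange(10).reshape(5, 2) / 10.0
print(covariates.dtype)  # NumPy's default floating dtype is float64

# Cast once, up front, so every downstream component sees float32.
design_matrix = covariates.astype(np.float32)
print(design_matrix.dtype)

# design_matrix could then be passed on, e.g.
# tfp.sts.LinearRegression(design_matrix=design_matrix)
```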

justdoit456258 commented 11 months ago

Thank you for your reply! @WillianFuks

I found that this line causes the AssertionError: linear_reg = tfp.sts.LinearRegression(design_matrix=normed_data.iloc[:, 1:].values.reshape(-1, normed_data.shape[1] -1)) And I have found the reason: my input data has only one column (it's y), so the design matrix cannot be constructed.
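A small guard can catch this case before any model component is built. The sketch below is not part of tfcausalimpact; split_response_and_covariates is a hypothetical helper that assumes the response sits in the first column and the covariates in the remaining ones, as in the code earlier in this thread.

```python
import numpy as np


def split_response_and_covariates(values):
    """Split a 2-D array into response (first column) and covariates (the rest).

    Raises ValueError when there are no covariate columns, since a regression
    component would then have nothing to regress on.
    """
    values = np.asarray(values, dtype=np.float32)
    if values.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {values.ndim} dimension(s)")
    if values.shape[1] < 2:
        raise ValueError(
            "data has only the response column; add covariates or drop the "
            "regression component"
        )
    return values[:, 0], values[:, 1:]


# Usage with made-up data: one response column plus two covariates.
y, X = split_response_and_covariates(np.ones((4, 3)))
print(y.shape, X.shape)
```

With a single-column input the helper raises a ValueError with a readable message instead of letting a later dtype or shape check fail.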

input data example: image

But now I have a new question for you.

Even though the p-value is lower than 0.05, the prediction curve clearly has not learned the historical trend. It shows a steady downward trend, but in the same period historically the series rose first and then decreased. Why didn't the model learn this pattern? How can I improve it?

image image

Sincerely looking forward to your new reply!

WillianFuks commented 11 months ago

Hi @justdoit456258 ,

Many factors could explain what you observed. Some ideas to investigate:

This tends to be an exploration and you can use the results and inferred data to guide you on what is working or not. You could also use some error metrics such as MAE (if it's appropriate to your use case) to guide you on which model is working best.
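As a rough illustration of the MAE comparison, the sketch below scores two candidate models on a held-out slice of the pre-intervention period. All numbers and model names are invented for the example; in practice the predictions would come from each fitted model.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot compute MAE of empty sequences")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical held-out observations from the end of the pre-period.
held_out = [10.0, 12.0, 11.0, 9.0]

# Hypothetical predictions from two candidate model structures.
candidates = {
    "trend_only": [11.0, 11.0, 11.0, 11.0],
    "trend_plus_seasonal": [10.5, 11.5, 11.0, 9.5],
}

scores = {name: mean_absolute_error(held_out, preds)
          for name, preds in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

Scoring on pre-period data that the model did not see during fitting keeps the comparison independent of the intervention itself.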

justdoit456258 commented 11 months ago

Hi @WillianFuks , Thank you for your reply!

I have tried adding seasonal components and festival-tag covariates to train my model, but there is still no improvement. justdoit20231110 txt.txt

The attached file is my code. Could you run it to see where it can be improved?