jamalsenouci / causalimpact

Python port of CausalImpact R library
Apache License 2.0

Key Error: 'upper y' #13

Closed pgrandinetti closed 4 years ago

pgrandinetti commented 6 years ago

I have the following pandas object

print(type(causal))
print(causal.columns)
print(type(causal.index))
print(causal.head())
<class 'pandas.core.frame.DataFrame'>
Index(['y', 'x1', 'x2'], dtype='object')
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>
              y   x1   x2
date                     
2017-09-04  150  150  275
2017-09-05  200  249  125
2017-09-06  225  150  249
2017-09-07  150  125  275
2017-09-08  175  325  250

I set variables pre_period and post_period as in your documentation. Then I run

impact = CausalImpact(causal, pre_period, post_period)
impact.run()

and I get

KeyError: 'upper y'

Can you please point me in the right direction? Thanks!

pgrandinetti commented 6 years ago

Reproducible error

df = pandas.DataFrame(
    {'y': [150, 200, 225, 150, 175],
     'x1': [150, 249, 150, 125, 325],
     'x2': [275, 125, 249, 275, 250]
    }
)
ci = CausalImpact(df, [0,2], [3,4])
ci.run()

pgr-me commented 6 years ago

I have the same issue.

rpanai commented 6 years ago

@pgr-me me too.

rpanai commented 6 years ago

I have the same problem when I use pandas=0.23.0, but it works with pandas=0.19.2.

phqchau commented 6 years ago

I'm also having this issue, and downgrading to pandas 0.19.2 doesn't solve it.

The error is in line 31 of inferences.py: point_pred_upper = ci["upper y"].to_frame()

The object ci can't be indexed by the key "upper y". I could not find out where this key comes from.
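A minimal sketch of one defensive alternative to a hard-coded label, assuming only that the bound columns start with "lower" and "upper" (which the key 'upper y' suggests, but the thread does not confirm). The helper name pick_column and the example labels are hypothetical and are not part of causalimpact.

```python
def pick_column(columns, prefix):
    """Return the single column label that starts with `prefix`.

    Raises a KeyError that lists the available labels, which is easier
    to debug than a bare KeyError: 'upper y'.
    """
    matches = [c for c in columns if str(c).startswith(prefix)]
    if len(matches) != 1:
        raise KeyError(
            "expected exactly one column starting with {!r}, "
            "found {!r} in {!r}".format(prefix, matches, list(columns))
        )
    return matches[0]


# Hypothetical labels: the response variable may not be named "y".
columns = ["lower response", "upper response"]
upper_label = pick_column(columns, "upper")
print(upper_label)  # prints: upper response
```

With a DataFrame, the same helper could be applied as ci[pick_column(ci.columns, "upper")], so the lookup no longer depends on the exact name of the response variable.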

rpanai commented 6 years ago

@phqchau the error comes exactly from that line. I have two environments, both with pandas=0.19.2; in one I get the error, while in the other it works fine.

gjb2107 commented 6 years ago

Same problem, and downgrading did not fix it.

gjb2107 commented 6 years ago

@rpanai I set up a separate environment with pandas=0.19.2 as well and it's not working for me there either. Do you know the difference between the two environments that you're using? Perhaps a different version of something else?

What's strange is that this was working for me on an older version, but I'm unsure which Anaconda release that was.

jamalsenouci commented 6 years ago

I'm struggling to reproduce this in my environment. Could you confirm which version of causalimpact you have installed?

rpanai commented 6 years ago

@jamalsenouci My version is 0.1.1

jamalsenouci commented 6 years ago

Could you first try upgrading to the latest version (0.1.3) using pip and see if this solves the problem?

gjb2107 commented 6 years ago

@jamalsenouci I'm using 0.1.3

gjb2107 commented 6 years ago

I just downgraded and then re-upgraded to 0.1.3, and I am getting a new error now:

AttributeError: module 'pandas.tseries' has no attribute 'index'

EDIT: Back to original 'upper y' error now

jamalsenouci commented 6 years ago

I can't reproduce this. I have taken the following steps and it works:

conda create -q -n test-environment python=3.6
(source) activate test-environment
pip install causalimpact
python
import pandas
from causalimpact import CausalImpact
df = pandas.DataFrame(
    {'y': [150, 200, 225, 150, 175],
     'x1': [150, 249, 150, 125, 325],
     'x2': [275, 125, 249, 275, 250]
    }
)
ci = CausalImpact(df, [0,2], [3,4])
ci.run()

I'm not sure where the conflict is happening. One option is to create a clean environment as above to run your analysis.

rpanai commented 6 years ago

@jamalsenouci I tried to follow your steps. I needed to pip install pandas, scipy and statsmodels too, but I got the very same error: KeyError: 'upper y'

rpanai commented 6 years ago

Apparently the culprit is statsmodels>0.8.0. I created a new environment with statsmodels=0.8.0 and causalimpact runs fine. I built a conda package of your library, and I think it would be nice if you could do the same.
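A small sketch for checking whether an installed version is newer than 0.8.0. The helper names parse_version and newer_than are hypothetical; they compare versions as integer tuples, because plain string comparison misorders multi-digit components.

```python
import re


def parse_version(text):
    """Turn '0.23.1' into (0, 23, 1) so versions compare numerically.

    Parsing stops at the first component without leading digits, so a
    suffix such as 'rc1' is ignored rather than misread.
    """
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def newer_than(installed, threshold):
    """Return True if `installed` is strictly newer than `threshold`."""
    return parse_version(installed) > parse_version(threshold)


print(newer_than("0.9.0", "0.8.0"))   # prints: True
print(newer_than("0.8.0", "0.8.0"))   # prints: False
print("0.10.0" > "0.9.0")             # prints: False (string comparison misleads)
print(newer_than("0.10.0", "0.9.0"))  # prints: True
```

In a live environment this could be called as newer_than(statsmodels.__version__, "0.8.0") after importing statsmodels.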

gjb2107 commented 6 years ago

This worked for me as well!

rpanai commented 6 years ago

@gjb2107 I built a conda package you can install with

conda install -c rpanai causalimpact

HowellPan commented 5 years ago

@rpanai, I have the same issue even with statsmodels = 0.8.0. Does the pandas version also matter? Also, I can't install your package; it complains that the package is not available in the current channels.

HowellPan commented 5 years ago

Has anyone figured out a final solution for this? I am getting KeyError: 'upper y' with both statsmodels 0.9.0 and statsmodels 0.8.0, with pandas version 0.20.3.

gjb2107 commented 5 years ago

@HowellPan Everything is working for me now with statsmodels 0.8.0 and pandas 0.23.1.

I now have another issue (same as #7).

russ-ai commented 5 years ago

Is there any update on this error? I'm using the most recent version. The issue is that the keys "upper y" and "lower y" aren't defined anywhere else in the code, so naturally pandas complains when it is converted to a dataframe on line 31 of inferences.py.

russ-ai commented 5 years ago

Update: the error appears with pandas 0.23.0, but it works with 0.23.1 (statsmodels 0.8.0 in both cases).

a-baturin commented 5 years ago

Pull request above works for me with statsmodels==0.9.0 and pandas==0.23.4

vineet290794 commented 5 years ago

I am still getting this error, along with TypeError: an integer is required, using pandas 0.19.2 and statsmodels 0.8. I am new to Python; can anyone help?

airnel48 commented 5 years ago

Pull request above gives me KeyError: 'upper y' with statsmodels 0.9.0 and pandas version 0.23.4. I get the same error in my own program.

chandanshikhar1 commented 5 years ago

I am getting the same error with statsmodels 0.9.0 and pandas 0.24.2. Any solutions?

I tried the following, but it doesn't work for me:

conda install -c rpanai causalimpact

chandanshikhar1 commented 5 years ago

Is anyone actively looking at this? I am unable to use the package.

rpanai commented 5 years ago

@chandanshikhar1 I'm using causalimpact; you can install the conda package I made from here or here. The latter is compiled for Windows and Mac too.

chandanshikhar1 commented 5 years ago

I get the following error now:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
~/anaconda3/envs/jamal_37/lib/python3.6/site-packages/statsmodels/compat/pandas.py in <module>
     52 try:
---> 53     import pandas.tseries.tools as datetools
     54     import pandas.tseries.frequencies as frequencies

ModuleNotFoundError: No module named 'pandas.tseries.tools'

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
<ipython-input-9-7c81a40e9fd9> in <module>
      1 impact = CausalImpact(eu_upload_rel_df, ["2019-02-01", "2019-03-31"], ["2019-04-01", "2019-04-17"])
----> 2 impact.run()

~/anaconda3/envs/jamal_37/lib/python3.6/site-packages/causalimpact/analysis.py in run(self)
     39             self._run_with_data(kwargs["data"], kwargs["pre_period"],
     40                                 kwargs["post_period"], kwargs["model_args"],
---> 41                                 kwargs["alpha"], self.params["estimation"])
     42         else:
     43             self._run_with_ucm(kwargs["ucm_model"],

~/anaconda3/envs/jamal_37/lib/python3.6/site-packages/causalimpact/analysis.py in _run_with_data(self, data, pre_period, post_period, model_args, alpha, estimation)
    295 
    296         # Construct model and perform inference
--> 297         ucm_model = construct_model(self, df_pre, model_args)
    298         res = model_fit(self, ucm_model, estimation, model_args["niter"])
    299 

~/anaconda3/envs/jamal_37/lib/python3.6/site-packages/causalimpact/model.py in construct_model(self, data, model_args)
     61 
     62     """
---> 63     from statsmodels.tsa.statespace.structural import UnobservedComponents
     64 
     65     # extract y variable

~/anaconda3/envs/jamal_37/lib/python3.6/site-packages/statsmodels/tsa/statespace/structural.py in <module>
     12 import numpy as np
     13 import pandas as pd
---> 14 from statsmodels.tsa.filters.hp_filter import hpfilter
     15 from statsmodels.tools.data import _is_using_pandas
     16 from statsmodels.tsa.tsatools import lagmat

~/anaconda3/envs/jamal_37/lib/python3.6/site-packages/statsmodels/tsa/filters/hp_filter.py in <module>
      4 from scipy.sparse.linalg import spsolve
      5 import numpy as np
----> 6 from ._utils import _maybe_get_pandas_wrapper
      7 
      8 

~/anaconda3/envs/jamal_37/lib/python3.6/site-packages/statsmodels/tsa/filters/_utils.py in <module>
      2 
      3 from statsmodels.tools.data import _is_using_pandas
----> 4 from statsmodels.tsa.base import datetools
      5 from statsmodels.tsa.tsatools import freq_to_period
      6 

~/anaconda3/envs/jamal_37/lib/python3.6/site-packages/statsmodels/tsa/base/datetools.py in <module>
      4 from statsmodels.compat.python import (lrange, lzip, lmap, string_types, long,
      5                                        callable, asstr, reduce, zip, map)
----> 6 from statsmodels.compat.pandas import datetools
      7 import re
      8 import datetime

~/anaconda3/envs/jamal_37/lib/python3.6/site-packages/statsmodels/compat/pandas.py in <module>
     54     import pandas.tseries.frequencies as frequencies
     55 except ImportError:
---> 56     from pandas.core import datetools
     57     frequencies = datetools

ImportError: cannot import name 'datetools'

rpanai commented 5 years ago

@chandanshikhar1 Would you mind adding the output of these commands?

import pandas as pd
import statsmodels as sm

print("statsmodels: {}".format(sm.__version__))
print("pandas: {}".format(pd.__version__))

chandanshikhar1 commented 5 years ago

statsmodels: 0.8.0
pandas: 0.24.2
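The traceback above shows statsmodels 0.8.0 trying pandas.tseries.tools first and then falling back to pandas.core.datetools, with both imports failing under this pandas version. A minimal sketch for checking such modules directly follows; the helper name module_available is hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if the module `name` can be located.

    find_spec imports the parent packages of a dotted name, but not the
    target module itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package (for example pandas) is missing.
        return False


# The two import paths that statsmodels 0.8.0 tries in the traceback.
for name in ("pandas.tseries.tools", "pandas.core.datetools"):
    print(name, module_available(name))
```

If both lines report False, the installed pandas no longer ships either module, which matches the ImportError in the traceback.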

rpanai commented 5 years ago

Try to create a new environment and install the libraries in this order:

conda create -n ci_env python=3.6.5
source activate ci_env
conda install numpy=1.15.4=py36h7e9f1db_0
conda install pandas=0.23.4=py36h04863e7_0
conda install -c teamcore causalimpact

chandanshikhar1 commented 5 years ago

PackagesNotFoundError: The following packages are not available from current channels:

  - numpy==1.15.4=py36h7e9f1db_0

So I did

conda create -n ci_env python=3.6.5
source activate ci_env
conda install numpy=1.15.4
conda install pandas=0.23.4
conda install -c teamcore causalimpact

and then I get another error

jupyter notebook
jupyter: command not found

rpanai commented 5 years ago

You should install jupyter inside the environment.

UPDATED commands

conda create -n ci_env python=3.6.5
source activate ci_env
conda install numpy=1.15.4
conda install pandas=0.23.4=py36h04863e7_0
conda install -c teamcore causalimpact
conda install ipython=7.3.0
conda install jupyter

chandanshikhar1 commented 5 years ago

Yup, it works. Thanks a lot for helping with this.

rpanai commented 5 years ago

@chandanshikhar1 Not closely related, but remember that you can add an environment's kernel to jupyter. In other words, you install jupyter as usual, and in every environment you run this:

source activate my_env
# You need this version as the 7.4 has a major bug
conda install ipython=7.3.0
conda install ipykernel
python -m ipykernel install --user --name my_env --display-name "my_env"

So you can launch jupyter as usual and then select the kernel for every notebook.

anaalonsocastillo commented 5 years ago

I faced the same issue and the proposed solution worked. Now it's giving a new error when importing comb from scipy:

cannot import name 'comb'

I am using scipy==1.3.0, in case this helps.

rpanai commented 5 years ago

@anaalonsocastillo my scipy version is scipy==1.2.0 and I don't get any error. Could you try to downgrade? Then could you check whether you have scikit-learn installed? It seems to be a problem with this module (see SO).

anaalonsocastillo commented 5 years ago

Yep, I had scikit-learn installed. Downgrading to scipy==1.2.0 solved the issue. Thanks!
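As background, and as far as I know, comb lives in scipy.special, and the older scipy.misc.comb alias was removed in SciPy 1.3.0, so code that still imports it from scipy.misc fails on that release. A hedged sketch of a fallback import follows; the helper name import_first is hypothetical, and this is not how scikit-learn or causalimpact handle it.

```python
import importlib


def import_first(candidates):
    """Return the first attribute that imports cleanly.

    `candidates` is a list of (module_name, attribute_name) pairs that
    are tried in order.
    """
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            continue
    raise ImportError("none of {!r} could be imported".format(candidates))


try:
    # Prefer the current location, then fall back to the legacy one.
    comb = import_first([("scipy.special", "comb"), ("scipy.misc", "comb")])
    print("comb imported from", comb.__module__)
except ImportError as exc:
    print("scipy is not available:", exc)
```

A fallback like this only helps in code you control; when a dependency performs the failing import, pinning scipy as described above is the more practical route.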

jamalsenouci commented 4 years ago

I have removed the reference to upper_y, so this should be fixed.