Bug in use of inverse_transform when predicting effects

jakobrunge / tigramite

Tigramite is a Python package for causal inference with a focus on time series data. The Tigramite documentation is at https://jakobrunge.github.io/tigramite/

GNU General Public License v3.0


Bug in use of inverse_transform when predicting effects #370

Closed. rebeccaherman1 closed this issue 8 months ago

rebeccaherman1 commented 9 months ago

I get a broadcasting error when I pass data_transform=sklearn.preprocessing.StandardScaler() to fit_total_effect and transform_interventions_and_prediction=True to predict_total_effect. The call to reshape(-1,1) appears to put the number of features on the 0th axis and the number of samples on the 1st axis, which is the reverse of the (n_samples, n_features) layout described in the sklearn documentation. I think it should perhaps be removed or replaced with reshape(1,-1).
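For readers following along, here is a minimal sketch of the shape mismatch described above. It does not reproduce tigramite's code: the three-variable data set and the names data, pred_scaled, as_row and as_column are made up for illustration. It only assumes that a StandardScaler fitted on (n_samples, n_features) data is later handed a single prediction vector.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 200 samples (rows) of 3 variables (columns),
# i.e. the (n_samples, n_features) layout that sklearn expects.
rng = np.random.default_rng(0)
data = rng.normal(loc=[0.0, 10.0, -5.0], scale=[1.0, 2.0, 0.5], size=(200, 3))
scaler = StandardScaler().fit(data)

# One prediction in scaled space: a flat vector with one value per variable.
pred_scaled = scaler.transform(data[:1])[0]  # shape (3,)

# reshape(1, -1): one sample with three features, matching the fitted scaler.
as_row = pred_scaled.reshape(1, -1)  # shape (1, 3)
restored = scaler.inverse_transform(as_row)

# reshape(-1, 1): three samples with one feature each, which does not match.
as_column = pred_scaled.reshape(-1, 1)  # shape (3, 1)
try:
    wrong = scaler.inverse_transform(as_column)
    column_outcome = f"returned shape {wrong.shape}"
except ValueError as err:
    column_outcome = f"ValueError: {err}"

print("row layout restored:", np.allclose(restored, data[:1]))
print("column layout:", column_outcome)
```

With the row layout the original values come back unchanged. With the column layout the per-feature mean and scale, each of length 3, cannot be applied cleanly to a (3, 1) array, which is consistent with the broadcasting error reported here.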

https://github.com/jakobrunge/tigramite/blob/a409b37d3d290ec99c5c62ab6417a7ba18421329/tigramite/models.py#L335C96-L335C111

jakobrunge commented 8 months ago

Can you propose a patch?

rebeccaherman1 commented 8 months ago

@jakobrunge Please let me know if you'd like me to propose this separately in the development branch as well.

jakobrunge commented 8 months ago

Solved in master now.

rebeccaherman1 commented 8 months ago

@jakobrunge Now it seems that, after fitting a causal_effects instance, the stored observation_array has the vector on the 0th axis and time on the 1st axis, which seems backwards. I notice a lot of transposes in the code under "Transform the data if needed" in get_general_fitted_model, and I don't know why they are all there. Are we sure those transposes are meant to be there? I'm sorry I did not look more thoroughly at the other parts of the code before.

rebeccaherman1 commented 8 months ago

On second thought, it does look like you access the array assuming that the 0th axis, rather than the 1st, indexes the different variables. So maybe there's no problem here. It just seems unintuitive to me, and it also seems to require a lot of additional code logic, which may have made the original bug more likely to happen.
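To make the axis convention in this thread concrete, here is a rough sketch of why a variables-first array needs transposes around any sklearn transform. It is not taken from tigramite: the array contents and the names array, scaled and untransposed are assumptions for illustration only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical array stored with variables on axis 0 and time on axis 1.
n_vars, n_time = 3, 50
rng = np.random.default_rng(1)
array = rng.normal(size=(n_vars, n_time)) * np.array([[1.0], [5.0], [0.1]])

# sklearn treats rows as samples and columns as features, so the array is
# transposed to (n_time, n_vars) before scaling and transposed back after.
scaler = StandardScaler()
scaled = scaler.fit_transform(array.T).T  # back to shape (n_vars, n_time)

# Each variable (row) is now standardized over time.
row_means = scaled.mean(axis=1)
row_stds = scaled.std(axis=1)

# Without the transpose, the scaler would treat every time step as a feature.
untransposed = StandardScaler().fit(array)
print(scaled.shape, untransposed.n_features_in_)
```

Every hand-off between the two layouts needs a transpose like this, which is the kind of extra logic the comment above refers to.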