feature-engine / feature_engine

Feature engineering package with sklearn like functionality
https://feature-engine.trainindata.com/
BSD 3-Clause "New" or "Revised" License

Testing Linear Regression Assumptions #581

Open NicoGalli opened 1 year ago

NicoGalli commented 1 year ago

Checking model assumptions is like commenting code: everybody should do it often, but in practice it sometimes gets overlooked. Skipping either can mean a lot of time spent confused and going down rabbit holes, and a model that is interpreted incorrectly can have pretty serious consequences.

What about creating the following functions to test linear regression assumptions?

linear_assumption()
normal_errors_assumption()
multicollinearity_assumption()
autocorrelation_assumption()
homoscedasticity_assumption()

Please check this amazing post: https://jeffmacaluso.github.io/post/LinearRegressionAssumptions/
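
To make the idea concrete, here is a rough sketch of what one of the checks above, autocorrelation_assumption(), might look like. It is only an illustration under stated assumptions: the signature, the returned dictionary, and the 1.5 to 2.5 band around the Durbin-Watson statistic are hypothetical choices (the band is a common rule of thumb, not a formal test), and none of this is existing feature-engine API.

```python
from typing import Dict, Sequence, Union


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson statistic for a sequence of regression residuals.

    It is the sum of squared successive differences divided by the sum
    of squared residuals. Values near 2 suggest little first-order
    autocorrelation; values toward 0 or 4 suggest positive or negative
    autocorrelation respectively.
    """
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(r ** 2 for r in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


def autocorrelation_assumption(
    residuals: Sequence[float],
    lower: float = 1.5,  # assumed rule-of-thumb bound, not a formal critical value
    upper: float = 2.5,  # assumed rule-of-thumb bound, not a formal critical value
) -> Dict[str, Union[float, bool]]:
    """Hypothetical helper: flag residual autocorrelation via Durbin-Watson."""
    dw = durbin_watson(residuals)
    return {"durbin_watson": dw, "satisfied": lower <= dw <= upper}
```

The residuals would come from a fitted model, for example y - model.predict(X). The other four checks could follow the same pattern of computing a statistic and returning it alongside a pass/fail flag.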

Morgan-Sell commented 1 year ago

Nice idea, @NicoGalli!

Has anyone started this task? If not, I'll take it on.

NicoGalli commented 1 year ago

Hi @Morgan-Sell!

Yes, please. Don't hesitate to assign this task to yourself and start working on it.

Morgan-Sell commented 1 year ago

Hi @NicoGalli,

I understand the request. However, I'm questioning whether this task should be within the scope of feature-engine. feature-engine focuses on automating feature engineering/selection that is compatible with scikit-learn.

Assessing linear regression assumptions falls under model evaluation, which I believe to be beyond the scope of feature-engine.

@solegalli, do you have any thoughts?

solegalli commented 1 year ago

Yes, I think that for the time being, this will be out of scope. There are other issues with higher priority.

Maybe in a few years :p