koaning / scikit-lego

Extra blocks for scikit-learn pipelines.
https://koaning.github.io/scikit-lego/
MIT License
1.28k stars, 118 forks

Hierarchical predictors #623

Closed. FBruzzesi closed this 8 months ago

FBruzzesi commented 9 months ago

Description

Type of change

Checklist:

FBruzzesi commented 9 months ago

Forgot to mention a couple of ideas:

koaning commented 8 months ago

At the moment I'd be fine with this:

{
    (1,): LogisticRegression(),
    (1, 'A'): LogisticRegression(),
    (1, 'B'): LogisticRegression(),
    (1, 'A', 'X'): LogisticRegression(),
    (1, 'B', 'Y'): LogisticRegression(),
}

But it would be preferable to have this documented in the code with comments. I agree that it's merely "ok", but it feels clear enough if the comment is there, no?
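One way to read the dictionary above: there is one estimator per group at every level of the hierarchy, and the leading `1` acts as a constant "global" group. Below is a minimal sketch of how such a mapping might be built. The helper name `fit_per_level`, the group labels, and the toy data are all hypothetical and are not taken from the PR.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def fit_per_level(groups, X, y):
    """Fit one estimator per group at every level of the hierarchy.

    `groups` holds one tuple of group labels per row, ordered from the
    coarsest to the finest level. A constant 1 is prepended to every row
    so that the first level is a single global group.
    """
    estimators = {}
    full_keys = [(1, *row) for row in groups]
    n_levels = len(full_keys[0])
    for depth in range(1, n_levels + 1):
        # Truncate every row's key to the current depth of the hierarchy.
        prefixes = [key[:depth] for key in full_keys]
        for prefix in sorted(set(prefixes)):
            mask = np.array([p == prefix for p in prefixes])
            estimators[prefix] = LogisticRegression().fit(X[mask], y[mask])
    return estimators


# Toy data (hypothetical): two top-level groups, each with one subgroup.
groups = [("A", "X")] * 4 + [("B", "Y")] * 4
X = np.array([[0.0], [1.0], [2.0], [3.0]] * 2)
y = np.array([0, 0, 1, 1, 0, 1, 0, 1])

estimators = fit_per_level(groups, X, y)
for key in estimators:
    print(key)
```

The tuple keys make the level of each estimator explicit: a key of length one is the global model, and every additional element narrows it down to a more specific group.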

FBruzzesi commented 8 months ago

> But it would be preferable to have this documented in the code with comments. I agree that it's merely "ok", but it feels clear enough if the comment is there, no?

Just added a few comments on those. Here is what it looks like: [image]

koaning commented 8 months ago

Ah yeah, that's even nicer. Having it just in the comments would've been sufficient, but adding it to the docs is for sure a nice touch.

FBruzzesi commented 8 months ago

> My main observation is that we might want to add an extra test that uses a dummy model to help predict the values that we'd expect. Other than that, this looks great! Good work :)

I am thinking out loud here, but maybe the easiest way to check this is to use a model with deterministic, fake predictions. Does that sound reasonable?

koaning commented 8 months ago

Oh, that was totally in line with what I had in mind. I've used Dummy models for this in the past, but you're also free to pick another method, as long as we have a test showing that our assumptions about how we shrink play out.
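A rough sketch of what such a check might look like, using scikit-learn's `DummyRegressor` so that every level's prediction is a known group mean. It assumes shrinkage is a weighted average of the per-level predictions. The weights, data, and helper name `shrunk_prediction` are hypothetical; this does not call the estimator added in the PR.

```python
import numpy as np
from sklearn.dummy import DummyRegressor

# Toy data (hypothetical): two groups whose target means are easy to compute.
groups = np.array(["A", "A", "B", "B"])
X = np.zeros((4, 1))  # DummyRegressor ignores the features
y = np.array([1.0, 3.0, 10.0, 20.0])

# Level 0: one global model. Level 1: one model per group.
global_model = DummyRegressor(strategy="mean").fit(X, y)
group_models = {
    g: DummyRegressor(strategy="mean").fit(X[groups == g], y[groups == g])
    for g in ("A", "B")
}

# Assumed shrinkage weights for (global, group); they sum to 1.
weights = np.array([0.25, 0.75])


def shrunk_prediction(group):
    """Weighted average of the global and the group-level prediction."""
    row = np.zeros((1, 1))
    level_preds = np.array([
        global_model.predict(row)[0],
        group_models[group].predict(row)[0],
    ])
    return float(weights @ level_preds)


# The global mean is 8.5; the group means are 2.0 (A) and 15.0 (B).
assert np.isclose(shrunk_prediction("A"), 0.25 * 8.5 + 0.75 * 2.0)
assert np.isclose(shrunk_prediction("B"), 0.25 * 8.5 + 0.75 * 15.0)
```

Because the dummy models always return the mean of the data they were fitted on, the expected shrunk value can be worked out by hand, which keeps the test independent of any real model's fitting behaviour.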