It's been a while since I've done anything with this library (over 2-3 years) but from the looks of this:
mod4 = (
GroupedPredictor(DecayEstimator(DummyRegressor(), decay=0.9), groups=["m"])
.fit(df[['m']], df['yt'])
)
... I kind of get what's going awry. You're passing df[['m']], and the GroupedPredictor makes sure that m is not passed to the nested models. So you'd effectively be passing an empty dataframe to the DecayEstimator, and it makes sense that it's complaining. The DummyRegressor ignores all the input, while the DecayEstimator does require some input to be around. This seems similar to what I'm doing in the screenshot, or am I skipping over something?
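To make the failure mode concrete, here is a minimal sketch that does not use scikit-lego at all. It assumes the grouped wrapper simply drops the grouping column before fitting the inner model, and that the inner model validates its input with scikit-learn's check_X_y; the array names are made up for illustration.

```python
import numpy as np
from sklearn.utils import check_X_y

n_rows = 5
# A single feature column standing in for df[['m']] (illustrative data only).
X = np.arange(n_rows, dtype=float).reshape(-1, 1)
y = np.linspace(0.0, 1.0, n_rows)

# Assumption: the grouping column is removed before the inner model sees X.
X_without_group = np.delete(X, 0, axis=1)
shape_after_drop = X_without_group.shape  # rows are kept, no columns remain

# check_X_y requires at least one feature by default, so this raises.
try:
    check_X_y(X_without_group, y)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(shape_after_drop)
print(error_message)
```

The message produced here has the same "0 feature(s)" wording as the error reported in this issue, which is consistent with the explanation above.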
The documentation also shows the exact same examples.
I guess theoretically at least, we could choose to change the DecayEstimator to not run the check_X_y step anymore, or to make it optional. Internally, it doesn't need to use any of the X variables.
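As a rough sketch of what "optional" could look like: the class below is hypothetical and is not scikit-lego's DecayEstimator. The name OptionalCheckDecay, the validate_features flag, and the exact weighting formula are all assumptions made for illustration; it only shows that the sample weights can be computed from the number of rows, without looking at X.

```python
import numpy as np
from sklearn.base import clone
from sklearn.dummy import DummyRegressor
from sklearn.utils import check_X_y


class OptionalCheckDecay:
    """Hypothetical decay wrapper; NOT the scikit-lego implementation."""

    def __init__(self, model, decay=0.999, validate_features=True):
        self.model = model
        self.decay = decay
        self.validate_features = validate_features

    def fit(self, X, y):
        if self.validate_features:
            # Strict mode: rejects inputs with zero feature columns.
            X, y = check_X_y(X, y)
        y = np.asarray(y, dtype=float)
        n = y.shape[0]
        # Assumed weighting: the most recent row gets weight 1, older rows decay.
        weights = self.decay ** np.arange(n - 1, -1, -1)
        self.estimator_ = clone(self.model).fit(X, y, sample_weight=weights)
        return self

    def predict(self, X):
        return self.estimator_.predict(X)


X_empty = np.empty((5, 0))  # what remains once the only column is dropped
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

lenient = OptionalCheckDecay(DummyRegressor(), decay=0.9, validate_features=False)
predictions = lenient.fit(X_empty, y).predict(X_empty)

strict = OptionalCheckDecay(DummyRegressor(), decay=0.9, validate_features=True)
try:
    strict.fit(X_empty, y)
    strict_failed = False
except ValueError:
    strict_failed = True
```

With validation turned off, the DummyRegressor still receives the decayed sample weights and predicts the weighted mean of y; with validation on, the empty input is rejected as in the report.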
I think it might also make sense to be a bit more explicit in the docs to explain why index is passed along.
Overview
Running the below works
while this doesn't (expected to work)
While following Vincent's excellent sklearn crash course (see relevant screenshot below), I noticed that I wasn't able to use GroupedPredictor to fit with a single-feature dataframe df[['m']] if DecayEstimator was used in conjunction. Is this the expected behaviour? I find it odd that calling DecayEstimator(DummyRegressor(), decay=0.9).fit(df[['m']], df['yt']) works, but fails when wrapped with GroupedPredictor.
As far as I understand, in this example the DummyRegressor just calculates the mean of df['yt'] for the grouped period, so adding the index column here, or any other columns, doesn't seem to do anything (which looks to be the case when I tried adding a random column in place of index). I'm not 100% sure on this, so please correct me if I'm wrong!
Full error description and reproducible snippet below.
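The point about extra columns can be checked in isolation from scikit-lego. The following minimal sketch uses made-up data and only scikit-learn's DummyRegressor; it compares a fit on an index-like column with a fit on a random column.

```python
import numpy as np
from sklearn.dummy import DummyRegressor

rng = np.random.default_rng(0)
y = np.array([2.0, 4.0, 6.0, 8.0])

# Two unrelated feature columns: an index-like one and a random one.
X_index = np.arange(4).reshape(-1, 1)
X_random = rng.normal(size=(4, 1))

# The default strategy predicts the mean of y, regardless of the features.
pred_index = DummyRegressor().fit(X_index, y).predict(X_index)
pred_random = DummyRegressor().fit(X_random, y).predict(X_random)

print(pred_index)
print(pred_random)
```

Both fits predict the mean of y for every row, so the choice of extra column only matters for getting past input validation, not for the model output.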
Error description
The error:
ValueError: Found array with 0 feature(s) (shape=(1825, 0)) while a minimum of 1 is required by DecayEstimator.
Snippet producing error
Example snippet:
Additional columns do not seem to have any effect on the model output:
Further comments
From what I can see in the error message, it seems like the dimensions are being reduced so that the input becomes an empty dataframe. I feel like this could be a result of the m column being dropped somewhere when specifying groups=['m'] in the GroupedPredictor params.