Thanks @koaning for spotting this, will take a look (cc @FBruzzesi)
Now that I think about it again, it might not be related to narwhals, but it seems plausible that this broke as part of the recent effort. Am also having a look now.
Actually ... this may be a scikit-learn thing. After cloning, it seems that we lose the `decay_kwargs`:
```python
from sklearn.base import clone
from sklearn.dummy import DummyRegressor
from sklego.meta import DecayEstimator

est = DecayEstimator(DummyRegressor(), decay_func="exponential", decay_rate=0.2)
c = clone(est)

# lhs has the kwargs, rhs does not
est.decay_kwargs, c.decay_kwargs
```
Ah yeah, this seems like a sklearn thing. It seems that `get_params` does not return the parameters collected via `**kwargs`, while the cloning methods inside of sklearn assume that `get_params` reports everything. I wonder if there is a reason why they omit those. Might be conscious ... @MarcoGorelli for now though, feel free to relieve yourself of any worry. This is not related to narwhals. Will rename the issue.
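To make this concrete, here is a minimal sketch (with a toy `Demo` class, not scikit-lego code) showing how `BaseEstimator.get_params` silently skips anything gathered through `**kwargs`, which is why `clone` cannot restore it:

```python
from sklearn.base import BaseEstimator, clone

class Demo(BaseEstimator):
    def __init__(self, a=1, **kwargs):
        self.a = a
        self.kwargs = kwargs

est = Demo(a=1, b=2)
# get_params introspects the __init__ signature and ignores VAR_KEYWORD
# parameters, so 'b' never shows up ...
print(est.get_params())   # {'a': 1}
# ... and clone, which rebuilds the estimator from get_params(), loses it:
print(clone(est).kwargs)  # {}
```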
Sorry it took me a bit to look into this.
I see a few possibilities for this at the moment:
1. Change the `__init__` signature to take a `decay_kwargs` dictionary instead of `**decay_kwargs`:

    ```diff
    - def __init__(self, model, decay_func="exponential", check_input=False, **decay_kwargs):
    + def __init__(self, model, decay_func="exponential", check_input=False, decay_kwargs=None):
          self.model = model
          self.decay_func = decay_func
          self.check_input = check_input
          self.decay_kwargs = decay_kwargs
    ```

    and then pass a dictionary instead of kwargs. This is a breaking change in our API.

2. Implement `estimator.__sklearn_clone__()`, so we can control the cloning ourselves (a sketch follows below) - however, this method is only available since scikit-learn 1.3.

3. Use `deepcopy` from `copy` instead of sklearn `clone` in a few meta estimators, but I am not fully aware of the consequences.

Option 1 is the safest in terms of implementation. For option 2, scikit-learn 1.3 may be too recent, and option 3 requires more investigation.
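For reference, a minimal sketch of what option 2 could look like, assuming scikit-learn >= 1.3 (where `clone` dispatches to `__sklearn_clone__` if an estimator defines it); the class below is a simplified stand-in for the real `DecayEstimator`, not a committed design:

```python
from sklearn.base import BaseEstimator, clone

class DecayEstimator(BaseEstimator):
    def __init__(self, model, decay_func="exponential", check_input=False, **decay_kwargs):
        self.model = model
        self.decay_func = decay_func
        self.check_input = check_input
        self.decay_kwargs = decay_kwargs

    def __sklearn_clone__(self):
        # Rebuild the estimator by hand so the **decay_kwargs survive;
        # the default clone path goes through get_params() and drops them.
        return type(self)(
            clone(self.model),
            decay_func=self.decay_func,
            check_input=self.check_input,
            **self.decay_kwargs,
        )
```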
Thoughts?
> Sorry it took me a bit to look into this.

No worries! As always, there is no rush.

> I see a few possibilities for this at the moment:

Ahh nice. I was originally thinking about option 2 myself, but after considering it I think that option 1 is preferable. This is also how `class_weights` are attached, and it feels like the best practice to follow. Not to mention it is the simplest implementation.
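Under option 1, the repro from above would become something like this (hypothetical post-change signature, where `decay_kwargs` is an explicit dictionary parameter):

```python
from sklearn.base import clone
from sklearn.dummy import DummyRegressor
from sklego.meta import DecayEstimator  # assuming the proposed dict-based signature

est = DecayEstimator(
    DummyRegressor(),
    decay_func="exponential",
    decay_kwargs={"decay_rate": 0.2},
)
c = clone(est)
# decay_kwargs is now an explicit __init__ parameter, so get_params()
# reports it and clone() carries it over:
est.decay_kwargs, c.decay_kwargs  # both {'decay_rate': 0.2}
```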
I spotted that an image in our docs no longer gives the right demo. It is the one for the grouped estimator that demos the decay estimator: before, the decayed line would be different from the normally grouped one. It seems that when we copy the decay estimators internally we somehow drop `self.decay_kwargs`.