sktime / sktime

A unified framework for machine learning with time series
https://www.sktime.net
BSD 3-Clause "New" or "Revised" License
7.58k stars 1.29k forks source link

[ENH] resolve duplications between `PolynomialTrendForecaster` and `TrendForecaster` #4134

Open fkiraly opened 1 year ago

fkiraly commented 1 year ago

Currently, PolynomialTrendForecaster and TrendForecaster are duplicative, as the latter is a special case of the former via copy-paste.

These duplications should be removed (once PR https://github.com/sktime/sktime/pull/4133 is merged).

A straightforward way might be using _DelegatedForecaster and have TrendForecaster as a special case wrap the PolynomialTrendForecaster, although there are other ways to do this as well.

phershbe commented 1 year ago

@fkiraly I'd like to work on this. I made a pull request here before, deleting two things, and I'm pretty inexperienced in general, but I've been doing more pull requests on various projects and taking on something a little bit bigger each time and it's been going well.

phershbe commented 1 year ago

I've spent a few days going through code and tutorial videos in order to understand everything better. I'm still trying to understand everything but especially _DelegatedForecaster, but in general is the idea that we will have TrendForecaster(_DelegatedForecaster) and then PolynomialTrendForecaster(TrendForecaster)?

fkiraly commented 1 year ago

@phershbe, thanks for looking at this! Let us know if you need any help, and feel free to attend the community collab session for a chat (Fridays at 4pm UTC).

Regarding the general idea:

phershbe commented 1 year ago

@fkiraly Thank you for letting me know about the session on Friday, I would have liked to have come but didn't check the comment on here until later in the day.

I'd prefer the TrendForecaster_DelegatedForecaster since that's what you suggested initially. I'm still trying to understand things better. I played around with some forecasters in Jupyter Notebook and things are starting to make more sense.

I'll create a draft pull request later today ... it might be a mess, but I want to try, and you can let me know what you think if you want.

fkiraly commented 1 year ago

I'll create a draft pull request later today ... it might be a mess, but I want to try, and you can let me know what you think if you want.

No worries, that's how we all got started :-)

phershbe commented 1 year ago

@fkiraly No draft pull request yet, I have a few questions if you don't mind...

You said above TrendForecaster_DelegatedForecaster, did you mean TrendForecaster(_DelegatedForecaster) or TrendForecaster_DelegatedForecaster(BaseForecaster) or actually TrendForecaster_DelegatedForecaster with no inheritance?

Related to the previous question, but are you talking about the _DelegatedForecaster in sktime/forecasting/base/_delegate.py? Would we be dealing with the wrapper described in the docstring...

Wrapped estimator is value of attribute with name self._delegate_name.
By default, this is "estimator_", i.e., delegates to self.estimator_
To override delegation, override _delegate_name attribute in child class.

...or something totally different?

I'm kind of in over my head here but I'm trying because I find time series analysis and forecasting really interesting and I've gotten pretty decent at algorithms because I like these things that deal with mathematics.

fkiraly commented 1 year ago

You said above TrendForecaster_DelegatedForecaster, did you mean TrendForecaster(_DelegatedForecaster) or TrendForecaster_DelegatedForecaster(BaseForecaster) or actually TrendForecaster_DelegatedForecaster with no inheritance?

Apologies, this was a typo. It should have been TrendForecaster(_DelegatedForecaster), and I fixed it in my post to avoid future confusion.

fkiraly commented 1 year ago

Related to the previous question, but are you talking about the _DelegatedForecaster in sktime/forecasting/base/_delegate.py?

Yes - a good example of how an estimator with fixed parameters is presented as another, via the delegation mechanism, would be Catch22Classifier, which internally wraps a pipeline (Catch22 transformer and a classifier), but exposes it as an new class, Catch22Classifier.

phershbe commented 1 year ago

@fkiraly Great, thank you, the specific class example is very helpful. I'm going to come to the session later today too.

ksharma6 commented 4 months ago

Mind if I tackle this @fkiraly ?

fkiraly commented 4 months ago

yes, of course! This hasn't been carried out in the end, so it is still available.

If I read the current state correctly, the next step is to turn PolynomialTrendForecaster into a delegate of TrendForecaster with the pipeline, used internally.

ksharma6 commented 4 months ago

Hi @fkiraly I wanted to confirm my understanding of the issue at hand: PolynomialTrendForecaster and TrendForecaster have a lot of overlap since TrendForecaster is a unique case of PolynomialTrendForecaster

A potential solution to this might be:

Does this sound like an accurate summary Franz?

fkiraly commented 4 months ago

Hi @fkiraly I wanted to confirm my understanding of the issue at hand:

Sure!

PolynomialTrendForecaster and TrendForecaster have a lot of overlap

yes

since TrendForecaster is a unique case of PolynomialTrendForecaster

That is true, but it is also true the other way round - PolynomialTrendForecaster is a special case of TrendForecaster, where you substitute the pipeline as regressor.

It is a bit subtle since:

Therefore, I would agree to your potential solution number 1, but the other way round. PolymonialTrendForecaster should inherit the _DelegatedForecaster, and internally delegate to the - code-wise - specialization of TrendForecaster.