Closed: fingoldo closed this 1 month ago
Hi @fingoldo,
We have a demo example that shows how to use it correctly.
Alex
Thanks a lot, Alexander! Passing it as the cv_iter parameter to fit_predict did the trick. But the reader object also has a cv parameter. How do the cv of the reader and the cv_iter of TabularAutoML itself interact? It's not clear from the docs. Maybe the documentation of TimeSeriesIterator could be improved by at least referencing the demo example script? It's really hard to find.
@fingoldo to figure out what cv and any other parameter means during TabularAutoML preset creation, please check our fully commented YAML config, which tells LightAutoML what to do.
To be clear, cv param means the number of folds for cross-validation.
And yes, you are right that our documentation is not 100% clear; we are working on it.
Alex
Mm, I'm still confused.
To be clear, cv param means the number of folds for cross-validation.
But what happens when I specify cv=3 for the reader and also pass cv_iter=TimeSeriesIterator(datetime_col=df.loc[train_idx,"date"],n_splits=5) to fit_predict?
Which of the splitters will be used: KFold with 3 splits or TimeSeries with 5 splits?
@fingoldo cv_iter (if explicitly specified) should override the cv param.
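The precedence described in this reply can be pictured with a small plain-Python sketch. It is not LightAutoML code: `kfold_splits` and `resolve_folds` are hypothetical names made up for illustration, and the only behavior taken from the thread is that an explicitly passed cv_iter wins over the fold count given by cv.

```python
from typing import Iterable, List, Optional, Tuple

Fold = Tuple[List[int], List[int]]  # (train indices, validation indices)


def kfold_splits(n_rows: int, n_folds: int) -> List[Fold]:
    """Unshuffled K-fold: each contiguous block of rows is the validation set once."""
    bounds = [n_rows * k // n_folds for k in range(n_folds + 1)]
    folds = []
    for k in range(n_folds):
        valid = list(range(bounds[k], bounds[k + 1]))
        train = [i for i in range(n_rows) if i < bounds[k] or i >= bounds[k + 1]]
        folds.append((train, valid))
    return folds


def resolve_folds(n_rows: int, cv: int = 5,
                  cv_iter: Optional[Iterable[Fold]] = None) -> List[Fold]:
    """Hypothetical helper: use cv_iter when it is given, otherwise fall back to cv folds."""
    if cv_iter is not None:
        return list(cv_iter)          # explicit iterator takes priority, cv is ignored
    return kfold_splits(n_rows, cv)   # default path driven by the fold count


# A custom, time-ordered iterator with two folds.
custom_folds = [([0, 1], [2]), ([0, 1, 2], [3])]

default_result = resolve_folds(6, cv=3)                         # 3 K-fold splits
override_result = resolve_folds(4, cv=3, cv_iter=custom_folds)  # the 2 custom splits
```

Under this assumption, specifying cv=3 together with a 5-split TimeSeriesIterator would mean the time-series splits are the ones used.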
Question
My dataset is ordered by time, and the usual KFold cross-validation results in poor test performance. How do I use time-series-based cross-validation? I noticed there is a TimeSeriesIterator in LAMA, but I could not find an example of using it anywhere.
I tried
but get
What's the correct way of using TimeSeriesIterator?
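For readers unfamiliar with time-ordered validation, the idea behind the question can be sketched in plain Python. This is a generic expanding-window scheme, not the LightAutoML implementation: the function name `expanding_window_splits` and the sample dates are assumptions made for illustration. Rows are sorted by timestamp and cut into n_splits chunks, and each fold trains on all earlier chunks and validates on the next one, so validation rows are never older than training rows.

```python
from datetime import date
from typing import List, Sequence, Tuple


def expanding_window_splits(timestamps: Sequence, n_splits: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train_idx, valid_idx) pairs where validation always follows training in time."""
    if n_splits < 2:
        raise ValueError("n_splits must be at least 2")
    if len(timestamps) < n_splits:
        raise ValueError("need at least one row per split")

    # Row positions ordered from oldest to newest timestamp.
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    n = len(order)
    bounds = [n * k // n_splits for k in range(n_splits + 1)]
    chunks = [order[bounds[k]:bounds[k + 1]] for k in range(n_splits)]

    folds = []
    for k in range(1, n_splits):
        train = [i for chunk in chunks[:k] for i in chunk]  # everything older than chunk k
        folds.append((train, chunks[k]))                    # validate on chunk k
    return folds


# Ten rows stored in arbitrary order, one per day (made-up sample data).
dates = [date(2021, 1, d) for d in (5, 1, 9, 3, 7, 2, 8, 4, 6, 10)]
folds = expanding_window_splits(dates, n_splits=5)
```

With n_splits chunks this scheme yields n_splits - 1 folds, because the oldest chunk can only ever serve as training data.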