Closed: aiwalter closed this issue 1 month ago
Does anyone use pd.MultiIndex with forecasting? It's not relevant for classification/regression/clustering.
MultiIndex is used all over the place for forecasting and transformations with panel data. Such panel data is very common in industry. We also have a lot of it at my day job, and there I see severe performance issues with it.
@ltsaprounis shared this link in Slack: https://scikit-learn.org/stable/developers/performance.html#profiling-python-code
It could be a good starting point.
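As a rough starting point, here is a minimal profiling sketch using the standard-library cProfile module. The functions `make_panel` and `repeated_checks` are hypothetical stand-ins, not sktime code; they only imitate the pattern of running the same index checks many times on MultiIndex panel data.

```python
import cProfile
import io
import pstats

import numpy as np
import pandas as pd


def make_panel(n_instances=200, n_timepoints=50):
    """Build a toy panel DataFrame indexed by (instance, time). Hypothetical helper."""
    index = pd.MultiIndex.from_product(
        [range(n_instances), range(n_timepoints)], names=["instance", "time"]
    )
    return pd.DataFrame({"y": np.arange(len(index), dtype=float)}, index=index)


def repeated_checks(df, n_repeats=20):
    """Hypothetical stand-in for input checks that run many times on the same data."""
    for _ in range(n_repeats):
        # Typical index checks: sortedness and uniqueness of the MultiIndex.
        assert df.index.is_monotonic_increasing
        assert df.index.is_unique
        # Per-instance grouping, a common pattern with panel data.
        df.groupby(level="instance").size()


panel = make_panel()

profiler = cProfile.Profile()
profiler.enable()
repeated_checks(panel)
profiler.disable()

# Collect the 10 most expensive calls, sorted by cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

Replacing `repeated_checks(panel)` with a real `fit`/`predict` call on panel data should show which checks and MultiIndex operations dominate the runtime.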
Possibly we have to implement some config that stores the information that certain checks have already been done, so they don't need to be repeated X times.
We could have a look at the newly introduced config from scikit-learn: https://scikit-learn.org/dev/modules/generated/sklearn.set_config.html#sklearn.set_config
Todo: additionally, improve the check exception messages. Currently the checks are called in base classes, so the user cannot directly see in which estimator a check failed. This could probably be improved by passing the parent class name(s) to the check, or by tracking that automatically with some inspect magic.
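One possible shape for the first option, passing the estimator to the check so its concrete class name appears in the message, is sketched below. The names `ToyBase`, `ToyForecaster`, and `check_is_sequence` are hypothetical and only illustrate the pattern; they are not sktime classes or functions.

```python
def check_is_sequence(X, estimator=None):
    """Hypothetical check that names the calling estimator in its error message."""
    if not isinstance(X, (list, tuple)):
        # type(estimator) is the concrete subclass, even when called from a base class.
        prefix = f"{type(estimator).__name__}: " if estimator is not None else ""
        raise TypeError(
            f"{prefix}X must be a list or tuple, found {type(X).__name__}"
        )
    return X


class ToyBase:
    """Hypothetical base class that runs the check on behalf of all subclasses."""

    def fit(self, X):
        check_is_sequence(X, estimator=self)
        return self


class ToyForecaster(ToyBase):
    """Hypothetical concrete estimator; inherits fit and the check."""


try:
    ToyForecaster().fit(42)
except TypeError as err:
    message = str(err)
    print(message)
```

Because the check receives the instance rather than a hard-coded name, the base class does not need to know its subclasses.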
Is your feature request related to a problem? Please describe.
There seems to be a significant performance problem with some data checks and pd.MultiIndex operations.
Describe the solution you'd like
Related: https://github.com/sktime/sktime/issues/4139