aerdem4 / lofo-importance

Leave One Feature Out Importance
MIT License

TimeSeriesSplit with Lofo #41

Closed kurucan closed 2 years ago

kurucan commented 2 years ago

Is it possible to use sklearn's TimeSeriesSplit with LOFO?

karakastarik commented 2 years ago

It is possible.

from sklearn.model_selection import TimeSeriesSplit
from lofo import LOFOImportance, Dataset, plot_importance
%matplotlib inline  # Jupyter notebook magic; drop this line in a plain script

# TimeSeriesSplit assumes the rows of your_data are already in chronological order
kf = TimeSeriesSplit(n_splits=4)

dataset = Dataset(df=your_data, target="target", features=[col for col in your_data.columns if col != "target"])

# define the validation scheme and scorer. The default model is LightGBM
# "roc_auc" is only a placeholder; pass whichever sklearn scoring name fits your target
lofo_imp = LOFOImportance(dataset, cv=kf, scoring="roc_auc", n_jobs=-1)

# get the mean and standard deviation of the importances in pandas format
importance_df = lofo_imp.get_importance()

# plot the means and standard deviations of the importances
plot_importance(importance_df, figsize=(12, 20))

Just set the cross-validation strategy to the time-series splitter: cv=kf
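
To see why passing a time-series splitter matters, here is a rough pure-Python sketch of an expanding-window split of the kind TimeSeriesSplit performs: every training window ends before its test window begins, so no fold trains on future rows. The helper name expanding_window_splits is hypothetical and this is not sklearn's implementation, only an illustration under the assumption that rows are sorted by time.

```python
def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Hypothetical helper for illustration only; not sklearn's code.
    """
    # Each test fold gets an equal share of the most recent samples.
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("too few samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for test_start in range(first_test_start, n_samples, test_size):
        # Train on everything strictly before the test window.
        train = list(range(0, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


splits = list(expanding_window_splits(n_samples=10, n_splits=4))
for train, test in splits:
    print("train:", train, "test:", test)
```

With a shuffled KFold, by contrast, later rows could end up in the training fold of an earlier test window, which can make the importances of time-dependent features look better than they are.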