unit8co / darts

A python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0

feature_importances_ for LightGBMModel ? #2280

Closed Allena101 closed 6 months ago

Allena101 commented 6 months ago

I might have misunderstood something, but has the feature_importance functionality been removed in the darts implementation? If that is the case, what is the best way to achieve the same thing?

madtoinou commented 6 months ago

Hi @Allena101,

Darts does not remove any functionality from the wrapped libraries. You can access the underlying model through the model attribute and use all the methods listed in the original library's documentation. There is a small section about this in the example notebook. Be careful: the model is sometimes nested in a MultiOutputRegressor if there are several components.

import darts.utils.timeseries_generation as tg
from darts.models import LightGBMModel

# Univariate toy series
ts = tg.linear_timeseries(length=100)

# With output_chunk_length=1 and a single target component,
# m.model is the fitted LightGBM regressor itself
m = LightGBMModel(lags=3, output_chunk_length=1, verbose=-1)
m.fit(ts)
m.model.feature_importances_

Allena101 commented 6 months ago

m.model.feature_importances_

thank you 🙏

Allena101 commented 6 months ago

feature_importances_

So do I understand correctly that this only works with output_chunk_length=1?

madtoinou commented 6 months ago

You can also do it when output_chunk_length > 1, but you'll need to use the method get_multioutput_estimator(horizon, target_dim) to obtain the underlying models, as they will be wrapped in the MultiOutputRegressor class.

Allena101 commented 5 months ago

Thanks! I got it working with get_multioutput_estimator! I have read the docs for it, but they do not help me understand target_dim: "target_dim (int) – The index of the target component." Is this only relevant if you have multivariate models (predicting more than one feature)?

I got this to work, but I have no idea whether it's correct. My output_chunk_length is 3, and I have covariates but only one target feature:

feature_importance = LGBM_Model.get_multioutput_estimator(2, 0)

Also, I cannot find which type of feature importance I get: importance_type="split" or importance_type="gain". Can I get both using darts?

Furthermore, on explainability: I have read and tried your TFTModel guide and it works well so far! However, I can't get any other explainability model working, and I can't find whether you have a guide for them. Right now I can't even figure out how to import one.

I tried a bunch of ways, but I just get errors:

from darts.explainability import SomeComponentBasedExplainer

explainer = SomeComponentBasedExplainer(model_nbeats)
explain_results = explainer.explain()
output = explain_results.get_explanation(component="some_component")

madtoinou commented 5 months ago

horizon is the (zero-based) step within the output_chunk_length. The target_dim is only relevant for multivariate target series. Your code snippet looks good; it will correspond to the last step of output_chunk_length.

The feature importance in this context is provided by LightGBM, its type will depend on the parameters provided to the model at creation. So yes, you can probably get both.

For explainability, I would recommend checking these pages of the documentation (here and there) and the regression example notebook. Note that TFTExplainer is specific to TFTModel; it cannot be used for NBEATS.

A notebook dedicated to explainability is on the way but not finished yet. In the meantime, you could check some examples from the shap library, which Darts uses for this functionality.