Allow for defining monotonic constraints in gradient boosted trees models

rijkvandermeulen commented 1 year ago

The gradient boosted trees models in Darts (XGBoost, CatBoost and LightGBM) all allow you to enforce a monotonic relationship between a feature/covariate and the target. This can be quite useful in cases where we already know of the existence of such a relationship and want to explicitly enforce it in our model. Useful to add to the Darts API as well for these models?

E.g., https://xgboost.readthedocs.io/en/stable/tutorials/monotonic.html

ymatzkevich commented 2 weeks ago

Hello @rijkvandermeulen, after discussing your suggestion, we concluded that it could add value to the library, but we are running into some challenges with its implementation.

For time series, Darts translates the forecasting problem into a regression problem by defining lagged features (cf. https://unit8co.github.io/darts/examples/20-RegressionModel-examples.html#Target-and-covariates-lags). Essentially, each lag defines a feature that is then passed to the regressor.

Enforcing monotonicity between the features and the target would therefore require defining a constraint for each lag (i.e., enforcing a monotonic relationship between that specific lag and the target). This is not always well-defined and quickly becomes laborious for a large number of lags, which could compromise the user-friendliness of Darts.
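To make the "one constraint per lag" point concrete, here is a minimal sketch in plain Python. The column naming scheme and the helper `lagged_feature_names` are hypothetical and only for illustration; they are not how Darts names its features internally. The sketch only relies on the fact that XGBoost's `monotone_constraints` parameter takes one value per feature (1, -1 or 0), as described in the tutorial linked above.

```python
from typing import Dict, List


def lagged_feature_names(lags: Dict[str, List[int]]) -> List[str]:
    """Return one (hypothetical) column name per (series, lag) pair."""
    return [
        f"{series}_lag{lag}"
        for series, lag_list in lags.items()
        for lag in lag_list
    ]


# Assumed setup: three target lags and two lags of one past covariate.
lags = {"target": [-3, -2, -1], "past_cov_x": [-2, -1]}
columns = lagged_feature_names(lags)

# The regressor sees five features, so a constraint vector needs five
# entries: 1 (increasing), -1 (decreasing) or 0 (unconstrained).
no_constraints = tuple(0 for _ in columns)

print(columns)
print(no_constraints)
```

With dozens of lags across several covariates, this vector grows accordingly, which is where the usability concern comes from.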

We are still curious to know more about what you had in mind. Could you please provide more examples or elaborate on your suggestion to help us better understand?

rijkvandermeulen commented 2 weeks ago

Hey @ymatzkevich, nice to hear that you are considering adding this feature. I definitely see your point that it is not entirely straightforward to add this in a way that gives us the functionality we need without compromising the user-friendliness of the API.

In fact, I think the "issue" might be even more complicated than you described. You mentioned it would require defining constraints between each lag and the target, but perhaps (to allow full flexibility) you would have to account for the different "chunks" in your output_chunk_length as well?

Some ideas:

Option 1: Ignoring lags and just assuming that a particular covariate has a monotonic relationship with (all) target(s). I.e., you would assume that the monotonic relationship applies for all lags of your covariate series (and all targets):

Advantage: simple/clean implementation
Disadvantage: I think this assumption often does not hold in practice, so it's questionable whether the feature really adds value when implemented this way
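A rough sketch of what Option 1 could look like, in plain Python. The function name `expand_option1` and the input shapes are assumptions made up for this example and are not part of the Darts API. The user gives one direction per covariate, and it is repeated for every lag of that covariate; series that are not mentioned stay unconstrained.

```python
from typing import Dict, List, Tuple


def expand_option1(
    lags: Dict[str, List[int]],
    directions: Dict[str, int],
) -> Tuple[int, ...]:
    """Repeat each series' direction (1, -1 or 0) once per lag.

    Series missing from `directions` get 0, i.e. no constraint.
    """
    return tuple(
        directions.get(series, 0)
        for series, lag_list in lags.items()
        for _ in lag_list
    )


# Assumed setup: two lags each for the target and two covariates.
lags = {"target": [-2, -1], "past_cov_x": [-2, -1], "fut_cov_z": [0, 1]}
directions = {"past_cov_x": 1, "fut_cov_z": -1}

constraints = expand_option1(lags, directions)
print(constraints)  # (0, 0, 1, 1, -1, -1)
```

The resulting tuple has one entry per lagged feature and would be applied identically to every forecast step, which is exactly the simplifying assumption of this option.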

Option 2: Specifying the monotonic relationships in detail. You could, for instance, let the user pass a dict with the following structure:

top level: your various covariates
2nd level: the chunks in your output chunk length
3rd level: the lags of a particular covariate
4th level: monotonic constraint

BTW: for covariates - chunks - lags that are not specified in the dict, we just assume NO monotonic relation

Example:

{
    "past_cov_x": {
        1: {1: "monotonic_increasing", 2: "monotonic_decreasing"},
        2: {2: "monotonic_increasing"},
    },
    "fut_cov_z": {
        1: {2: "monotonic_increasing"},
    },
}

Advantage: this gives you full control over all relationships
Disadvantage: quite verbose and perhaps difficult to understand for the average user. However, I don't see this functionality being used by the average user anyway. I think it is intended more for advanced users and specific use cases. It would be disabled by default, so user-friendliness would not be affected.
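As a sketch of how Option 2 could be resolved internally, the nested dict can be turned into one constraint vector per output chunk. Everything here is hypothetical: the helper `constraints_per_step`, the `SIGN` mapping, and the assumption that each output step gets its own regressor (and therefore its own vector). Lag keys follow the convention of the example above.

```python
from typing import Dict, List, Tuple

SIGN = {"monotonic_increasing": 1, "monotonic_decreasing": -1}

# covariate -> output chunk -> lag -> constraint name
Spec = Dict[str, Dict[int, Dict[int, str]]]


def constraints_per_step(
    spec: Spec,
    lags: Dict[str, List[int]],
    output_chunk_length: int,
) -> Dict[int, Tuple[int, ...]]:
    """Build one constraint vector per output step (1-based).

    Any covariate/chunk/lag combination missing from `spec` maps to 0.
    """
    result = {}
    for step in range(1, output_chunk_length + 1):
        vector = []
        for series, lag_list in lags.items():
            step_spec = spec.get(series, {}).get(step, {})
            for lag in lag_list:
                vector.append(SIGN[step_spec[lag]] if lag in step_spec else 0)
        result[step] = tuple(vector)
    return result


spec = {
    "past_cov_x": {
        1: {1: "monotonic_increasing", 2: "monotonic_decreasing"},
        2: {2: "monotonic_increasing"},
    },
    "fut_cov_z": {1: {2: "monotonic_increasing"}},
}
lags = {"past_cov_x": [1, 2], "fut_cov_z": [1, 2]}

per_step = constraints_per_step(spec, lags, output_chunk_length=2)
print(per_step)  # {1: (1, -1, 0, 1), 2: (0, 1, 0, 0)}
```

An unknown constraint name raises a `KeyError` in this sketch; a real implementation would probably want a clearer validation error.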

PS: it might be useful to consider if a dict is the best data structure to use (we can also define some custom data classes for this purpose)
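For comparison, a custom data class could look roughly like the sketch below. The names `Direction` and `MonotonicConstraint` are invented for this example and do not exist in Darts. A flat list of such objects carries the same information as the nested dict, with validation in one place.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Direction(IntEnum):
    DECREASING = -1
    NONE = 0
    INCREASING = 1


@dataclass(frozen=True)
class MonotonicConstraint:
    covariate: str
    lag: int
    direction: Direction
    # None means the constraint applies to every output step.
    output_step: Optional[int] = None

    def __post_init__(self) -> None:
        if self.output_step is not None and self.output_step < 1:
            raise ValueError("output_step must be >= 1")


constraint_list = [
    MonotonicConstraint("past_cov_x", lag=1, direction=Direction.INCREASING, output_step=1),
    MonotonicConstraint("past_cov_x", lag=2, direction=Direction.DECREASING, output_step=1),
    MonotonicConstraint("fut_cov_z", lag=2, direction=Direction.INCREASING),
]

# IntEnum members convert directly to the 1 / -1 / 0 values the regressors expect.
print([int(c.direction) for c in constraint_list])  # [1, -1, 1]
```

The trade-off is an extra import for the user in exchange for type checking and early error messages.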

WDYT?

rijkvandermeulen commented 2 days ago

@ymatzkevich any thoughts?

unit8co / darts

Allow for defining monotonic constraints in gradient boosted trees models #1897