unit8co / darts

A Python library for user-friendly forecasting and anomaly detection on time series.
https://unit8co.github.io/darts/
Apache License 2.0

Choosing a model with past and future covariates #571

Closed: jojubart closed this issue 2 years ago

jojubart commented 2 years ago

I am deciding on a model for an agriculture dataset that measures crop growth. The dataset contains past data, such as the amounts of fertilizer previously used and past weather data. For some of these covariates I also have information about the future, such as next week's weather forecast and the planned fertilizer amounts; for others there is only past information. Additionally, the data is sampled per field in different locations. Since the fields show the same general patterns, I'd like to feed the model all of the data at hand.

After reading through the documentation, I think my options are quite limited: I have to use a Mixed Model, so either the TFTModel or a RegressionModel.

Since categorical metadata does not seem to be implemented at the moment, I handle the different fields by adding one-hot-encoded attributes to the dataset that signal which field the data was sampled from, and then feed the model all of that data. Does that approach make sense for the different fields, or is there another go-to way to do this?
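
For readers following along, the one-hot encoding described above could look roughly like the sketch below. It uses plain Python lists rather than any darts API, and the function name `field_indicator_columns`, the field identifiers, and the series length are all hypothetical, chosen only for illustration.

```python
def field_indicator_columns(field, all_fields, n_steps):
    """Build constant one-hot columns marking which field a series comes from.

    Returns a dict mapping column name -> list of 0/1 values, n_steps long.
    """
    if field not in all_fields:
        raise ValueError(f"unknown field: {field!r}")
    return {
        f"is_{name}": [1 if name == field else 0] * n_steps
        for name in all_fields
    }


# Hypothetical field identifiers and series length, for illustration only.
all_fields = ["field_a", "field_b", "field_c"]
columns = field_indicator_columns("field_b", all_fields, n_steps=4)
```

Each field's series gets the same set of indicator columns, with exactly one of them set to 1 at every time step. Because these values are constant and known in advance, they can be appended to whichever covariate set the chosen model accepts.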

I am also quite interested in trying out some of the other models, such as NBEATS, but the future covariates in my dataset seem to be crucial for making accurate predictions.
Is there a commonly used workaround here, such as using the lagged future covariates as past covariates?

I already want to thank you guys for creating this excellent library! It's been a pleasure to use so far and I'm looking forward to working with it :-)

hrzn commented 2 years ago

Hi @myblackbeard, your understanding is correct: if you want to use both past and future covariates this way, you will indeed have to use what we call a "Mixed" model, so either TFT or a RegressionModel. As you mention, the only other workaround I see is shifting your future covariate values into the past in order to use them as past covariates. This won't always be a good idea, though, because it puts pressure on the model to "store" this past information for a longer time. If I were you, I would simply try the Mixed models first and see. The way you are encoding static covariates looks good to me. We will add better support for those at some point :)
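
As a rough illustration of the shifting workaround mentioned above, the sketch below realigns a future covariate so that the value for step `t + horizon` sits at step `t`, where a past-covariates-only model could read it. This is plain Python and not a darts function; the name `shift_future_into_past`, the `horizon` parameter, and the sample values are assumptions made for the example.

```python
def shift_future_into_past(future_values, horizon, target_length):
    """Align the covariate value for step t + horizon with step t.

    Steps whose shifted source lies beyond the available data get None.
    """
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    shifted = []
    for t in range(target_length):
        source = t + horizon
        shifted.append(future_values[source] if source < len(future_values) else None)
    return shifted


# Hypothetical rain forecast covering 4 observed steps plus a 2-step horizon.
rain_forecast = [0.0, 1.5, 3.0, 0.5, 2.0, 4.0]
as_past_covariate = shift_future_into_past(rain_forecast, horizon=2, target_length=4)
```

Here each step of the shifted series carries the forecast for two steps ahead. A single shifted column exposes only one lead time, so covering the whole forecast horizon would need one column per lead, which is part of why the Mixed models are the simpler first thing to try.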

jojubart commented 2 years ago

Great, thanks a lot for the quick and detailed answer! :)