Closed: yihaocs closed this pull request 1 year ago.
Thanks for the contribution! Before we can merge this, we need @yihaocs to sign the Salesforce.com Contributor License Agreement.
Thanks for the great contributions Yihao! I've left a number of comments inline. I haven't looked in detail at the actual model/layer implementations. Maybe @chenghaoliu89 would be a better person for that part.
One issue that I did want to discuss independently is that of target_seq_index. It seems quite clear to me that we do want to support multivariate outputs. So instead of overriding train_pre_process() in DeepForecasterBase, can you do the following?
- Add a property supports_multivariate_output to ForecasterBase, which is False by default. supports_multivariate_output is overridden to True for deep forecasters, and I can go back and set it to True for other relevant models as well.
- Update the base class train_pre_process to accommodate multivariate outputs.
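To make the proposal concrete, here is a minimal sketch of the property-based pattern described above. The class and attribute names follow the discussion, but the bodies are assumptions for illustration, not Merlion's actual implementation.

```python
class ForecasterBase:
    """Hypothetical sketch of the proposed base-class API."""

    @property
    def supports_multivariate_output(self) -> bool:
        # False by default: most classical models forecast a single target.
        return False

    def train_pre_process(self, train_data):
        if not self.supports_multivariate_output:
            # Here the base class would reduce train_data to the single
            # target column indicated by target_seq_index (details omitted).
            ...
        return train_data


class DeepForecasterBase(ForecasterBase):
    @property
    def supports_multivariate_output(self) -> bool:
        # Deep forecasters can predict all variables jointly.
        return True
```

With this design, no subclass needs to override train_pre_process itself; each model only declares whether it can produce multivariate outputs.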
Hi @aadyotb, @gorold will help us review this PR and further integrate DeepTIME into Merlion. Thanks Gerald and yihao!
Discussing some of the points @gorold brought up:
Model inputs are very specific to In/Autoformer. In particular, the following code snippet should follow a more general formulation if we want to include more models which do not require decoder inputs. One way to overcome this issue is to defer this portion of the code to the model. Each model simply takes as input the lookback window and covariates, and can construct these decoder inputs in its own forward call. (We also do not really need to distinguish between past and future time-varying covariates; the model can handle this.)
I agree that the decoder inputs should be constructed by the individual models in their own forward() calls. The use of a zero-vector for inference doesn't seem like a universal behavior when future isn't provided. That said, I think that distinguishing between past/future is useful for making the code more modular & understandable at the level of the base class. As an aside, @yihaocs note that start_token = past[:, -config.start_token_len :] won't work as desired when start_token_len is 0. For the correct behavior, you'd need to (1) check that past is long enough, and (2) do past[:, len(past) - config.start_token_len :].
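A minimal repro of the slicing pitfall, using NumPy for illustration (Merlion's actual tensors may differ; for a batch-first array the time length is shape[1], so that is used here as the end anchor):

```python
import numpy as np

past = np.arange(12).reshape(2, 6)  # (batch, time) toy lookback window

start_token_len = 0
# Pitfall: -0 == 0, so the negative slice returns the FULL window.
wrong = past[:, -start_token_len:]                 # same as past[:, 0:]
# Explicit end-anchored slice gives the intended empty start token.
right = past[:, past.shape[1] - start_token_len:]

print(wrong.shape)  # (2, 6): the entire lookback leaks into the start token
print(right.shape)  # (2, 0): empty start token, as intended

# For nonzero lengths the two forms agree.
start_token_len = 2
assert np.array_equal(past[:, past.shape[1] - start_token_len:], past[:, -2:])
```

The bug only surfaces when start_token_len is 0, which is exactly why it is easy to miss in tests that always use a positive start token length.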
The current RollingWindowDataset does not support global models (i.e. learning from multiple time-series), which may be something we want to support. This would involve taking as input a List of TimeSeries, and each sample would also (optionally) yield an index/categorical indicating which time-series the window belongs to.
The plan is to add this functionality in a subsequent PR.
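For reference, a hedged sketch of what the multi-series ("global") rolling-window sampling could look like: each sample additionally yields the index of its source series. The function name, signature, and shapes are assumptions for this sketch, not the planned RollingWindowDataset API.

```python
from typing import Iterator, List, Tuple
import numpy as np

def rolling_windows(
    series_list: List[np.ndarray], lookback: int, horizon: int
) -> Iterator[Tuple[int, np.ndarray, np.ndarray]]:
    """Yield (series_index, past_window, future_window) over all series."""
    for idx, series in enumerate(series_list):
        for t in range(len(series) - lookback - horizon + 1):
            past = series[t : t + lookback]
            future = series[t + lookback : t + lookback + horizon]
            # The series index lets a global model condition on (or embed)
            # which time series the window came from.
            yield idx, past, future

samples = list(rolling_windows([np.arange(10), np.arange(5)], lookback=3, horizon=1))
```

A series of length 10 yields 7 windows and a series of length 5 yields 2, so the toy call above produces 9 samples in total.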
Thanks for the great suggestion. I've changed the API of the model's forward call: now we simply pass (past, past_timestamp, future, future_timestamp) to the forward function. Let me know if you have any further comments about this part @gorold @aadyotb
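A minimal sketch of that interface, with the decoder input constructed inside the model as @gorold suggested. NumPy stands in for the framework tensors, and all names (start_token_len, horizon) are assumptions for illustration:

```python
import numpy as np

def forward(past, past_timestamp, future=None, future_timestamp=None,
            start_token_len=2, horizon=4):
    """Sketch of a model forward call taking only the four standard inputs."""
    batch, seq_len, dim = past.shape
    # The model builds its own decoder input: a start token taken from the
    # end of the lookback window, followed by zero placeholders for the
    # horizon (end-anchored slice, so start_token_len == 0 works too).
    start_token = past[:, seq_len - start_token_len :]
    placeholder = np.zeros((batch, horizon, dim))
    dec_in = np.concatenate([start_token, placeholder], axis=1)
    # A real model would encode `past` and decode `dec_in`; we just return
    # the constructed decoder input to show its shape.
    return dec_in

out = forward(np.ones((2, 8, 3)), None)  # shape (2, start_token_len + horizon, 3)
```

Models that need no decoder input (or a different placeholder than zeros) simply ignore this construction in their own forward call, which is the generality the review asked for.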
Thanks @aadyotb for letting me know about the bug. I have fixed it.
A major shortcoming of Merlion's forecasting module is a lack of support for deep models that have been popular in the research literature. This PR addresses this gap by adding basic primitives which are shared by most deep learning models, and implementing the recently proposed DeepAR, Informer, Autoformer, and ETSFormer models.