jdb78 / pytorch-forecasting

Time series forecasting with PyTorch
https://pytorch-forecasting.readthedocs.io/
MIT License
3.84k stars, 608 forks

Can the TFT model handle missing time data? #1244

Open lk1983823 opened 1 year ago

lk1983823 commented 1 year ago

Can the TFT model deal with missing time data? My raw data is missing some timestamps, so the datetime index is not always continuous. Does the TFT in pytorch-forecasting handle this automatically, or do I have to impute the missing data? Thanks.

ssvenoe commented 1 year ago

Hi, the TimeSeriesDataSet class can deal with missing timesteps for you.

Look into these two options for TimeSeriesDataSet: constant_fill_strategy and allow_missing_timesteps.

lk1983823 commented 1 year ago

Hi, The TimeSeriesDataSet class can deal with missing data for you.

Look into these two options for TimeSeriesDataSet:

  • constant_fill_strategy (Dict[str, Union[str, float, int, bool]]) – dictionary mapping column names to constants used to fill in missing values where there are gaps in the sequence (by default, a forward-fill strategy is used). The values are only used if allow_missing_timesteps=True. A common use case is to denote that demand was 0 if the sample is not in the dataset.
  • allow_missing_timesteps (bool) – whether to allow missing timesteps, which are then filled in automatically. Missing values here refer to gaps in the time_idx: for example, if a specific time series only has samples for 1, 2, 4, 5, the sample for 3 will be generated on the fly. allow_missing_timesteps does not deal with NA values. You should fill NA values before passing the dataframe to the TimeSeriesDataSet.
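
To make the behaviour described in these two options concrete, here is a minimal plain-Python sketch of the idea. It is only an illustration, not the library's implementation: the `fill_gaps` helper, the column names and the sample values are hypothetical. Only `time_idx`, `target`, `group_ids`, `allow_missing_timesteps` and `constant_fill_strategy` are actual TimeSeriesDataSet argument names.

```python
def fill_gaps(rows, constants=None, time_key="time_idx"):
    """Illustrative gap filling for one series (hypothetical helper).

    Generated rows are forward-filled from the last observed row, except
    for columns listed in `constants`, which get the given constant.
    """
    constants = constants or {}
    filled = []
    for row in sorted(rows, key=lambda r: r[time_key]):
        if filled:
            last_observed = filled[-1]
            # Generate one row per missing step between two observed rows.
            for t in range(last_observed[time_key] + 1, row[time_key]):
                generated = dict(last_observed)  # forward fill by default
                generated.update(constants)      # constant fill where requested
                generated[time_key] = t
                filled.append(generated)
        filled.append(dict(row))
    return filled


# Samples exist for time_idx 1, 2, 4, 5; the sample for 3 is missing.
rows = [
    {"time_idx": 1, "demand": 5.0, "price": 9.99},
    {"time_idx": 2, "demand": 7.0, "price": 10.49},
    {"time_idx": 4, "demand": 6.0, "price": 10.49},
    {"time_idx": 5, "demand": 4.0, "price": 9.99},
]
filled = fill_gaps(rows, constants={"demand": 0.0})

# Roughly equivalent settings, to be passed as TimeSeriesDataSet(df, **dataset_kwargs)
# together with the usual encoder/prediction length arguments.
dataset_kwargs = {
    "time_idx": "time_idx",
    "target": "demand",
    "group_ids": ["series"],  # assumed name of a series identifier column
    "allow_missing_timesteps": True,
    "constant_fill_strategy": {"demand": 0.0},
}
```

In this sketch the generated row for time_idx 3 gets demand 0.0 from the constant and keeps the price of the previous observed row.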

@ssvenoe Thank you. But too much of my data is missing, so a fill strategy may not work. Can the package instead train only on the continuous periods, for example by using a group variable that marks which continuous period each row belongs to?
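
One way to picture the group-variable idea is to give every unbroken run of timesteps its own segment id and use that id as an additional entry in `group_ids`. This is a sketch under assumptions, not something the thread confirms works well in practice: `assign_segments` and `max_gap` are hypothetical names, and segments shorter than the minimum encoder plus prediction length would presumably yield no training samples.

```python
def assign_segments(time_indices, max_gap=1):
    """Label each timestep with the id of the contiguous run it belongs to.

    A new segment starts whenever the distance to the previous timestep
    exceeds `max_gap` (hypothetical helper, for illustration only).
    """
    labelled = []
    segment = 0
    previous = None
    for t in sorted(time_indices):
        if previous is not None and t - previous > max_gap:
            segment += 1  # a gap was found, so start a new segment
        labelled.append((t, segment))
        previous = t
    return labelled


# Three contiguous periods: 1-3, 7-8 and 20.
labelled = assign_segments([1, 2, 3, 7, 8, 20])

# The segment id could then be stored as a column and listed in group_ids,
# e.g. group_ids=["series", "segment"] (column names are assumptions).
```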

ssvenoe commented 1 year ago

Sorry for the late reply.

I believe you have to have "continuous data".

One possible solution could be to remove the missing rows and create a new consecutive time index that simply counts the remaining rows. You can then add date features such as hour, day, month and year as covariates to the model to preserve the date information.
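
A minimal sketch of that re-indexing step, using only the standard library. The timestamps and column names are made up for illustration; in a real setup the calendar columns would be registered as known covariates of the dataset (for example via `time_varying_known_reals`), which is an assumption about how one might wire this up rather than something stated in this thread.

```python
from datetime import datetime

# Observed timestamps with gaps (hours 2-4 and most of the following day are missing).
timestamps = [
    datetime(2023, 1, 1, 0),
    datetime(2023, 1, 1, 1),
    datetime(2023, 1, 1, 5),
    datetime(2023, 1, 2, 0),
]

rows = []
for idx, ts in enumerate(sorted(timestamps)):
    rows.append(
        {
            "time_idx": idx,  # consecutive index over the remaining rows
            # Calendar features keep the real date information available
            # to the model even though time_idx no longer reflects it.
            "hour": ts.hour,
            "day": ts.day,
            "month": ts.month,
            "year": ts.year,
        }
    )
```

Note that with this index, two neighbouring rows may be one hour or many hours apart, so the model only sees the true spacing through the calendar covariates.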

I'm unsure how this will be reflected in your predictions, but it's a way to get around your problem of missing timesteps. Your dataset might also just not be suitable for deep learning.

sairamtvv commented 1 year ago

@lk1983823 I found a workaround by setting weights to zero: first fill in all the missing timestamps, then assign a weight of zero to the filled-in rows.

bignfuse commented 12 months ago

@sairamtvv, can you elaborate on how you assigned zero weights to the missing timestamps? How do I do it with the pytorch_forecasting TFT?

sairamtvv commented 11 months ago

Sorry for the late reply, but this is how I understand it. You can assign weights to the timestamps (that is, how much importance each one should be given). So fill the missing values, for example with ffill or bfill, and assign a very small weight to the timestamps that were missing. Perhaps some experts can also comment on this methodology.

manitadayon commented 10 months ago

@sairamtvv, are you talking about the weight attribute of TimeSeriesDataSet? I think @bignfuse is asking how you are doing it with PyTorch Forecasting (how do you input the weights?).

sairamtvv commented 9 months ago

"Are you talking about the weight attribute of TimeSeriesDataSet?" Yes.
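
A rough sketch of that workflow in plain Python, assuming the weights are supplied through a column whose name is passed to the `weight` argument of TimeSeriesDataSet. The column names, sample values and `FILLED_WEIGHT` constant are illustrative assumptions, and whether a weight of exactly zero or a very small positive value behaves better is left open here, as it is in the thread.

```python
FILLED_WEIGHT = 0.0  # assumption: could also be a very small positive value

# Observed values by time_idx; the sample for time_idx 3 is missing.
observed = {1: 10.0, 2: 12.0, 4: 9.0, 5: 11.0}

rows = []
last_value = None
for t in range(min(observed), max(observed) + 1):
    is_observed = t in observed
    if is_observed:
        last_value = observed[t]
    rows.append(
        {
            "series": "a",
            "time_idx": t,
            "value": last_value,  # forward-filled where the sample is missing
            "sample_weight": 1.0 if is_observed else FILLED_WEIGHT,
        }
    )

# The weight column would then be named when building the dataset,
# e.g. TimeSeriesDataSet(df, **dataset_kwargs) plus the usual length arguments.
dataset_kwargs = {
    "time_idx": "time_idx",
    "target": "value",
    "group_ids": ["series"],
    "weight": "sample_weight",
}
```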

manitadayon commented 9 months ago

@sairamtvv, have you used the weight attribute with NHiTS? I am getting a dimension error, while TFT works fine. I fixed NHiTS and sent a PR. I am wondering whether you also get errors with the weight attribute in NHiTS or in other models besides TFT.