Seam8 opened this issue 2 years ago
Hey @Seam8, were you able to test whether this technique is reliable? I am currently extending the validation set by creating a TimeSeriesDataSet from the most recent data, just like the training set. I tried concatenating the datasets and dataloaders, but no luck.
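For context, here is a minimal sketch of the stock pattern for carving out a single validation window at the end of the data (standard pytorch-forecasting API; data is assumed to be a pandas DataFrame, the encoder/prediction lengths are arbitrary assumptions, and the column names are taken from the snippet later in this thread):

from pytorch_forecasting import TimeSeriesDataSet

max_prediction_length = 24  # assumption: one day of hourly data
training_cutoff = data["time_idx"].max() - max_prediction_length

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    target="smoothed",
    group_ids=["group"],
    max_encoder_length=24 * 7,
    max_prediction_length=max_prediction_length,
)
# reuse the training parameters; predict=True keeps only the last window of each series
validation = TimeSeriesDataSet.from_dataset(training, data, predict=True, stop_randomization=True)
val_dataloader = validation.to_dataloader(train=False, batch_size=64)

The limitation of this pattern is exactly the one discussed in this thread: the validation set is a single contiguous block at the end of the data.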
@Sharaddition, I have been using this technique regularly with a previous version of pytorch-forecasting.
I have just synced my fork with the recent commits. I will create some tests to make sure nothing got broken and everything works as expected, and then I will propose a pull request.
Hello @Seam8, I was trying your changes, but I'm receiving the following error:
the simultaneous use of min_prediction_idx and prediction_windows is not possible
I have tried to create a fork for the latest version here: https://github.com/Sharaddition/pytorch-forecasting
Can you please advise what I'm doing wrong here? Could it be an issue with the latest version of the library?
import numpy as np
from pytorch_forecasting import TimeSeriesDataSet

data = data_df
last_time_idx = data.time_idx.max()

# build one-week validation windows (inclusive [start, end] time_idx pairs),
# spaced five weeks apart, going back about six months over hourly data
prediction_windows = []
for window_end in range(last_time_idx, last_time_idx - (24 * 30 * 6), -(24 * 7 * 5)):
    prediction_windows.append([window_end - (24 * 7), window_end])

validation_time_idx = np.concatenate(
    [np.arange(window[0], window[1] + 1) for window in prediction_windows]
)

# train on everything outside the validation windows; dropping those rows
# leaves gaps in time_idx, so missing timesteps must be allowed
# (allow_missing_timesteps expects a bool, not None)
training = TimeSeriesDataSet(
    data.loc[~data.time_idx.isin(validation_time_idx)].reset_index(drop=True),
    time_idx="time_idx",
    allow_missing_timesteps=True,
    target="smoothed",
    group_ids=["group"],
)

validation = TimeSeriesDataSet.from_dataset(
    training,
    data,
    allow_missing_timesteps=True,
    prediction_windows=prediction_windows,
)
Any help is very much appreciated!
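For anyone hitting the same error: in stock pytorch-forecasting, from_dataset copies the training set's parameters, including min_prediction_idx, into the new dataset, and keyword arguments passed to from_dataset override the inherited values. So a plausible workaround, purely an assumption about the fork's API that I have not verified, is to clear the inherited value explicitly:

validation = TimeSeriesDataSet.from_dataset(
    training,
    data,
    min_prediction_idx=None,  # assumption: None disables the value inherited from `training`
    prediction_windows=prediction_windows,
)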
I have been working for some time with Temporal Fusion Transformers from pytorch_forecasting, and I was facing an annoying tradeoff:
My validation set was too short, meaning it was not representative enough of the whole dataset. On the other hand, a longer validation set forced me to drop a large part of the most recent data from the training set.
So I implemented a custom feature that allows the validation set to be split into several time intervals. In my case, it stabilized the validation loss during training. See below:
Basically, the change allows the validation set to be defined as a list of prediction windows spread across the dataset, rather than a single contiguous block at the end.
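For concreteness, here is a minimal usage sketch of the feature (the window layout is arbitrary; the prediction_windows keyword and the inclusive [start, end] convention are taken from the example earlier in this thread):

last_time_idx = data.time_idx.max()
prediction_windows = [
    [last_time_idx - 24 * 7, last_time_idx],                   # the most recent week
    [last_time_idx - 24 * 7 * 6, last_time_idx - 24 * 7 * 5],  # one week, five weeks earlier
]
validation = TimeSeriesDataSet.from_dataset(
    training,
    data,
    prediction_windows=prediction_windows,
)
val_dataloader = validation.to_dataloader(train=False, batch_size=64)

The validation loss then averages over all windows, which is what makes it less sensitive to any single period of the data.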
In case it could be useful to other people, I've created a fork:
github.com/seam8/pytorch-forecasting/tree/feature/split_validation_set
Yet I am not really sure what I am doing with Poetry, having never used it before... I am actually running into an import error when I try to run pytest with it.
Note that I initially implemented the feature on a previous release, so I've integrated the changes into the current master branch; this is why I wanted to run pytest on it, to make sure nothing got broken.
So if someone can tell me what I am missing with Poetry, I will finish the tests. Cheers
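In case it helps with the Poetry side: an import error when running pytest under Poetry usually just means the project was not installed into Poetry's virtual environment, or pytest is being run outside it. The standard commands (nothing fork-specific) are:

poetry install          # create the venv and install the project with its dev dependencies
poetry run pytest       # run pytest inside that venv so local imports resolve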