Deleting 1 hour of data from the training dataset can lead to much better results

Dear Author,

I hope you are doing well! I have recently been testing the TimeGrad code and found something really interesting: if I truncate 1 hour of data from the training dataset and keep the test dataset unchanged, the test results become much better. Results for the electricity dataset are as follows. All runs use the same settings: epoch=30, learning rate=1e-03, diffusion steps=100, batch_size=32.

With the whole training dataset (input size 370*5833), the crps_sum over 10 runs is 0.0205±0.0033.
With the first 1 h of the training dataset truncated (input size 370*5832), the crps_sum over 10 runs is 0.0139±0.0015.
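
To make the truncation concrete, it amounts to something like the sketch below. This is an illustration only and not code from pytorch-ts or GluonTS: `drop_first_steps` is a hypothetical helper, and it assumes that each training entry is a dict with a `start` value that `pd.Timestamp` accepts and a `target` array whose last axis is time at hourly frequency. The toy entry and its start date are placeholders, not the real electricity data.

```python
import numpy as np
import pandas as pd


def drop_first_steps(entry, n_steps=1):
    """Return a copy of `entry` without its first `n_steps` hourly observations.

    Hypothetical helper: assumes `entry["target"]` has time on its last axis
    and that the series is sampled hourly.
    """
    target = np.asarray(entry["target"])
    start = pd.Timestamp(entry["start"])
    return {
        **entry,
        # Drop the earliest observations along the time axis.
        "target": target[..., n_steps:],
        # Shift the start so the remaining values keep their timestamps.
        "start": start + pd.Timedelta(hours=n_steps),
    }


# Toy stand-in for a multivariate entry: 2 series, 6 hourly values each.
toy_entry = {"start": "2014-01-01 00:00:00", "target": np.arange(12).reshape(2, 6)}
truncated = drop_first_steps(toy_entry, n_steps=1)
print(truncated["target"].shape, truncated["start"])
```

Applied to a 370*5833 training array, the same slicing gives the 370*5832 input described above, while the test dataset is left untouched.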

I am really confused by these results, since I would not expect truncating 1 hour of data to make such a big difference on the same test dataset. Could you give some insight into why this happens? Thanks so much for your help!

Best,

zalandoresearch / pytorch-ts
