zalandoresearch / pytorch-ts

PyTorch based Probabilistic Time Series forecasting framework based on GluonTS backend
MIT License

Can not replicate the experiment results in the paper #142

Open xinyaofan opened 1 year ago

xinyaofan commented 1 year ago

Dear author,

I ran the example notebook, "Time-Grad-Electricity.ipynb", with the default hyperparameter settings shown in the notebook on the "electricity_nips" dataset. However, I could not replicate the 0.0206±0.001 result reported in the paper.
Over 10 runs I could only get 0.0223±0.0016. I am not sure whether the version of the "gluonts" package affects the results; I currently use version 0.10.0. May I ask for the settings or any tips for reproducing the baseline results in the paper? For example, did you set a random seed, or change any hyperparameters from the notebook defaults? Thanks so much for your help! I am looking forward to your reply!

P.S. I sometimes get results similar to the notebook's, but not always. I am not sure how to get such good results consistently.
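For anyone comparing runs, here is a minimal sketch of how a "mean ± std over N runs" figure can be produced with fixed seeds. The helper names `seed_everything` and `summarize` are hypothetical (they are not part of pytorch-ts or GluonTS), and the example scores are made-up placeholders, not measured results. Note that fixing seeds reduces run-to-run variation but may not make GPU training fully deterministic.

```python
import random
import statistics


def seed_everything(seed: int) -> None:
    """Seed every RNG that is importable in this environment (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


def summarize(scores):
    """Return (mean, sample standard deviation) of per-run metric values."""
    return statistics.mean(scores), statistics.stdev(scores)


# Placeholder per-run CRPS_sum values, for illustration only.
example_scores = [0.021, 0.023, 0.022, 0.024, 0.020]
mean, std = summarize(example_scores)
print(f"{mean:.4f} ± {std:.4f}")
```

In practice one would call `seed_everything(seed)` with a different seed before each training run and pass the resulting metric values to `summarize`.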

hanlaoshi commented 1 year ago

I got the same result as you, 0.0223±0.0015

kashif commented 1 year ago

I think the issue is the diffusion implementation I am currently using, namely diffusers, whereas originally I had my own denoising diffusion implementation. I can try to revert to my own. The issue with diffusers, I think, is that it is very much geared to images or fixed-range outputs, such as values between 0 and 255 or 0 and 1, whereas in the time series setting the values should not be clipped. The default beta start and beta end values are also very much for images, I believe. What do you think?
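To make the clipping concern concrete, here is a small sketch in plain Python. It is not the pytorch-ts or diffusers code; it only assumes a sampler that clamps its predicted clean sample to [-1, 1], as image pipelines commonly do (to my understanding this corresponds to the `clip_sample` option of the diffusers `DDPMScheduler`). The values in `x0` are hypothetical.

```python
# Hypothetical scaled time-series targets; unlike pixels, they are not bounded.
x0 = [-3.2, -0.5, 0.1, 0.9, 2.7, 5.0]

# An image-style sampler would clamp the predicted clean sample to [-1, 1].
x0_clipped = [min(1.0, max(-1.0, v)) for v in x0]

# Fraction of values altered by clipping, and the largest error it introduces.
changed = sum(a != b for a, b in zip(x0, x0_clipped)) / len(x0)
max_error = max(abs(a - b) for a, b in zip(x0, x0_clipped))
print(changed, max_error)
```

Any value outside the clamp range is silently moved to the boundary, so the sampled forecasts can no longer reproduce peaks in the data.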

hanlaoshi commented 1 year ago

Hi there,

I've been trying to replicate the results reported in the paper for the TempFlow models, but have been having difficulty. Specifically, I've noticed that the flow-based models randomly sample from a Gaussian distribution when generating samples and that the models are sensitive to certain data, which may explain why some results are better than others and why some results differ from those reported in the paper. I was wondering if you have any insights on this issue?

Additionally, for the Real-NVP model in the traffic-nips dataset, I've set scaling and dequantize to False based on the paper's suggestion, but the CRPS_sum metric is around 0.25, which is significantly different from the results reported in the paper. Could you please suggest how I can set other hyperparameters to obtain similar results? I've submitted a new issue on the GitHub repository and sent an email to you, but I understand you may be busy. I would greatly appreciate any help you can provide.

Thank you for your time and assistance.
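For reference when comparing CRPS_sum numbers, here is a rough sketch of one way to estimate the metric from forecast samples. It uses the sample-based identity CRPS = E|X - y| - 0.5 E|X - X'| on the series summed across dimensions, normalised by the summed absolute target. This is an assumption-laden illustration: the function names and the nested-list layout are hypothetical, and the estimator differs from the quantile-loss approximation that evaluation libraries typically use, so the numbers will not match exactly.

```python
def crps_from_samples(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    term_obs = sum(abs(x - y) for x in samples) / n
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


def crps_sum(forecast, target):
    """Normalised CRPS of the series summed over dimensions.

    forecast[s][t][d]: sample s, time step t, series d (hypothetical layout).
    target[t][d]: observed value at time step t for series d.
    """
    numerator = 0.0
    denominator = 0.0
    for t, row in enumerate(target):
        y = sum(row)  # aggregate the target across series
        summed_samples = [sum(sample[t]) for sample in forecast]
        numerator += crps_from_samples(summed_samples, y)
        denominator += abs(y)
    return numerator / denominator
```

Because the score is normalised by the summed target, differences in scaling or dequantisation settings can shift it noticeably between runs.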

kashif commented 1 year ago

@hanlaoshi I copied over my traffic notebook here: https://github.com/zalandoresearch/pytorch-ts/blob/version-0.7.0/examples/Traffic.ipynb. It still needs to be fixed for the new API, where there is no need to explicitly set the size of the resulting input vectors. I will try to find time to update it in the coming days.

xinyaofan commented 1 year ago

@kashif Thanks so much for your prompt response! That makes a lot of sense. It would be greatly appreciated if you could upload the old version of the code used to produce the results in the paper (including the diffusion models you implemented). BTW, if the start and end variance control values matter a lot, may I ask what range of values you think would be a good choice for time series tasks (or does it vary from task to task)? Thanks again!

kashif commented 1 year ago

I believe the beta start and end values in the image setting are due to the pixel variance in natural images, so in the time series setting it might be best to set them per dataset. I do not remember how the DDPM paper set the range. Can you kindly check, and we can then do that?

hanlaoshi commented 1 year ago

Thank you so much for your prompt and generous sharing of the hyperparameters! I really appreciate it.

xinyaofan commented 1 year ago

@kashif Thanks for your response. They set beta from 1e-4 to 0.1 with 100 diffusion steps; the betas are spaced uniformly over this interval.
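As a quick sanity check on these numbers, here is a sketch in plain Python that builds the evenly spaced schedule quoted above and computes the cumulative product of (1 - beta), i.e. how much of the original signal remains at the final step. The helper names are hypothetical, and the 1e-4 to 0.02 over 1000 steps setting is included only as an assumed image-style baseline for comparison.

```python
import math


def linear_betas(beta_start, beta_end, num_steps):
    """Evenly spaced betas from beta_start to beta_end, inclusive."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def final_alpha_bar(betas):
    """Product of (1 - beta_t) over all steps: signal weight left at the last step."""
    return math.prod(1.0 - b for b in betas)


# Values quoted in this thread: 1e-4 to 0.1 over 100 steps.
ts_betas = linear_betas(1e-4, 0.1, 100)
# Assumed image-style setting, for comparison only.
img_betas = linear_betas(1e-4, 0.02, 1000)

print(final_alpha_bar(ts_betas), final_alpha_bar(img_betas))
```

If the final product is close to zero, the forward process ends near pure noise; comparing the two printed values shows how far a shorter schedule with a larger beta end gets relative to the longer image-style one.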