Thank you for providing the source code of your models in Pytorch-TS. I have been trying to reproduce the experimental results (mainly for TransformerMAF so far). I found some hyperparameter specifications in Section D.2 of your paper on arXiv (https://arxiv.org/pdf/2002.06103.pdf), used those where available, and fell back to the default values for the rest. However, I couldn't reproduce the results on most of the datasets.
For Solar, training didn't finish: at epoch 26, the distribution parameters became NaN.
For the other datasets, my results are (much) worse than those reported, except for Exchange Rates.
I'd really appreciate it if you could provide your hyperparameter settings for all the datasets.
Also, regarding the flow layers: the paper mentions using ELU activations, but looking at the code, it uses ReLU, and ELU is not even offered as an option. Does this choice matter in practice, or did ReLU actually tend to perform better?
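For reference, this is the kind of change I mean. The sketch below is purely illustrative (`make_conditioner` is a hypothetical helper, not the actual Pytorch-TS API): it shows how the activation inside a flow's conditioner MLP could be made configurable so ELU can be tried in place of the hard-coded ReLU.

```python
import torch
import torch.nn as nn

# Hypothetical sketch, not the Pytorch-TS code: a flow conditioner MLP
# where the activation is a parameter instead of being hard-coded to ReLU.
def make_conditioner(in_dim: int, hidden_dim: int, out_dim: int,
                     activation: type = nn.ELU) -> nn.Sequential:
    # Passing activation=nn.ReLU reproduces the current behaviour;
    # the default here matches what the paper describes (ELU).
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim),
        activation(),
        nn.Linear(hidden_dim, out_dim),
    )

net = make_conditioner(4, 16, 8)               # ELU variant
net_relu = make_conditioner(4, 16, 8, nn.ReLU)  # current code's variant
x = torch.randn(2, 4)
print(net(x).shape)  # torch.Size([2, 8])
```

If ReLU turned out to work better for you, knowing that would already answer my question; otherwise a switch like this would make the paper and the code consistent.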