yuqinie98 / PatchTST

An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." (ICLR 2023) https://arxiv.org/abs/2211.14730
Apache License 2.0

Performance about self-supervised learning #103

Open christincha opened 3 months ago

christincha commented 3 months ago

Dear Authors, thanks for the inspiring work.

I have a question about the performance of self-supervised learning. What loss were you able to reach during self-supervised training, and how does that loss depend on the dataset? For downstream forecasting, how well do different self-supervised training losses predict forecasting performance?

I am asking because I tried pretraining on a different dataset, ETTh1, and neither the training loss nor the validation loss decreased significantly. I am wondering what kind of dataset is useful for pretraining so that it benefits downstream forecasting.
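For reference, here is a minimal sketch of the masked-patch reconstruction objective as I understand it from the paper (MSE computed only over masked patches); the tensor shapes, function name, and the ~40% mask ratio are my own assumptions for illustration, not taken from the repo code:

```python
# Minimal sketch of a masked patch reconstruction loss for pretraining.
# Shapes and mask_ratio are illustrative assumptions, not the repo's code.
import torch

def masked_reconstruction_loss(pred, target, mask):
    """MSE over masked patches only.

    pred, target: [batch, n_vars, num_patches, patch_len]
    mask:         [batch, n_vars, num_patches], 1 = patch was masked
    """
    loss = (pred - target) ** 2              # element-wise squared error
    loss = loss.mean(dim=-1)                 # average within each patch
    return (loss * mask).sum() / mask.sum()  # average over masked patches only

# Toy usage with random tensors standing in for model output and input patches.
B, C, N, P = 8, 7, 42, 16                    # batch, vars, patches, patch length
target = torch.randn(B, C, N, P)
pred = torch.randn(B, C, N, P)
mask = (torch.rand(B, C, N) < 0.4).float()   # ~40% of patches masked
print(masked_reconstruction_loss(pred, target, mask).item())
```

On ETTh1 I would expect this quantity to drop early in training; in my runs it stays roughly flat, which is what prompted the question.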

Thanks a lot.