yuqinie98 / PatchTST

An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers" (ICLR 2023). https://arxiv.org/abs/2211.14730
Apache License 2.0

How to use learner.distributed() in the self-supervised pretrain code? #96

Open lileishitou opened 9 months ago

lileishitou commented 9 months ago

How can I use the self-supervised pretraining code to train on multiple GPUs or multiple nodes?

I want to revise the code for multi-node or multi-GPU training, since I have a large dataset and a large model.

But I have not succeeded.
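For reference, here is the kind of wrapping I was expecting learner.distributed() to do. This is a minimal sketch using plain PyTorch DistributedDataParallel, not the repo's Learner API; the helper name `ddp_demo` and the `Linear` stand-in for the PatchTST encoder are my own assumptions. It runs as a single-process world on CPU with the gloo backend; in a real multi-GPU/multi-node run, torchrun would set RANK/WORLD_SIZE and launch one process per GPU.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def ddp_demo():
    # Single-process "world" for illustration only; torchrun normally
    # provides MASTER_ADDR/MASTER_PORT, RANK, and WORLD_SIZE.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    if not dist.is_initialized():
        dist.init_process_group(backend="gloo", rank=0, world_size=1)

    model = torch.nn.Linear(64, 64)  # stand-in for the PatchTST encoder
    ddp_model = DDP(model)           # gradients are all-reduced across ranks

    x = torch.randn(8, 64)           # (batch, features)
    out = ddp_model(x)

    dist.destroy_process_group()
    return tuple(out.shape)

if __name__ == "__main__":
    print(ddp_demo())
```

Is this roughly what learner.distributed() is meant to handle internally, and if so, how should the pretraining DataLoader be combined with a DistributedSampler so each rank sees a different shard of the data?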