Pre-trained model weights and GPU requirements

thuml / OpenLTM

Open-Source Implementations of Large Time-Series Models

MIT License


Open jexterliangsufe opened 4 days ago

jexterliangsufe commented 4 days ago

Great work! I have two questions about Timer-XL.

1. Will you open-source the model weights that achieved the best performance in the paper?
2. The paper mentions that you used "NVIDIA A100 Tensor Core GPUs". What are the minimum GPU requirements for pre-training this model? Is a single V100 sufficient?

WenWeiTHU commented 3 days ago

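Regarding compute requirements: pre-training on UTSD was done on 4 A100 40G GPUs (at that pre-training scale, roughly 1B time points take about one day to train). For fine-tuning or supervised training, a single 4090 should be enough. You can also adjust the batch size to suit your GPU configuration.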
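
One common way to adjust the batch size for a smaller GPU is gradient accumulation: use a smaller per-GPU batch and accumulate gradients over several steps so the effective batch size stays the same. The sketch below only does the arithmetic and is not taken from the OpenLTM code. The function name `accumulation_steps` and all batch sizes shown are hypothetical, since the thread does not state the batch size used for pre-training.

```python
import math


def accumulation_steps(target_batch: int, per_gpu_batch: int, num_gpus: int = 1) -> int:
    """Return how many micro-batches to accumulate per optimizer step.

    The result is the smallest step count whose effective batch size
    (per_gpu_batch * num_gpus * steps) is at least target_batch.
    All arguments are illustrative; none come from the OpenLTM repository.
    """
    if target_batch < 1 or per_gpu_batch < 1 or num_gpus < 1:
        raise ValueError("all arguments must be positive integers")
    return math.ceil(target_batch / (per_gpu_batch * num_gpus))


# Hypothetical numbers: a reference run on 4 GPUs with 64 samples per GPU,
# reproduced on a single GPU that only fits 16 samples at a time.
reference_steps = accumulation_steps(target_batch=256, per_gpu_batch=64, num_gpus=4)
single_gpu_steps = accumulation_steps(target_batch=256, per_gpu_batch=16, num_gpus=1)
print(reference_steps, single_gpu_steps)
```

Accumulation keeps the effective batch size constant but does not shorten the run: with fewer GPUs, wall-clock training time grows accordingly.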