Nixtla / neuralforecast

Scalable and user-friendly neural 🧠 forecasting algorithms.
https://nixtlaverse.nixtla.io/neuralforecast
Apache License 2.0

TimeLLM takes a long time to set up training. #950

Closed hxuaj closed 2 months ago

hxuaj commented 3 months ago

What happened + What you expected to happen

Hi, I was trying to run the example code for the TimeLLM model (https://nixtlaverse.nixtla.io/neuralforecast/models.timellm.html#timellm). It took almost 1 hour before actual training started; during that time the terminal only showed "Seed set to 1". I checked the GPU: there was no GPU utilization, and only about 500 MB of memory was taken (roughly the size of GPT-2). Then training began and took only ~10 s, with normal GPU utilization. Finally, it took another ~1 hour to wrap up (prediction time?). I was wondering whether this is normal for TimeLLM, since the model is new. If it is a problem, where could the bottleneck be? To exclude network issues, I used local files to load GPT-2:

from transformers import GPT2Config, GPT2Model, GPT2Tokenizer

# Load GPT-2 from a local directory to rule out network delays
gpt2_config = GPT2Config.from_pretrained(gpt2_local_path, local_files_only=True)
gpt2 = GPT2Model.from_pretrained(gpt2_local_path, config=gpt2_config, local_files_only=True)
gpt2_tokenizer = GPT2Tokenizer.from_pretrained(gpt2_local_path, local_files_only=True)
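
For reference, the loaded GPT-2 is then handed to TimeLLM as in the linked docs example. A minimal sketch, assuming the llm/llm_config/llm_tokenizer arguments from that example; h, input_size, and freq are placeholder values:

from neuralforecast import NeuralForecast
from neuralforecast.models import TimeLLM

# Sketch: wire the locally loaded GPT-2 into TimeLLM, following the
# docs example; h/input_size/freq are placeholders, not recommendations.
model = TimeLLM(h=12,
                input_size=36,
                llm=gpt2,
                llm_config=gpt2_config,
                llm_tokenizer=gpt2_tokenizer)
nf = NeuralForecast(models=[model], freq='M')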

Hardware: NVIDIA T4 (only tried on one of my GPUs due to https://github.com/Nixtla/neuralforecast/issues/937)
OS: Linux

Versions / Dependencies

Python 3.9
neuralforecast 1.7.0

Reproduction script

https://nixtlaverse.nixtla.io/neuralforecast/models.timellm.html#timellm

Issue Severity

None

JKYtydt commented 3 months ago


Hello, I ran into the same problem you did. I saved the trained model and then used it for inference, and prediction is still very slow. Have you solved this on your end?

elephaint commented 2 months ago

Thanks - I can reproduce the issue (very long time to set up the training). We'll look into it.
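
One way to narrow down where the pre-training setup time goes is to profile the fit call with the standard library's cProfile. A diagnostic sketch, assuming nf is the NeuralForecast object from the docs example and df its training dataframe:

import cProfile, pstats

# Sketch: profile the fit call to see which calls dominate the long
# setup phase before training actually starts. `nf` and `df` are the
# NeuralForecast instance and training dataframe from the docs example.
with cProfile.Profile() as pr:
    nf.fit(df=df)
pstats.Stats(pr).sort_stats("cumulative").print_stats(20)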

JKYtydt commented 2 months ago

Thanks - I can reproduce the issue (very long time to set up the training). We'll look into it.

Thank you, looking forward to your reply.

elephaint commented 2 months ago

I can't seem to find a solution for this, unfortunately. TimeLLM with the current model also seems slow on my machine. Maybe you could try a different LLM from the Transformers library?

hxuaj commented 2 months ago

I can't seem to find a solution for this, unfortunately. TimeLLM with the current model also seems slow on my machine. Maybe you could try a different LLM from the Transformers library?

Thank you for the reply. Do you mean applying LLMs other than GPT-2 to TimeLLM? Since hitting this issue, I have already switched to other models like NHITS and TimesNet. Thanks again.

elephaint commented 2 months ago

I can't seem to find a solution for this, unfortunately. TimeLLM with the current model also seems slow on my machine. Maybe you could try a different LLM from the Transformers library?

Thank you for the reply. Do you mean applying LLMs other than GPT-2 to TimeLLM? Since hitting this issue, I have already switched to other models like NHITS and TimesNet. Thanks again.

Yes, indeed, that's what I'd try. But I haven't tried different models myself yet, so I can't recommend one, I'm sorry.
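
As a rough sketch of that suggestion, an alternative backbone from the Hugging Face Hub could be swapped in the same way GPT-2 is loaded above. Here distilgpt2 is only an illustrative choice, not a tested recommendation, and the TimeLLM arguments mirror the docs example:

from transformers import AutoConfig, AutoModel, AutoTokenizer
from neuralforecast.models import TimeLLM

# Sketch: load a different Hub checkpoint and hand it to TimeLLM the
# same way as GPT-2. "distilgpt2" is an illustrative choice only.
name = "distilgpt2"
config = AutoConfig.from_pretrained(name)
llm = AutoModel.from_pretrained(name, config=config)
tokenizer = AutoTokenizer.from_pretrained(name)

model = TimeLLM(h=12, input_size=36,
                llm=llm, llm_config=config, llm_tokenizer=tokenizer)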

jexterliangsufe commented 2 weeks ago

So what causes this problem? I am facing the same problem.