Nixtla / neuralforecast

Scalable and user-friendly neural 🧠 forecasting algorithms.
https://nixtlaverse.nixtla.io/neuralforecast
Apache License 2.0

TimeLLM takes a long time to set up training. #950

Closed hxuaj closed 2 months ago

hxuaj commented 3 months ago

What happened + What you expected to happen

Hi, I was trying to run the example code for the TimeLLM model (https://nixtlaverse.nixtla.io/neuralforecast/models.timellm.html#timellm). It took almost 1 hour before actual training started; during that time the terminal only showed "Seed set to 1". I checked the GPU: there was no GPU utilization, and only about 500 MB of memory was taken (roughly the size of GPT-2). Then training began and took only ~10 s, with normal GPU utilization. Finally, it took another ~1 hour to wrap up (prediction time?). I was wondering whether this is normal for TimeLLM, since the model is new. If it is a problem, where could the bottleneck be? To exclude network issues, I used local files to load GPT-2:

from transformers import GPT2Config, GPT2Model, GPT2Tokenizer

# Load GPT-2 from a local directory to rule out network delays
gpt2_config = GPT2Config.from_pretrained(gpt2_local_path, local_files_only=True)
gpt2 = GPT2Model.from_pretrained(gpt2_local_path, config=gpt2_config, local_files_only=True)
gpt2_tokenizer = GPT2Tokenizer.from_pretrained(gpt2_local_path, local_files_only=True)
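
For reference, the loaded GPT-2 is then handed to TimeLLM as in the linked docs example. A minimal sketch, assuming the llm/llm_config/llm_tokenizer arguments from that example; h, input_size, and freq are placeholder values:

from neuralforecast import NeuralForecast
from neuralforecast.models import TimeLLM

# Sketch: wire the locally loaded GPT-2 into TimeLLM, following the
# docs example; h/input_size/freq are placeholders, not recommendations.
model = TimeLLM(h=12,
                input_size=36,
                llm=gpt2,
                llm_config=gpt2_config,
                llm_tokenizer=gpt2_tokenizer)
nf = NeuralForecast(models=[model], freq='M')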

Hardware: NVIDIA T4 (only tried on one of my GPUs due to https://github.com/Nixtla/neuralforecast/issues/937)
OS: Linux

Versions / Dependencies

Python 3.9
neuralforecast 1.7.0

Reproduction script

https://nixtlaverse.nixtla.io/neuralforecast/models.timellm.html#timellm

Issue Severity

None

JKYtydt commented 3 months ago


Hello, I ran into the same problem you did. I saved the trained model and then used it for inference, and prediction is still very slow. Have you solved this on your end?

elephaint commented 2 months ago

Thanks - I can reproduce the issue (very long time to set up the training). We'll look into it.
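
One way to narrow down where the pre-training setup time goes is to profile the fit call with the standard library's cProfile. A diagnostic sketch, assuming nf is the NeuralForecast object from the docs example and df its training dataframe:

import cProfile, pstats

# Sketch: profile the fit call to see which calls dominate the long
# setup phase before training actually starts. `nf` and `df` are the
# NeuralForecast instance and training dataframe from the docs example.
with cProfile.Profile() as pr:
    nf.fit(df=df)
pstats.Stats(pr).sort_stats("cumulative").print_stats(20)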

JKYtydt commented 2 months ago

Thanks - I can reproduce the issue (very long time to set up the training). We'll look into it.

Thank you, looking forward to your reply.

elephaint commented 2 months ago

I can't seem to find a solution for this, unfortunately. TimeLLM with the current model also seems slow on my machine. Maybe you could try a different LLM from the Transformers library?

hxuaj commented 2 months ago

I can't seem to find a solution for this, unfortunately. TimeLLM with the current model also seems slow on my machine. Maybe you could try a different LLM from the Transformers library?

Thank you for the reply. Do you mean applying LLMs other than GPT-2 to TimeLLM? Since hitting this issue, I have already switched to other models like NHITS and TimesNet. Thanks again.

elephaint commented 2 months ago

I can't seem to find a solution for this, unfortunately. TimeLLM with the current model also seems slow on my machine. Maybe you could try a different LLM from the Transformers library?

Thank you for the reply. Do you mean applying LLMs other than GPT-2 to TimeLLM? Since hitting this issue, I have already switched to other models like NHITS and TimesNet. Thanks again.

Yes, indeed, that's what I'd try. But I haven't tried different models myself yet, so I can't recommend one, I'm sorry.
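
As a rough sketch of that suggestion, an alternative backbone from the Hugging Face Hub could be swapped in the same way GPT-2 is loaded above. Here distilgpt2 is only an illustrative choice, not a tested recommendation, and the TimeLLM arguments mirror the docs example:

from transformers import AutoConfig, AutoModel, AutoTokenizer
from neuralforecast.models import TimeLLM

# Sketch: load a different Hub checkpoint and hand it to TimeLLM the
# same way as GPT-2. "distilgpt2" is an illustrative choice only.
name = "distilgpt2"
config = AutoConfig.from_pretrained(name)
llm = AutoModel.from_pretrained(name, config=config)
tokenizer = AutoTokenizer.from_pretrained(name)

model = TimeLLM(h=12, input_size=36,
                llm=llm, llm_config=config, llm_tokenizer=tokenizer)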

jexterliangsufe commented 2 weeks ago

So what causes this problem? I am facing the same problem.