Model training problem - Githubissues

BlinkDL / RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

Apache License 2.0

Model training problem #136

Closed: wangkeke closed this issue 1 year ago

wangkeke commented 1 year ago

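First of all, many thanks to the author for the open-source work. I want to train my own Chinese model, but I have run into some problems during training:

Error message when I use the rwkv-4-raven series models:
Error message when I use the rwkv-4-pile-3b series models:

Environment: python-3.10.6, torch==2.0.1+cu117 (I also tried torch==1.13.1+cu117 and got the same errors as above)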

BlinkDL commented 1 year ago

Please join the fine-tuning group 439087067 to learn more.