RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.
When I run the training script, I get the error "return WKV_6.apply(B, T, C, H, r, k, v, w, u) RuntimeError: invalid unordered_map<K, T> key". Could you please tell me what the reason might be?
Unfortunately, I have never seen this error. Please try:
Python 3.10, the latest PyTorch, CUDA 12.3+, and the latest DeepSpeed, but keep pytorch-lightning==1.9.5,
and use Ubuntu.
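To compare an environment against the versions suggested above, a small script along these lines may help. It is only a sketch: the helper name `collect_versions` is made up for illustration, and the package names are assumed to be the usual PyPI distribution names (`torch`, `deepspeed`, `pytorch-lightning`). It only reports what is installed and does not diagnose the error itself.

```python
import platform
from importlib import metadata

# Assumed PyPI distribution names; adjust if your install uses different ones.
PACKAGES = ("torch", "deepspeed", "pytorch-lightning")


def collect_versions():
    """Return a dict describing the interpreter, OS, and key package versions."""
    report = {
        "python": platform.python_version(),
        "os": platform.platform(),
    }
    # Look up installed versions without importing the packages themselves.
    for name in PACKAGES:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    # The CUDA version PyTorch was built with is only known once torch imports.
    try:
        import torch
        report["cuda (torch build)"] = str(torch.version.cuda)
        report["cuda available"] = str(torch.cuda.is_available())
    except Exception:  # torch missing or failing to load
        report["cuda (torch build)"] = "unknown (torch not importable)"
    return report


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the thread makes it easier to see which of the suggested versions differ.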